How exactly does that work? There are two (major) types of Image Generation AIs.
1 The GAN (Generative Adversarial Network) way works with a Generator and a Discriminator. The Generator generates fake images, starting from random noise. The Discriminator looks at an image and says: real or fake. If the Generator can generate an image that the Discriminator cannot distinguish from a real one, you have generated a good image. Both are trained against each other: the better the Discriminator gets at spotting fakes, the better the Generator has to become to fool it, and the better the output.
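2 Diffusion Models. These generate an image by starting from pure random noise and removing that noise step by step until a structured image remains. They learn this during training by looking at real images to which noise has been gradually added.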
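Both ideas can be sketched in a few lines of toy Python. This is only an illustration under simplifying assumptions, not how any real system is implemented: the function names are made up for this example, an "image" is just a short list of numbers, the Discriminator is reduced to the probability it outputs, and the noise that a trained network would have to predict is simply handed back to the denoising function.

```python
import math
import random


# --- GAN: the two competing objectives (toy version) ---

def discriminator_loss(p_real, p_fake):
    """Cross-entropy loss of the Discriminator.

    p_real: probability it assigns to "real" for a real image.
    p_fake: probability it assigns to "real" for a generated image.
    Low when real images score near 1 and fakes score near 0.
    """
    return -(math.log(p_real) + math.log(1.0 - p_fake))


def generator_loss(p_fake):
    """Loss of the Generator: low when the Discriminator is fooled."""
    return -math.log(p_fake)


# --- Diffusion: adding noise and taking it away again (toy version) ---

def add_noise(image, alpha_bar, rng):
    """Mix an image with Gaussian noise.

    alpha_bar is the share of the original signal that survives:
    1.0 keeps the image clean, values near 0 leave almost pure noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in image]
    noisy = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
             for x, n in zip(image, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Undo add_noise, given a prediction of the noise (needs alpha_bar > 0).

    In a real diffusion model a trained network predicts the noise,
    and the removal happens gradually over many steps.
    """
    return [(y - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
            for y, n in zip(noisy, predicted_noise)]


if __name__ == "__main__":
    # A fooled Discriminator (p_fake close to 1) means a low Generator loss.
    print("generator loss, fooled:    ", generator_loss(0.9))
    print("generator loss, not fooled:", generator_loss(0.1))

    rng = random.Random(0)
    image = [0.0, 0.5, 1.0, 0.5]
    noisy, noise = add_noise(image, alpha_bar=0.5, rng=rng)
    # Here the exact noise is passed back in; a model would have to guess it.
    print("noisy:   ", noisy)
    print("restored:", remove_noise(noisy, noise, alpha_bar=0.5))
```

The two losses pull in opposite directions, which is the "adversarial" part of a GAN. In the diffusion half, the quality of the restored image depends entirely on how well the noise is predicted, which is the part a real model has to learn.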
The prompt is also interpreted by a language model (a text encoder), which turns it into a vector. That vector is used during generation as a kind of rulebook, steering the image so that all necessary elements are present and interact with each other correctly (e.g. apple under tree, apple on head).
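To make "a vector is created from the prompt" a bit more concrete, here is a deliberately crude sketch. It assumes a tiny, hypothetical five-word vocabulary and simply counts words. Real text encoders produce learned vectors with hundreds of dimensions that also capture word order and meaning, so treat this purely as an illustration of the idea that different prompts end up as different, comparable vectors.

```python
import math

# Hypothetical mini vocabulary, chosen only for this example.
VOCAB = ["apple", "tree", "head", "under", "on"]


def embed(prompt):
    """Turn a prompt into a vector by counting each vocabulary word."""
    words = prompt.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    under_tree = embed("apple under tree")
    on_head = embed("apple on head")
    print(under_tree)
    print(on_head)
    # The prompts share only "apple", so they are related but clearly different.
    print(cosine_similarity(under_tree, on_head))
```

Both prompts contain an apple, but the remaining entries differ, so the two vectors would push the image in different directions.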