Prompt engineering exists because a) LLMs are trained to optimize for the statistical average of their training data and b) Sturgeon's law: "ninety percent of everything is crap". Therefore, out of the box, LLMs will give worse-than-ideal results by design.
The initial proof that prompt engineering worked dates back to the VQGAN + CLIP days, when simply adding "world-famous" or "trending on ArtStation" to a prompt was more than enough to objectively improve generated image quality.
The supposed workaround for prompt engineering is RLHF/alignment of the LLM, but everyone who has played around with ChatGPT knows it isn't sufficient.
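To make the contrast concrete, here is a minimal sketch of prompt engineering against a chat model, assuming the `openai>=1.0` Python client and a placeholder model name; the specific system prompt is illustrative, not a prescribed recipe.

```python
# A minimal sketch: the same question asked with and without an engineered
# system prompt. Assumes OPENAI_API_KEY is set and "gpt-3.5-turbo" is available.
from openai import OpenAI

client = OpenAI()

question = "Explain what a decorator is in Python."

# Out of the box: the model falls back to its statistically average,
# RLHF-flavored answer.
baseline = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": question}],
)

# Prompt-engineered: an explicit system prompt constrains persona, tone,
# and length, steering the output away from the average response.
engineered = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a terse senior Python engineer. Answer in at most "
                "three sentences, with one short code example and no preamble."
            ),
        },
        {"role": "user", "content": question},
    ],
)

print(baseline.choices[0].message.content)
print(engineered.choices[0].message.content)
```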