
I view it more as fun and spicy. We are now moving away from the paradigm that the computer is "the dumbest thing in existence", and that requires a bit of flailing around, which is exciting!

Folk magic is (IMO) a necessary step in our understanding of these new.. magical.. tools.



I won't begrudge anyone having fun with their tools, but folk magic definitely isn't a necessary step for understanding anything; it's one step removed from astrology.


I see what you mean, but I think it's a lot less pernicious than astrology. There are plausible mechanisms, it's at least possible to do benchmarking, and it's all plugged into the relatively short feedback cycles of people trying to do their jobs and accomplish specific tasks. Mechanistic interpretability work might help make the magic more transparent and observable, and (surveillance concerns notwithstanding) companies like Cursor (I assume also Google and the other major labs, modulo self-imposed restrictions on using inference data for training) are building up serious data sets that can pretty directly associate prompts with results.

Not only that, I think LLMs in a broader sense are actually enormously helpful specifically for understanding existing code: not when you just order them to implement features and fix bugs, but when you use their tireless ability to consume and transform a corpus in a way that guides you to the important modules, explains conceptual schemes, analyzes diffs, etc. There are plenty of critical points to be made, but we can't ignore the upsides.
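
On the benchmarking point: even a crude harness is more than most folk-magic threads attempt. A rough sketch of what I mean, where call_model, the prompt variants, and the task checks are all stand-ins you'd wire up to whatever provider and tasks you actually care about, not anyone's recommended setup:

    # Hypothetical sketch: compare two system prompts on tasks with
    # mechanical pass/fail checks instead of vibes. call_model is a
    # placeholder; plug in whatever client/SDK you actually use.
    from typing import Callable

    def call_model(system_prompt: str, task: str) -> str:
        raise NotImplementedError("wire this up to your provider's client")

    def pass_rate(system_prompt: str,
                  tasks: list[tuple[str, Callable[[str], bool]]],
                  trials: int = 5) -> float:
        # Fraction of (task, trial) runs whose output satisfies the checker.
        passes = total = 0
        for task, check in tasks:
            for _ in range(trials):
                total += 1
                if check(call_model(system_prompt, task)):
                    passes += 1
        return passes / total

    # Two variants people might argue about anecdotally.
    VARIANT_A = "You are a careful assistant. Think step by step."
    VARIANT_B = "Answer concisely. Do not explain your reasoning."

    # Tasks paired with mechanical checks, so the comparison isn't vibes-based.
    TASKS = [
        ("What is 17 * 23? Reply with the number only.",
         lambda out: "391" in out),
        ("Reverse the string 'prompt'. Reply with the result only.",
         lambda out: "tpmorp" in out),
    ]

    for name, prompt in (("A", VARIANT_A), ("B", VARIANT_B)):
        print(name, pass_rate(prompt, TASKS))

Obviously that's nowhere near a controlled study, but it's already falsifiable in a way "it makes the model more circumspect" isn't.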


I'd say the only ones capable of really approaching anything like a scientific understanding of how to prompt these for maximum efficacy are the providers, not the users.

Users can get a glimpse and can try their best to be scientific in their approach; however, the tool is of such complexity that we can barely skim the surface of what's possible.

That is why you see "folk magic": people love to share anecdata because.. that's what most people have. They either don't have the patience, the training, or simply the time to approach these tools with rational rigor.

Frankly, it would be enormously expensive in both time and API costs to get anywhere near best practices backed by experimental data, let alone coherent and valid theories about why a prompt technique works the way it does. And even if you built up this understanding or set of techniques, it might only work for one specific model. You might have to start all over again in a couple of months.


> That is why you see "folk magic": people love to share anecdata because.. that's what most people have. They either don't have the patience, the training, or simply the time to approach these tools with rational rigor.

Yes. That's exactly the point of my comment. Users aren't performing anything even remotely approaching the level of controlled analysis necessary to evaluate the efficacy of their prompt magic. Every LLM thread is filled with random prompt advice that varies wildly, offered up as nebulous, unfalsifiable personality traits (e.g. "it makes the model less aggressive and more circumspect"), and all delivered with the matter-of-fact confidence of a foregone conclusion. Then someone always replies with "actually I've had the exact opposite experience with [some model], it really comes down to [instructing the model to do thing]".



