It sounds like what makes the pipeline in the article effective is the second st...

tptacek · 2026-03-30T20:42:13 1774903333

There are at least three differences:

* Carlini's team used new frontier models that have gotten materially better at finding vulnerabilities (talk to vulnerability researchers outside the frontier labs, they'll echo that). Stenberg was getting random slop from people using random models.

* Carlini's process is iterated exhaustively over the whole codebase; he's not starting with a repo and just saying "find me an awesome bug" and taking that and only that forward in the process.

* And then yes, Carlini is qualifying the first-pass findings with a second pass.

tomjakubowski · 2026-03-31T18:36:50 1774982210

Thanks, I hadn't considered the second point.

I guess the broader point I wanted to make is about the people responsible for the deluge of LLM-reported bugs and security vulnerabilities on countless open-source projects (not only on curl): they weren't considerate or thoughtful security researchers, they were spammers looking to raise their profile with fully automated, hands-off open source "contributions". I would expect that the spammers would continue to use whatever lowest common denominator tooling is available, and continue to cause these headaches for maintainers.

That doesn't mean frontier models and tooling built around them aren't genuinely useful to people doing serious security research: that does seem to be the case, and I'm glad for it.