
It sounds like what makes the pipeline in the article effective is the second stage, which takes the vulnerability reports produced by the first stage and confirms or rejects them. The article doesn't say what the rejection rate is at that stage.

I don't think the spammers would think to write the second stage. They would most likely pipe the first stage (probably a more naive version of it, too) directly to the issue feed.



There are at least three differences:

* Carlini's team used new frontier models that have gotten materially better at finding vulnerabilities (talk to vulnerability researchers outside the frontier labs, they'll echo that). Stenberg was getting random slop from people using random models.

* Carlini's process is iterated exhaustively over the whole codebase; he's not starting with a repo and just saying "find me an awesome bug" and taking that and only that forward in the process.

* And then yes, Carlini is qualifying the first-pass findings with a second pass.
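
For concreteness, here is a rough sketch of the shape described in the last two bullets: iterate over every file, collect first-pass candidates, and only keep what a second pass confirms. This is not the pipeline from the article. Every name here (`Finding`, `run_pipeline`, `toy_first_pass`, `toy_second_pass`, `FILES`) is hypothetical, and the two toy passes are plain string checks standing in for whatever model calls a real setup would make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Finding:
    """A candidate report from the first pass (hypothetical structure)."""
    path: str
    line_no: int
    claim: str


def run_pipeline(
    files: Dict[str, str],
    first_pass: Callable[[str, str], List[Finding]],
    second_pass: Callable[[Finding, str], bool],
) -> Tuple[List[Finding], List[Finding]]:
    """Run the first pass over every file, then qualify each candidate."""
    confirmed: List[Finding] = []
    rejected: List[Finding] = []
    for path, source in files.items():  # exhaustive: every file is visited
        for finding in first_pass(path, source):
            if second_pass(finding, source):
                confirmed.append(finding)
            else:
                rejected.append(finding)
    return confirmed, rejected


def toy_first_pass(path: str, source: str) -> List[Finding]:
    # Stand-in for a model call: flag every line that mentions strcpy.
    return [
        Finding(path, i, "possible overflow via strcpy")
        for i, line in enumerate(source.splitlines(), start=1)
        if "strcpy" in line
    ]


def toy_second_pass(finding: Finding, source: str) -> bool:
    # Stand-in for a second model call: reject findings on lines that
    # carry an explicit "bounds checked" note.
    line = source.splitlines()[finding.line_no - 1]
    return "bounds checked" not in line


# Made-up input, only to exercise the sketch.
FILES = {
    "a.c": "strcpy(dst, src);\nreturn 0;",
    "b.c": "strcpy(buf, s); /* bounds checked */",
}

confirmed, rejected = run_pipeline(FILES, toy_first_pass, toy_second_pass)
rejection_rate = len(rejected) / (len(confirmed) + len(rejected))
```

Dropping `second_pass` and sending every `first_pass` result straight to an issue tracker is roughly the hands-off setup being contrasted here; the rejection rate on this toy input says nothing about the real one.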


Thanks, I hadn't considered the second point.

I guess the broader point I wanted to make is about the people responsible for the deluge of LLM-reported bugs and security vulnerabilities on countless open-source projects (not only on curl): they weren't considerate or thoughtful security researchers, they were spammers looking to raise their profile with fully automated, hands-off open source "contributions". I would expect that the spammers would continue to use whatever lowest common denominator tooling is available, and continue to cause these headaches for maintainers.

That doesn't mean frontier models and tooling built around them aren't genuinely useful to people doing serious security research: that does seem to be the case, and I'm glad for it.



