
Try it on a million-line code base, where it's not so cut and dried to even determine whether the code is running correctly, or what "correctly" means when it changes day to day.


"A tool is only useful if I can use it in every situation".

LLMs don't need to find every bug in your code - even if they found an additional 10% of genuine bugs compared to existing tools, that's still a pretty big improvement to code analysis.

In reality, I suspect the figure is much higher than 10%.


If it takes you longer to vet hallucinations than it would to just test your code better, is it an improvement? If you accept a fix for a hallucinated bug because you were too lazy to check it, having grown dependent on the AI to do the analysis for you, and that "fix" itself causes other unforeseen issues or fails to recognize why an exception in this case might be worth preserving, is it really an improvement?


What if it takes you longer to vet false positives from a static analysis tool rather than just testing your code better?


What if indeed. Most static analysis tools (disclaimer: anecdotal) have very few false positives these days. This may be much worse in C/C++ land, though; I don't know.


Is it better or worse than a human, though?


It’s slightly worse than a junior developer, and just as confidently incorrect, but much faster to iterate.

Either is better than no assistant at all. With circumstantial caveats.


Sounds like it will go far!


I would imagine worse, because a human has a much, much, much larger context size.


But also a much, much shorter attention span and tolerance for BS.

If you ask the LLM to analyze those 1,000,000 lines 1,000 at a time, 1,000 times over, it'll do it with the same diligence and attention to detail across all 1,000 chunks.

Ask a human to do it and their patience will be tested. Their focus will waver, they’ll grow used to patterns and miss anomalies, and they’ll probably skip chunks that look fine at first glance.

Sure, the LLM won’t find big-picture issues at that scale. But it’ll find plenty of code smells and minor logic errors that deserve a second look.
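
For what it's worth, here's a minimal Python sketch of that chunk-at-a-time review loop. review_chunk() is a placeholder for whatever LLM call you'd actually use (not a specific product's API), and the 1,000-line chunk size just mirrors the numbers above:

    # Sketch: review a large file in fixed-size chunks with an LLM.
    # review_chunk() is a placeholder, not a real API - wire in your own client.
    from pathlib import Path

    CHUNK_LINES = 1000  # mirrors the 1000-lines-at-a-time example above

    def review_chunk(chunk_text: str, start_line: int) -> str:
        """Placeholder: send one chunk to an LLM with a code-review prompt
        and return its findings as plain text."""
        raise NotImplementedError("plug in your LLM client here")

    def review_file(path: str) -> list[str]:
        lines = Path(path).read_text().splitlines()
        findings = []
        for start in range(0, len(lines), CHUNK_LINES):
            chunk = "\n".join(lines[start:start + CHUNK_LINES])
            findings.append(review_chunk(chunk, start_line=start + 1))
        return findings

Each chunk gets the same fresh prompt, which is exactly why it never gets bored - and also why anything that spans chunk boundaries gets missed.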


Ok, why don't you run this experiment on a large public open source code base? We should be drowning in valuable bug reports right now, but all I hear is hype.


While true, on the other hand an AI is a tool: it can have a much larger context size, and it can apply all of it at once. It also isn't limited by availability or time constraints - i.e. if you have only one developer who can do a review, tooling or AI that catches 90% of what that developer would catch is still worth having.


I separated a 5,000-line class into smaller domains yesterday. It didn't provide the end solution, and it wasn't perfect, but it gave me a good plan for where to place what.

Once it is capable of processing larger context windows, it will become impossible to ignore.


You can’t; it has a context window of 8192 tokens. That’s roughly 1000 lines, depending on the programming language.
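
Back-of-the-envelope for that figure (the tokens-per-line ratio is an assumption; real ratios vary a lot by language and coding style):

    # Rough estimate of how many lines of code fit in an 8192-token window.
    # TOKENS_PER_LINE is an assumed average, not a measured value.
    CONTEXT_WINDOW_TOKENS = 8192
    TOKENS_PER_LINE = 8  # assumption: a typical source line is ~8 tokens

    lines_per_window = CONTEXT_WINDOW_TOKENS // TOKENS_PER_LINE
    print(lines_per_window)  # 1024 -- roughly the "1000 lines" cited above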



