If a tech works 80% of the time, then I know that I need to be vigilant and I will review the output. The entire team structure is aware of this. There will be processes to offset this 20%.
The problem is that once the AI becomes >95% accurate (if it ever does), humans will become complacent and the checks and balances will become ineffective.
80% is good enough for roughly the bottom quarter to third of software projects. That is way better than an offshore parasite company throwing stuff at the wall because they don't care about consistency or quality at all. These projects will bore your average HNer to death rather quickly (if not technically, then politically).
Maybe people here are used to good code bases, so it doesn't make sense to them that 80% could be good enough, but I've seen some bad code bases (that still made money) that would have been much easier to work on if they hadn't reinvented the wheel or followed decades-old patterns that nobody uses any more.
I think defining the places where vibe-coded software is safe to use is going to be important.
My list so far is:
* Runs locally on local data and does not connect to the internet in any way (to avoid most security issues)
* Generated by users for their own personal use (so it isn't some outside force inflicting bad, broken software on them)
* Produces output in standard, human-readable formats that can be spot-checked by users (to avoid the cases where the AI fakes the entire program & just produces random answers)
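To make those criteria concrete, here is a minimal sketch of how that checklist could be encoded; the `VibeCodedTool` class and its field names are hypothetical, just one way of expressing the three bullets above.

```python
from dataclasses import dataclass

# Hypothetical sketch: encodes the three safety criteria above as a checklist.
# The class and field names are invented for illustration only.

@dataclass
class VibeCodedTool:
    connects_to_internet: bool      # any network access at all?
    author_is_sole_user: bool       # generated by the user, for their own use
    output_is_human_readable: bool  # standard formats the user can spot-check

def is_probably_safe_to_use(tool: VibeCodedTool) -> bool:
    """Return True only if the tool meets all three criteria."""
    return (
        not tool.connects_to_internet
        and tool.author_is_sole_user
        and tool.output_is_human_readable
    )

# Example: a local script that writes CSVs for its own author.
print(is_probably_safe_to_use(VibeCodedTool(False, True, True)))  # True
```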
We are already there. For average people, the threshold is much closer to 80%. LLMs have rapidly gone from "this is wrong and silly" to "this seems right most of the time, so I just trust it when I search for info" in a few years.
It is frankly scary to watch novices adopt AI for stuff you're good at, hear about the garbage it comes up with, and then realise this problem is everywhere.
Gell-Mann amnesia. After seeing the subtle ways LLMs can be off the mark on things I know about, I am very wary of using them for any subject I haven't mastered. I don't want to learn some plausible nonsense.
Except that we see people in this very thread claiming they shouldn't review code anymore, just the prompts. So however good it is now, it is already good enough to be dangerous to users.