> You need iteration and I believe these kinds of AI have the same issues as us....

famouswaffles · on March 20, 2023

Literally LLMs get much better with chain of thought, feedback, and/or consensus.

GPT-3 performance on MultiArith goes from 18% to 92% with all three. This isn't some hackneyed anthropomorphizing. Countless research papers show massive improvement with these processes.
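To make "consensus" concrete, here is a minimal sketch of majority voting over several sampled answers. The `majority_vote` helper and the sample answers are made up for illustration; this is not the setup from any particular paper, and a real pipeline would get the answers from repeated model calls.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common answer among several sampled answers."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers)
    # most_common(1) returns a one-element list of (answer, count)
    return counts.most_common(1)[0][0]


# Hypothetical final answers from five independently sampled reasoning chains
samples = ["8", "8", "9", "8", "7"]
consensus = majority_vote(samples)
print(consensus)
```

The idea is that individual chains can go wrong in different ways, while the correct answer tends to show up more often than any single wrong one.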

drinfinity · on March 20, 2023

That's (IMO) too narrow a view of what a "machine" is. Complex machinery of any kind is never 100% correct and needs constant correction and maintenance. I still think approaching this as a "calculator" is awkward at best.

Jensson · on March 20, 2023

> Complex machinery of any kind never is 100% correct and needs constant correction and maintenance

Computers are extremely close to 100%; we generally expect a CPU to never make errors, even after years of work. If it starts making any errors at all, we throw it away and make a new one.

pixl97 · on March 20, 2023

This is a very weird statement that fails because it mixes up two logical categories.

My computer will pretty much add 1+1 correctly forever, never making a mistake.

My computer will perform an 'error' every time I put bad code into it, and some of those logic chains and error conditions are not very obvious.

The issue here is you think the LLM is making a category 1 error, when the problem we are seeing is a much more human-like category 2 error.
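A small sketch of the distinction, with function names made up for illustration: the arithmetic itself (category 1) is reliable, while the wrong result comes from bad logic that the machine executes faithfully (category 2).

```python
def add(a, b):
    # Category 1: the hardware arithmetic itself, which we expect to be right
    return a + b


def mean_buggy(values):
    # Category 2: a logic error in the code; the machine faithfully
    # computes the wrong thing (divides by len - 1 instead of len)
    return sum(values) / (len(values) - 1)


def mean(values):
    # Intended logic
    return sum(values) / len(values)


print(add(1, 1))              # 2
print(mean_buggy([2, 4, 6]))  # 6.0, wrong because of the logic, not the CPU
print(mean([2, 4, 6]))        # 4.0
```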

bryanrasmussen · on March 20, 2023

> Computers are extremely close to 100%

We must work in extremely different industries!

Jensson · on March 20, 2023

Do you code in checks to verify the calculations made by the CPU? I've never ever seen anyone do that. If a CPU starts making errors, we throw it away. A typical CPU will make many quadrillions of correct calculations before its first error; I'd say that is basically 0 errors.