Hacker News

I'm on the same page here. I have seen this sentiment about Codex suddenly being good a few times now, so I booted Codex CLI thinking-high back up after a break and asked it to look for bugs. It promptly found five bugs that didn't actually exist. It was the kind of truly impressively stupid mistake that I have essentially never seen Claude Code make, and it made me wonder if this isn't the sort of thing that's making people downplay the power of LLMs for agentic coding.


I asked Sonnet 4.5 to find bugs in the code, and it found five high-impact bugs that, when I prompted it a second time, it admitted weren't actually bugs. It's definitely not just Codex.


In my case, Codex fixed a bug in one shot. It took 10 minutes to debug and find it.

Claude struggled for a long time and still didn’t find it.



