Whatever the benchmarks might say, there's something about Claude that seems to ...

weego · 2025-08-07T17:06:15 1754586375

Agreed. I always give my one-pager product briefs to AI to break down into phases and tasks, and then progress trackers. I explicitly prompt for verbose phases, tasks and test plans.

Yesterday, without much prompting, Claude 4.1 gave me 10 phases, each with 5-12 tasks that could genuinely be used to kanban out a product step by step.

Claude 3.7 Sonnet was effectively the same, with fewer granular suggestions for programming strategies.

Gemini 2.5 gave me a one pager back with some trivial bullet points in 3 phases, no tasks at all.

o3 did the same as Gemini, just less coherently.

Claude just has whatever the thing is for now

unshavedyak · 2025-08-07T17:10:26 1754586626

How are you having Claude track these phases/tasks? E.g., are you having it write to a TASKS.md and update it after each phase?
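To make the question concrete, here is a rough sketch of the kind of tracker I mean. The phase names, task titles, and helper functions (`render_tasks`, `mark_done`, `progress`) are all made up for illustration; this is just one plausible shape for a TASKS.md checklist, not something Claude produces by itself.

```python
def render_tasks(phases):
    """Render {phase name: [task title, ...]} as a Markdown checklist."""
    lines = ["# Tasks", ""]
    for phase, tasks in phases.items():
        lines.append(f"## {phase}")
        lines.extend(f"- [ ] {task}" for task in tasks)
        lines.append("")
    return "\n".join(lines)


def mark_done(markdown, task):
    """Tick the checkbox whose title matches exactly; leave other lines alone."""
    target = f"- [ ] {task}"
    return "\n".join(
        f"- [x] {task}" if line == target else line
        for line in markdown.split("\n")
    )


def progress(markdown):
    """Return (done, total) counts over all checklist lines."""
    lines = markdown.split("\n")
    done = sum(line.startswith("- [x] ") for line in lines)
    todo = sum(line.startswith("- [ ] ") for line in lines)
    return done, done + todo


if __name__ == "__main__":
    # Hypothetical phases and tasks, purely for illustration.
    phases = {
        "Phase 1": ["Set up repo", "Add CI"],
        "Phase 2": ["Build API"],
    }
    tasks_md = render_tasks(phases)
    tasks_md = mark_done(tasks_md, "Add CI")
    print(tasks_md)
    print(progress(tasks_md))
```

The idea would be to have the model rewrite the file (or call something like `mark_done`) at the end of each phase, so the checklist stays the single source of truth.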

m3kw9 · 2025-08-08T04:56:04 1754628964

Just say "begin task 1", "begin task 2", etc., and scroll back to see the task. Or copy-paste the tasks into notes and do them in sequence.

SequoiaHope · 2025-08-08T02:16:45 1754619405

If you have any examples of these one-pagers I’d love to see them!

concinds · 2025-08-07T18:23:53 1754591033

Gemini Pro or Flash?

dudeinhawaii · 2025-08-07T19:49:20 1754596160

My experience has been that Claude Code is exceptional at tool use (and thus working with agentic IDEs) but... not the smartest coder. It will happily re-invent the wheel, create silos, or generate terrible code that you'll only discover weeks or months later. I've had to roll back weeks of code to discover major edge regressions that Claude had introduced.

Now, someone will say 'add more tests'. Sure. But that's a bandaid.

I find that the 'smarter' models like Gemini and o3 output better quality code overall, and if you can afford to send them the entire context in a non-agentic way, then they'll generate something dramatically superior to the agentic code artifacts.

That said, sometimes you just want speed to prove out a concept, and Claude is exceptional there. Unfortunately, proofs of concept often... become productionized rather than developers taking a step back to "do it right".

dagss · 2025-08-08T07:55:57 1754639757

I disagree that tests are bandaids. Humans need tests to avoid introducing regressions. If you avoid tests, you are giving the AI a much harder task than what human programmers usually have.

atonse · 2025-08-07T16:50:31 1754585431

That's been my experience too. Even though Gemini also does seem to do the fancy one-shot demo code well, in day to day coding, Claude seems to do a much better job of just understanding how programming actually works, what to do, what not to do, etc.

deadbabe · 2025-08-07T18:22:28 1754590948

The secret is just better context engineering. There is no other “secret” sauce; all these models are built on the same concepts.

bamboozled · 2025-08-07T17:09:14 1754586554

Claude is fast too. Gemini isn’t as good and just gets hung up on things Claude doesn’t.