It depends. Google's AI that gloms on to search is not particularly good for pro...

zozbot234 · 2026-03-07T14:29:57 1772893797

Set up mmap properly and you can evaluate small/medium MoE models (such as the recent A3B from Qwen) on most ordinary hardware; they'll just be very slow. But if you're willing to wait, you can get a feel for their real capabilities and then invest in what it takes to make them usable. (Usually, running them on OpenRouter will be cheaper than investing in your own homelab: even if you're literally running them 24/7, the break-even point compared to a third-party service is unrealistically far off.)
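For anyone unfamiliar with what mmap buys you here, a minimal Python sketch of the general idea follows. It is not how any particular inference runtime does it, and `read_slice` and `make_demo_file` are made-up names for illustration. The file is mapped into the address space and the OS pages in only the regions you actually touch, which is why a weights file larger than RAM can still be usable, just slowly.

```python
import mmap
import os
import tempfile


def make_demo_file():
    """Write a 1 MiB stand-in for a weights file (hypothetical demo data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        # Repeating 0..255 pattern so any offset has a predictable value.
        f.write(bytes(range(256)) * 4096)
    return path


def read_slice(path, offset, length):
    """Return `length` bytes at `offset` without reading the whole file.

    The mapping covers the entire file, but the OS only loads the pages
    backing the slice that is actually accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return bytes(mm[offset:offset + length])
```

Calling `read_slice(make_demo_file(), 256, 4)` touches only the page containing that offset. With an MoE model the analogous effect is that only the experts actually routed to need to be paged in for a given token.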

frumiousirc · 2026-03-08T13:18:05 1772975885

Subjectively, but with tests using identical prompts, I find the quality of qwen3.5 122b below claude haiku by as much as claude haiku is below claude sonnet for software design planning tasks. I have yet to try a like-for-like test on coding.

nickjj · 2026-03-07T13:58:18 1772891898

> But, this also all depends on the experience level of the developer. If you are gonna vibe code,

Where I find it struggles is when I prompt it with things like this:

> I'm using the latest version of Walker (app launcher on Linux) on Arch Linux from the AUR, here is a shell script I wrote to generate a dynamic dmenu based menu which gets sent in as input to walker. This is working perfectly but now I want to display this menu in 2 columns instead of 1. I want these to be real columns, not string padding single columns because I want to individually select them. Walker supports multi-column menus based on the symbol menu using multiple columns. What would I need to change to do this? For clarity, I only want this specific custom menu to be multi-column not all menus. Make the smallest change possible or if this strategy is not compatible with this feature, provide an example on how to do it in other ways.

This is something I tried hacking on for an hour yesterday and it led me down rabbit hole after rabbit hole of incorrect information, commands that didn't exist, flags that didn't exist and so on.

I also sometimes have oddball problems I want to solve where I know awk or jq can do it pretty cleanly but I don't really know the syntax off the top of my head. It fails so many times here. Once in a while it will work but it involves dozens of prompts and getting a lot of responses from it like "oh, you're right, I know xyz exists, sorry for not providing that earlier".

I get no value from it if I know the space of the problem at a very good level because then I'd write it unassisted. This is coming at things from the perspective of having ~20 years of general programming experience.

Most of the problems I give it are one-off standalone scripts of ~100-200 lines or less. I would have thought this is the best-case scenario for it because it doesn't need to know anything beyond the scope of that. There's no elaborate project structure or context involving many files / abstractions.

I don't think I'm cut out for using AI. If I paid for it and it didn't provide the solution I was asking for, I would expect a refund, the same way I would if I bought a hammer from the store and it turned into spaghetti when I tried to use it. That's not what I bought it for.

frumiousirc · 2026-03-08T13:14:45 1772975685

What LLM are you using? What you describe should be no problem for gemini free or claude haiku and above. Other models, I dunno.

nickjj · 2026-03-09T12:27:31 1773059251

Both ChatGPT's anonymous one and Google's "AI mode" on their search page, which brings you to a dedicated page to start prompting. I'm not sure if that's Gemini proper because if I go to https://gemini.google.com/app it doesn't have my history.

frumiousirc · 2026-03-11T09:53:01 1773222781

The "AI mode" in Google search is pretty bad for programming. It is not Gemini.

I don't have direct experience with ChatGPT, but the people I've talked to who do place it behind the Gemini and Claude models.

Try free Claude or Gemini on the web and see if you have a better experience. Claude free is better than Gemini free. (Actually, Gemini free seems extra dumb lately.)

nickjj · 2026-03-11T22:47:30 1773269250

Thanks, I tried both and the results were not good IMO.

I gave them the same prompts. Both failed to give a working solution. I lost track of how many times it said "This is the guaranteed to work final solution" which still had the same problem as the 5 previous failures.

I gave up after around 40 failed prompts in a row where it was "Absolutely certain" it would work and that this was the "final boss" of the solution.