Hacker News | anentropic's comments

That was my feeling: more than 'owning' uv etc., I could see this as being about getting people on board who have a proven track record of delivering developer tooling that was loved enough to get wide adoption

or `uv tool install unsloth` for a safe 'global' installation

"Together with architectural refinements, our Mamba-3 model achieves significant gains across retrieval, state-tracking, and downstream language modeling tasks. At the 1.5B scale, Mamba-3 improves average downstream accuracy by 0.6 percentage points compared to the next best model (Gated DeltaNet), with Mamba-3's MIMO variant further improving accuracy by another 1.2 points for a total 1.8 point gain. Across state-size experiments, Mamba-3 achieves comparable perplexity to Mamba-2 despite using half of its predecessor's state size. Our evaluations demonstrate Mamba-3's ability to advance the performance-efficiency Pareto frontier."

It's apparently optimised for inference efficiency

Additionally: https://www.together.ai/blog/mamba-3

"Mamba-3 SISO beats Mamba-2, Gated DeltaNet, and even Llama-3.2-1B (Transformer) on prefill+decode latency across all sequence lengths at the 1.5B scale."


I have been using this a lot lately and ... it's good.

Sometimes annoying - you can't really fire and forget (I tend to regret skipping discussion on any complex tasks). It asks a lot of questions. But I think that's partly why the results are pretty good.

The new /gsd:list-phase-assumptions command added recently has been a big help there to avoid needing a Q&A discussion on every phase - you can review and clear up any misapprehensions in one go and then tell it to plan -> execute without intervention.

It burns quite a lot of tokens reading and re-reading its own planning files at various times, but it manages context effectively.

Been using the Claude version mostly. Tried it in OpenCode too, but it's a bit buggy there.

They are working on a standalone version built on pi.dev https://github.com/gsd-build/gsd-2 ...the rationale is good I guess, but it's unfortunate that you can't then use your Claude Max credits with it, as it has to use the API.


I'm curious how these approaches compare with MRDTs implemented in Irmin

https://gowthamk.github.io/docs/mrdt.pdf


Is any of this stuff sort of out there working and I maybe used it without realising?

Or it's all super niche for "personal website-based social networking" enthusiasts and never took off, because big players didn't implement it and we need them to, or whatever?


Yeah...

https://github.com/opengraviton/graviton?tab=readme-ov-file#...

the benchmarks don't show any results for using these larger-than-memory models, only the size difference

it all smells quite sloppy


This vs JTD?

Arguably the original name was the newspeak and the new name is more honest

Do you have a nice way to let it 'use the app' or receive visual feedback?

I imagine that would help the process a lot

