That was my feeling - more than 'owning' uv etc., I could see this as being about getting people on board who had a proven track record of delivering developer tooling that was loved enough to get wide adoption.
"Together with architectural refinements, our Mamba-3 model achieves significant gains across retrieval, state-tracking, and downstream language modeling tasks. At the 1.5B scale, Mamba-3 improves average downstream accuracy by 0.6 percentage points compared to the next best model (Gated DeltaNet), with Mamba-3's MIMO variant further improving accuracy by another 1.2 points for a total 1.8 point gain. Across state-size experiments, Mamba-3 achieves comparable perplexity to Mamba-2 despite using half of its predecessor's state size. Our evaluations demonstrate Mamba-3's ability to advance the performance-efficiency Pareto frontier."
It's apparently optimised for inference efficiency
"Mamba-3 SISO beats Mamba-2, Gated DeltaNet, and even Llama-3.2-1B (Transformer) on prefill+decode latency across all sequence lengths at the 1.5B scale."
I have been using this a lot lately and ... it's good.
Sometimes annoying - you can't really fire and forget (I tend to regret skipping discussion on any complex tasks). It asks a lot of questions. But I think that's partly why the results are pretty good.
The new /gsd:list-phase-assumptions command added recently has been a big help there to avoid needing a Q&A discussion on every phase - you can review and clear up any misapprehensions in one go and then tell it to plan -> execute without intervention.
It burns quite a lot of tokens reading and re-reading its own planning files at various times, but it manages context effectively.
Been using the Claude version mostly. Tried it in OpenCode too, but it's a bit buggy there.
They are working on a standalone version built on pi.dev https://github.com/gsd-build/gsd-2 ...the rationale is good I guess, but it's unfortunate that you can't then use your Claude Max credits with it, as it has to use the API.
Is any of this stuff actually out there working, and have I maybe used it without realising?
Or is it all super niche, for "personal website-based social networking" enthusiasts, and it never took off because the big players didn't implement it and we need them to, or whatever?