anentropic · 2026-03-19T18:38:15 1773945495

That was my feeling: more than 'owning' uv etc., I could see this as being about getting people on board who have a proven track record of delivering developer tooling that was loved enough to get wide adoption.

anentropic · 2026-03-18T09:15:15 1773825315

or `uv tool install unsloth` for a safe 'global' installation

anentropic · 2026-03-18T08:44:19 1773823459

"Together with architectural refinements, our Mamba-3 model achieves significant gains across retrieval, state-tracking, and downstream language modeling tasks. At the 1.5B scale, Mamba-3 improves average downstream accuracy by 0.6 percentage points compared to the next best model (Gated DeltaNet), with Mamba-3's MIMO variant further improving accuracy by another 1.2 points for a total 1.8 point gain. Across state-size experiments, Mamba-3 achieves comparable perplexity to Mamba-2 despite using half of its predecessor's state size. Our evaluations demonstrate Mamba-3's ability to advance the performance-efficiency Pareto frontier."

It's apparently optimised for inference efficiency

Additionally: https://www.together.ai/blog/mamba-3

"Mamba-3 SISO beats Mamba-2, Gated DeltaNet, and even Llama-3.2-1B (Transformer) on prefill+decode latency across all sequence lengths at the 1.5B scale."

anentropic · 2026-03-18T08:33:10 1773822790

I have been using this a lot lately and ... it's good.

Sometimes annoying - you can't really fire and forget (I tend to regret skipping discussion on any complex tasks). It asks a lot of questions. But I think that's partly why the results are pretty good.

The new /gsd:list-phase-assumptions command added recently has been a big help there to avoid needing a Q&A discussion on every phase - you can review and clear up any misapprehensions in one go and then tell it to plan -> execute without intervention.

It burns quite a lot of tokens reading and re-reading its own planning files at various times, but it manages context effectively.

Been using the Claude version mostly. Tried it in OpenCode too, but it is a bit buggy there.

They are working on a standalone version built on pi.dev: https://github.com/gsd-build/gsd-2 ... The rationale is good, I guess, but it's unfortunate that you then can't use your Claude Max credits with it, as it has to use the API.

anentropic · 2026-03-16T10:10:06 1773655806

I'm curious how these approaches compare with MRDTs implemented in Irmin

https://gowthamk.github.io/docs/mrdt.pdf

anentropic · 2026-03-12T15:15:09 1773328509

Is any of this stuff actually out there and working, such that I may have used it without realising?

Or is it all super niche for "personal website-based social networking" enthusiasts and never took off, because big players didn't implement it and we need them to, or whatever?

anentropic · 2026-03-09T16:17:24 1773073044

Yeah...

https://github.com/opengraviton/graviton?tab=readme-ov-file#...

the benchmarks don't show any results for using these larger-than-memory models, only the size difference

it all smells quite sloppy

anentropic · 2026-03-09T10:55:46 1773053746

This vs JTD?

anentropic · 2026-03-06T11:27:41 1772796461

Arguably the original name was the newspeak and the new name is more honest

anentropic · 2026-03-04T23:23:12 1772666592

Do you have a nice way to let it 'use the app' or receive visual feedback?

I imagine that would help the process a lot