PhilippGille's comments | Hacker News


Gemini 3 Pro Preview also got more expensive than 2.5 Pro.

2.5 Pro: $1.25 input, $10 output (per million tokens)

3 Pro Preview: $2 input, $12 output (per million tokens)
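
To put the increase in concrete terms, here's a rough cost comparison for a hypothetical request of 100k input tokens and 10k output tokens (my example sizes, not from the pricing page):

    # hypothetical request size: 100k input + 10k output tokens
    echo "0.1*1.25 + 0.01*10" | bc -l   # Gemini 2.5 Pro:        ~$0.225
    echo "0.1*2.00 + 0.01*12" | bc -l   # Gemini 3 Pro Preview:  ~$0.32

That's a bit over a 40% price increase for this mix of input and output.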


Literally no difference in productivity from a free or <$0.50-per-million-output OpenRouter model. All these $1.00+-per-million-output models are literal scams. No added value to the world.

5.1 Pro is great

I struggle to see where Pro is better than 5.x with Thinking. Actually prefer the latter.

There are many problems where the latter spins its wheels and Pro gets it in one go, for me. You need to give Pro full files as context, and you need to fit within its ~60k (I forget the exact number) silent context window if using it via ChatGPT. Don't have it make edits directly; have it give the execution plan back to Codex.

> He has GLM 4.5 Running at ~100 Tokens per second.

GLM 4.5 Air, to be precise. It's the smaller 106B model, not the full 355B one.

Worth mentioning when discussing token throughput.


I'm downloading DeepSeek-V3.2-Speciale now at FP8 (reportedly Gold-medal performance in the 2025 International Mathematical Olympiad and International Olympiad in Informatics).

It will fit in system RAM, and since it's a mixture-of-experts model and the experts are not too large, I can at least run it. Token/second speed will be slower, but with system memory bandwidth somewhere around 500-600 GB/s, it should feel OK.
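
As a back-of-the-envelope check (assuming roughly DeepSeek-V3-style ~37B active parameters per token, which I haven't verified for V3.2, at ~1 byte per parameter in FP8):

    # upper bound on decode speed: memory bandwidth / bytes read per token
    echo "scale=1; 550 / 37" | bc   # ~14.8 tokens/s theoretical ceiling at 550 GB/s

Real-world speed will be lower, but that order of magnitude should still feel usable.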


Check out "--n-cpu-moe" in llama.cpp if you're not familiar. That allows you to force a certain number of experts to be kept in system memory while everything else (including context cache and the parts of the model that every token touches) is kept in VRAM. You can do something like "-c128k -ngl 99 --n-cpu-moe <tuned_amt>" where you find a number that allows you to maximize VRAM usage without OOMing.

Why not? A good status page runs on a different cloud provider in a different region, specifically to not be affected at the same time.


Multiple projects for autonomous multi agent teams already exist.


Qwen2.5-VL-7B to be precise. It's a relevant difference.


Related from July:

"Linux on Snapdragon X Elite: Linaro and Tuxedo Pave the Way for ARM64 Laptops"

291 points, 217 comments

https://news.ycombinator.com/item?id=44699393


The first comment there is worth reading again, just for this sentence:

> If you want to change some settings oft[sic] the device, you need to use their terrible Electron application.


Not mentioned yet in this subthread, but worth checking out because it runs fully local: https://play.google.com/store/apps/details?id=com.stoegerit....

It's not perfect (for example, its monthly/yearly subscription detection didn't work great for me), but compared to all those apps that require trusting a third party with your banking data, it's worth a look.


You can push to any other Git server during a GitHub outage to still share work, trigger a CI job, deploy, etc., and later, when GitHub is reachable again, you push there too.

Yes, you lose some convenience (GitHub's pull request UI can't be used, for example), but you can temporarily use the other Git server's UI for that.
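
As a minimal sketch (the remote name and URL are placeholders), the fallback is just a second remote:

    # one-time setup: add a secondary remote on another Git host
    git remote add fallback git@gitlab.example.com:me/repo.git
    # during the outage, share work via the fallback
    git push fallback my-branch
    # once GitHub is reachable again, push the same commits there as usual
    git push origin my-branch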

I think their point was that you're not fully locked in to GitHub. You have the repo locally and can mirror it on any Git remote.


For sure, you don’t have to use GitHub to be that shared server.

It is awfully convenient: web interface, per-branch permissions, and such.

But you can choose a different server.


If your whole network is down, and you also don't want to connect the hosts with an Ethernet cable, you can even just push to a USB stick.
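
For example (the mount point is a placeholder), a bare repository on the stick works as a plain file-based remote:

    # on machine A: create a bare repo on the mounted stick and push to it
    git init --bare /media/usb/repo.git
    git remote add usb /media/usb/repo.git
    git push usb main
    # on machine B: plug the stick in and clone or pull from it
    git clone /media/usb/repo.git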

