rubiquity · 2026-04-19T15:40:12 1776613212

He doesn't work at Vercel but he is the type to never pass up any opportunity to chase clout.

dankwizard · 2026-04-20T01:13:12 1776647592

He is affiliated with Vercel though

threecheese · 2026-04-19T15:57:08 1776614228

Almost like that’s his job.

Hey, I’m with you - I think social media needs to die specifically for this reason. I’m reminded of the term “snake oil” - it’s like the dawn of newspapers again.

TiredOfLife · 2026-04-19T19:03:33 1776625413

Media as a whole needs to die

hoppyhoppy2 · 2026-04-19T23:28:34 1776641314

Including books and the internet?

rubiquity · 2026-04-17T21:31:31 1776461491

I have both HW3 (2021 Y) and HW4 (2025 3). FSD on HW4 is a delight. FSD on HW3 phantom brakes constantly, both back when FSD was a pile of C++ and now with the "Lite" driving model. I don't see how Tesla can ever make FSD suitable on HW3 given the hardware (<200 TOPS).

rubiquity · 2026-04-16T20:39:10 1776371950

Have you tried running llama.cpp with Unified Memory Access[1] so your iGPU can seamlessly grab some of the RAM? The environment variable is prefixed with CUDA but this is not CUDA specific. It made a pretty significant difference (> 40% tg/s) on my Ryzen 7840U laptop.

1 - https://github.com/ggml-org/llama.cpp/blob/master/docs/build...

zozbot234 · 2026-04-16T20:49:22 1776372562

Your link seems to be describing a runtime environment variable; it doesn't need a separate build from source. I'm not sure though (1) why this info is in build.md, which should be specific to the building process, rather than some separate documentation; and (2) if this really isn't CUDA-specific, why the canonical GGML variable name isn't GGML_ENABLE_UNIFIED_MEMORY, with the _CUDA_ variant treated as a legacy alias. AIUI, both of these should be addressed with pull requests for llama.cpp and/or the ggml library itself.

rubiquity · 2026-04-16T21:10:29 1776373829

You are right that it is an environment variable, and that's how I have it set in my nix config. Thanks for correcting that.

Unfortunately llama.cpp is somewhat notorious for having lackluster docs. Most of the CLI tools don't even tell you what they are for.

mncharity · 2026-04-16T22:33:50 1776378830

Hmm. Perhaps there's a niche for a "The Missing Guide to llama.cpp"? Getting started, I did things like wrapping llama-cli in a pty... and only later noticing a --simple-io argument. I wonder if "living documents" are a thing yet, where LLMs keep an eye on repo and fora, and update a doc autonomously.

mncharity · 2026-04-16T22:16:41 1776377801

I hadn't tried that, thanks! I found simply defining GGML_CUDA_ENABLE_UNIFIED_MEMORY, whether 1, 0, or "", was a 10x hit to 2 t/s. Perhaps because the laptop's RAM is already so over-committed there. But with the much smaller 4B Qwen3.5-4B-Q8_0.gguf, it doubled performance from 20 to 40+ t/s! Tnx! (an old Quadro RTX 3000 rather than an iGPU)
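For anyone wanting to A/B this, here is a minimal Python sketch of toggling the variable per run. The helper names (`llama_env`, `llama_command`) are made up for illustration, and the "remove it rather than set it to 0" handling is only an assumption based on the observation above that defining the variable at all, even as 0 or "", seemed to switch the behaviour on.

```python
import os

# Variable name as discussed in this thread.
UMA_VAR = "GGML_CUDA_ENABLE_UNIFIED_MEMORY"


def llama_env(enable_uma, base=None):
    """Return a copy of the environment with the variable set or removed."""
    env = dict(os.environ if base is None else base)
    if enable_uma:
        env[UMA_VAR] = "1"
    else:
        # Assumption from the report above: merely defining the variable
        # (even as "0" or "") appeared to enable it, so drop it entirely.
        env.pop(UMA_VAR, None)
    return env


def llama_command(model_path, binary="llama-server"):
    """Build an argument list; -m is llama.cpp's model-path flag."""
    return [binary, "-m", model_path]
```

To compare runs, pass the result to something like `subprocess.run(llama_command("model.gguf"), env=llama_env(True))` and again with `llama_env(False)`, then compare the reported t/s.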

rubiquity · 2026-04-16T20:34:48 1776371688

Not sure why you're being downvoted; I guess it's because of how your reply is worded. Anyway, Qwen3.5 35B-A3B should have intelligence on par with a 10.25B parameter model, so yes, Qwen3.5 27B is still going to outperform it in terms of quality of output, especially for long horizon tasks.
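The 10.25B figure is consistent with the geometric-mean rule of thumb for MoE models, i.e. sqrt(total params × active params). That reading is an assumption; the comment does not say which estimate was used, and the rule is a rough heuristic rather than a measured result. A quick sketch (the function name is hypothetical):

```python
import math


def dense_equivalent_b(total_b, active_b):
    """Geometric mean of total and active parameter counts, in billions."""
    return math.sqrt(total_b * active_b)


# 35B total, 3B active, as in "35B-A3B"
print(round(dense_equivalent_b(35, 3), 2))  # prints 10.25
```

By this estimate a dense 27B model sits well above the MoE's roughly 10B "equivalent", which is the comparison being made above.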

rubiquity · 2026-04-16T19:24:19 1776367459

Could be on a bike path where bikes are on the left and pedestrians to the right.

rubiquity · 2026-04-06T03:17:11 1775445431

All Microsoft services are for entertainment purposes only, as in you’d have to be absolutely crazy to use them before a Microsoft sales rep has taken your execs to a steakhouse and strip club.

zahlman · 2026-04-06T04:32:35 1775449955

> All Microsoft services are for entertainment purposes only

Which is why they're all getting named Copilot now.

kuerbel · 2026-04-06T05:38:57 1775453937

Trying to get free/busy working with an exchange hybrid setup is not my kind of entertainment but I don't judge

rubiquity · 2026-04-06T03:12:32 1775445152

For one, the option of consuming services from places other than internet titans would be nice.

dlenski · 2026-04-07T16:18:11 1775578691

What exactly does that have to do with the bandwidth of one's home Internet connection?

rubiquity · 2026-03-30T20:53:59 1774904039

I was distracted by the picture of the ingredients to a Final Ward being at the top of the page.

rubiquity · 2026-03-24T17:10:57 1774372257

llama.cpp and llama-swap do this better than Ollama and with far more control.

circularfoyers · 2026-03-24T18:51:31 1774378291

Don't even need to use llama-swap anymore now that llama-server supports the same functionality.

rubiquity · 2026-03-25T00:13:37 1774397617

I did not know that. Thanks for sharing!

rubiquity · 2026-03-01T07:03:03 1772348583

This is great to hear! Out of curiosity, which brand did you go with? I tend to stick to Sapphire but the prices are within $200 of each other.

cyberax · 2026-03-02T02:13:11 1772417591

I got Sapphires because they were the ones available at the time of purchase :)