I'm running Fedora Silverblue as my host OS; this is the kernel:
$ uname -a
Linux fedora 6.18.9-200.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Feb 6 21:43:09 UTC 2026 x86_64 GNU/Linux
You also need to set a few kernel command-line parameters to allow the GPU to use most of your memory as graphics memory. I have the following in my kernel command line; those are each 110 GiB expressed as a number of pages (I figure leaving 18 GiB or so for CPU memory is probably a good idea):
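A minimal sketch of what this looks like, assuming the usual TTM knobs for AMD unified-memory machines (ttm.pages_limit and ttm.page_pool_size; 110 GiB at 4 KiB per page works out to 28835840 pages). On Silverblue, kernel arguments are managed through rpm-ostree:

$ rpm-ostree kargs --append="ttm.pages_limit=28835840" \
    --append="ttm.page_pool_size=28835840"
$ systemctl reboot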
Then I'm running llama.cpp via the official llama.cpp Docker containers. The Vulkan one works out of the box. I had to build the ROCm container myself: the official llama.cpp container ships ROCm 7.0, but I need 7.2 to be compatible with my kernel. I haven't directly compared speeds between Vulkan and ROCm yet; I'm pretty much at the point where I've just gotten everything working.
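Launching the Vulkan container looks something like this (the model path and port are illustrative; the GPU is passed through via /dev/dri):

$ docker run --rm -it --device /dev/dri \
    -v ~/models:/models -p 8080:8080 \
    ghcr.io/ggml-org/llama.cpp:server-vulkan \
    -m /models/model.gguf --host 0.0.0.0 --port 8080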
(As mentioned, I'm still just getting this set up, so I've been moving between using `-hf` to pull directly from HuggingFace and using `uvx hf download` in advance; sorry that these commands are a bit messy. The problem with using `-hf` in llama.cpp is that you'll sometimes get surprise updates where it has to download many gigabytes before starting up.)
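To illustrate the two approaches (the repo and file names are placeholders):

# pull straight from Hugging Face at startup; may re-download when the repo updates
$ llama-server -hf some-org/some-model-GGUF

# or fetch in advance and point at the local file
$ uvx hf download some-org/some-model-GGUF --local-dir ~/models
$ llama-server -m ~/models/some-model.gguf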
This is huge. There have been 3rd-party Signal libraries for this for years -- and for some reason I can't determine, the developers have opted NOT to do this.
In the 80s I read all the comic compilations from the late 50s -> 70s; that was the golden age of the strip. It was an amazing comic, and you'll see why all the strip creators since then were inspired by it.
For all my constant freak-outs about AI in general, it turned out to be a godsend last year when my wife's mom was hospitalized (and passed away a few weeks later). Multimodal ChatGPT had just become available on mobile, so being able to feed it photos of her vital sign monitors to figure out what was going on, have it translate what the doctors were telling us in real time, and explain things clearly made an incredible difference. I even used it to interpret legal documents and compare them with what the attorneys were telling us -- again, super helpful.
And when the bills started coming in, it helped there too. Hard to say if we actually saved anything — but it certainly didn’t hurt.
Doubters say it's not as accurate, or that it could hallucinate. But the thing about hiring professionals is that you have to trust them blindly, because you'd need a professional level of knowledge yourself to judge who is competent.
LLMs are a good way to double-check whether the service you're getting is about right, or to steer them back onto the right hypothesis when they have some confirmation bias. This assumes you know how to prompt with plenty of information and open questions that don't contain leading presuppositions.
An LLM read my wife's blood lab results and found something the doctor was ignoring.
This post resurfaced a thought I had. MSFT is really, really pushing AI. It would be really cool if someone attempted, with any of the coding models / agents, to recreate Windows from "scratch". THAT would be very interesting, and useful -- on many levels.
When I ran the Python Meetup here in Phoenix, an engineer from Intel's compilers group would show up all the time. I remember he was constantly frustrated that Intel management would purposely downplay and cripple advances in the Atom processor line because they thought it would be "too good" and cannibalize their desktop lines. This was over 15 years ago -- I was hearing this in real time. He flat-out said that Intel considered the mobile market a joke.