I can't say how big ML companies do it, but from personal experience training vision models, you can absolutely reuse the weights of barely related architectures (add more layers, switch between different normalization layers, switch between separable/full convolution, change activation functions, etc.). Even if the shapes of the weights do not match, just do whatever it takes to make them fit (repeat or crop). Of course the models will not work right away, but training goes much faster. I usually get over 10x faster convergence that way.
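Roughly like this in PyTorch (a minimal sketch; fit_tensor and transfer_weights are names I made up, and real code would want to log the mismatches):

    import torch

    def fit_tensor(src, shape):
        # Crop or tile the source tensor along each dimension until it
        # matches the target shape.
        for dim, size in enumerate(shape):
            if src.shape[dim] > size:                   # too big: crop
                src = src.narrow(dim, 0, size)
            elif src.shape[dim] < size:                 # too small: tile, then crop
                reps = [1] * src.dim()
                reps[dim] = -(-size // src.shape[dim])  # ceil division
                src = src.repeat(*reps).narrow(dim, 0, size)
        return src

    def transfer_weights(pretrained, model):
        # Copy weights by parameter name, reshaping where shapes differ.
        state = model.state_dict()
        for name, src in pretrained.state_dict().items():
            if name in state:
                state[name] = fit_tensor(src.clone(), state[name].shape)
        model.load_state_dict(state)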
It’s possible the model architecture influences how effective reusing pretrained weights is. I.e. CNNs might be a good fit for this, since the first portion is the feature extractor; you might scrap the decoder and simply retrain that.
Can’t say whether the same would work with the Transformer architecture, but I would guess some portions could potentially be reused? (there is still an encoder/feature-extraction portion)
If you’re reusing weights from an existing model, then it seems it becomes more of a “fine-tuning” exercise as opposed to training a novel foundation model.
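For the CNN case, that pattern is almost a one-liner in torchvision (a sketch; ResNet-18 and the 10-class head are arbitrary example choices):

    import torch.nn as nn
    from torchvision import models

    # Reuse the pretrained feature extractor, train only a fresh head.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in backbone.parameters():
        p.requires_grad = False                        # freeze the encoder
    backbone.fc = nn.Linear(backbone.fc.in_features, 10)  # new head, trained from scratch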
Why would the open weights providers need their own tools for agentic workflows when you can just plug their OpenAI-compatible API URL into existing tools?
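For example, with the official Python client it is just this (URL and model name are placeholders for whatever the provider documents):

    from openai import OpenAI

    # Any OpenAI-compatible endpoint works the same way.
    client = OpenAI(base_url="https://api.example.com/v1", api_key="...")
    resp = client.chat.completions.create(
        model="some-open-weights-model",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)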
> when you can just plug their OpenAI-compatible API URL into existing tools?
Only the self-hosting diehards will bother with that. Those who want to compete with Claude Code, Gemini CLI, Codex et caterva will have to provide the whole package and do it at a price point that is competitive even at low volumes - which is hard, because the big LLM providers are all subsidizing their offerings.
You need a certain level of batch parallelism to make inference efficient, but you also need enough capacity to handle request floods. Being a small provider is not easy.
I just tried it with GPT-5.1-Codex. The compression ratio is not amazing, so I’m not sure whether it really worked, but at least it ran without errors.
A few ideas on how to make it work for you:
1. You gave a link to a PDF, but you did not describe how you provided the content of the PDF to the model. It might only have read the text with something like pdftotext, which for this PDF results in a garbled mess. It is safer to convert the pages to PNG (e.g. with pdftoppm; see the snippet after this list) and let the model read the pages as images. A prompt like "Transcribe these pages as markdown." should be sufficient. If you cannot see what the model did, there is a chance it made things up.
2. You used C++, but Python is much easier to write. You can tell the model to translate the code to C++ once it works in Python.
3. Tell the model to write unit tests to verify that the individual components work as intended.
4. Use Agent Mode and tell the model to print intermediate results and judge whether the output is sensible, so it can debug the code on its own.
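For point 1, the conversion could look something like this (paths and resolution are placeholders; pdftoppm ships with poppler-utils):

    import subprocess

    # Rasterize each PDF page to PNG so the model reads rendered pages
    # instead of extracted (and possibly garbled) text.
    subprocess.run(
        ["pdftoppm", "-png", "-r", "150", "paper.pdf", "page"],
        check=True,
    )
    # Produces page-1.png, page-2.png, ... at 150 dpi.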
> I do wonder if there are any DOS vectors that need to be considered if such a large image can be defined in relatively small byte space.
You can already DoS with SVG images. Usually, the browser tab crashes before worse things happen. Most sites therefore do not allow SVG uploads - except GitHub, for some reason.
SVG is also just kind of annoying to deal with, because the image may or may not even have a size, and if it does, it can be specified in a bunch of different units. That makes it a lot harder if you want to store the size of the image or use it anywhere in your code.
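Even a best-effort size reader ends up looking like this, and it still has to give up on percentages and relative units (a rough Python sketch; the unit table is deliberately incomplete):

    import re
    import xml.etree.ElementTree as ET

    # CSS reference-pixel conversions for a few absolute units.
    UNITS_TO_PX = {"": 1.0, "px": 1.0, "pt": 96 / 72,
                   "mm": 96 / 25.4, "cm": 96 / 2.54, "in": 96.0}

    def svg_size_px(path):
        root = ET.parse(path).getroot()
        size = []
        for attr in ("width", "height"):
            value = root.get(attr)
            if value is None:
                return None              # no intrinsic size at all
            m = re.fullmatch(r"([0-9.]+)\s*([a-z%]*)", value.strip())
            if not m or m.group(2) not in UNITS_TO_PX:
                return None              # percentage, em, or unknown unit
            size.append(float(m.group(1)) * UNITS_TO_PX[m.group(2)])
        return tuple(size)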
Could you explain a bit how the code works? For example, how does it detect the correct pixel size and how does it find out how to color the (potentially misaligned) pixels?
I think reddit's moderation guideline [that <10% of a user's posts ought to be related to their own product] would help here, along with time limitations [see Y Combinator's own policy on its incubated projects posting].
With exceptions for truly exceptional users (by community consensus) // none granted here.
----
New accounts ought to have to earn downvote privileges (currently 501+ karma) before they can ever submit new links (somehow there is no current restriction on that), IMHO.
----
OP: you are obviously new here (the posts look AI-translated at a minimum, if not outright bot-written)... if your account isn't banned (which it should be, IMHO, for at least a few months): don't post again until the next monthly "What are you working on" thread, which is auto-generated (not by you).
This will require you to actually visit the homepage regularly to wait for that thread... which might give you an opportunity to learn more about this community's culture / structure / rules.
At a minimum, make the effort to abide by this community's bare-bones rules (which are publicly available).
Thanks for the feedback! I’m pretty new to posting on HN, so the writing style might be a bit rough — still figuring out the “right amount of em-dashes”.
As for the video, it plays fine on my side, but it might be restricted by Vimeo’s region or Cloudflare settings on your network. I’ll double-check the permissions to make sure everyone can view it. Thanks for the heads up!