Hacker News | new | past | comments | ask | show | jobs | submit | michaelgiba's comments | login

It’s not surprising that there could be a very slight quality drop-off from making the model return its answer in a constrained way. You’re essentially forcing the model to express the actual answer it wants to express in a constrained language.

However, I would say two things:

1. I doubt this quality drop couldn’t be mitigated by first letting the model answer in its regular language and then doing a second constrained step to convert that into structured outputs.

2. For the smaller models, I have seen instances where the constrained sampling of structured outputs actually HELPS with output quality. If you can sufficiently encode information in the structure of the output, it can help the model. It can effectively let you encode simple branching mechanisms to execute at sample time.
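A minimal sketch of the two-step idea in point 1, with stand-in functions: `free_form_answer` and `constrained_extract` are hypothetical stubs here (a real pipeline would call a model for the first and a grammar-constrained decode for the second; the crude regex just simulates the extraction step):

```python
import json
import re

def free_form_answer(question):
    # Stand-in for an unconstrained model completion.
    return "The capital of France is Paris, a city of about 2 million people."

def constrained_extract(text, schema_keys):
    # Stand-in for a grammar-constrained second pass that can only
    # emit JSON matching the schema; a regex simulates the extraction.
    match = re.search(r"is (\w+)", text)
    return json.dumps({schema_keys[0]: match.group(1) if match else None})

# Step 1: let the model answer freely; step 2: constrain only the conversion.
answer = free_form_answer("What is the capital of France?")
structured = constrained_extract(answer, ["capital"])
```

The point is that the quality-sensitive reasoning happens unconstrained in step 1, and the grammar only restricts the lossless reformatting in step 2.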


> You’re essentially forcing the model to express the actual answer it wants to express in a constrained language.

You surely aren't implying that the model is sentient or has any "desire" to give an answer, right?

And how is that different from prompting in general? Isn't using English already a constraint? And isn't that what it is designed for: to work with prompts that provide limits within which to determine the output text? There is no "real" answer that you suppress by changing your prompt.

So I don't think it's a plausible explanation to say this happens because we are "making" the model return its answer in a "constrained language" at all.


> You surely aren't implying that the model is sentient or has any "desire" to give an answer, right?

The model is a probabilistic machine that was trained to generate completions and then fine-tuned to generate chat-style interactions. Given the prompt and weights, there is an output that is most likely under the model. That’s what one could call the model’s “desired” answer, if you want to anthropomorphize. When you constrain which tokens can be sampled at a given timestep, you by definition diverge from that.
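A toy illustration of that divergence, using a made-up next-token distribution at a single timestep (the token strings and logits are invented for the example):

```python
# Toy next-token logits at one decoding step.
logits = {'Yes': 3.0, ' yes': 2.0, '{': 0.5}

# Unconstrained greedy decoding picks the model's most likely token.
unconstrained = max(logits, key=logits.get)

# A JSON grammar might only permit '{' here, masking everything else.
allowed = {'{'}
constrained = max((t for t in logits if t in allowed), key=logits.get)
```

Here `unconstrained` is `'Yes'` while `constrained` is forced to `'{'`: the sampled sequence provably differs from the one the unconstrained model would have produced.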


73% of startups are just writing computer programs


Interesting idea. Although I wouldn't consider `but restrict the data set to publications from <= year 1600` "easy".

If you did have access to a high-quality pretraining dataset, you could explore training up to 1600, then up to 1610, 1620, ... 1700, and look at how the presence of calculus was learned over that period, running some tests with the intermediate models to capture the effect.
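The experiment loop could be sketched like this; `train` and `probe` are hypothetical stand-ins (the "model" is just the concatenated training text and the probe a substring check, purely to show the cumulative-cutoff structure):

```python
# Toy corpus of (publication_year, text) pairs.
corpus = [
    (1590, "euclidean geometry"),
    (1605, "symbolic algebra"),
    (1687, "method of fluxions, early calculus"),
]

def train(docs):
    # Stand-in for pretraining: the "model" is just its training text.
    return " ".join(docs)

def probe(model):
    # Stand-in for evaluating whether the concept was learned.
    return "calculus" in model

results = {}
for cutoff in range(1600, 1701, 10):
    subset = [text for year, text in corpus if year <= cutoff]
    results[cutoff] = probe(train(subset))
```

With real training runs, `results` would show when (and how gradually) the concept emerges as the cutoff advances past the relevant publications.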


I’m glad to see llamafile being resurrected. A few things I hope for:

1. Curate a continuously extended inventory of prebuilt llamafiles for models as they are released.

2. Create both flexible builds (with dynamic backend loading for CPU and CUDA) and slim minimalist builds.

3. Upstream as much as they can into llama.cpp and partner with the project.


Crazier ideas would be:

- extend the concept to also have some sort of “agent mode” where the llamafiles can launch with their own minimal file system or isolated context

- detailed profiling of main supported models to ensure deterministic outputs


Love the idea!


They stopped publishing images, not like they changed anything significant about the product itself.

Frankly, the whole thing is not newsworthy.


> not like they changed anything significant about the product itself

They had already done this months ago: they changed the licence and stripped a lot of the code from their admin UI/object browser, so a lot of features vanished overnight for people who updated. This is what the OP links to - a fork at the point that they did this. This was work already done, features already widely in use, and they decided to take it out of what was available to the community.

So the product has significantly changed and the offering has been reduced. That, in addition to stopping the publishing of images - both without notice - caused a decent bit of community 'wtf'.



For anyone curious here is an interactive write up about this http://michaelgiba.com/grammar-based/index.html


This is much more thorough, but here is an interactive post covering the related topic of constrained sampling that I put together a few weeks back:

http://michaelgiba.com/grammar-based/index.html


I was inspired by your project to start making similar multi-agent reality simulations. I’m starting with the reality game “The Traitors” because it has interesting dynamics.

https://github.com/michaelgiba/survivor (elimination game with a shoutout to your original)

https://github.com/michaelgiba/plomp (a small library I added for debugging the rollouts)


Very cool!


Nice, I’m particularly excited for the tiny models.


I like the idea but I would hesitate to upload my API keys.

why not make the prompting/orchestration pieces open source? A user could run locally to generate a result and the app could focus on displaying the results in a fun way

