Harmony: OpenAI's response format for its open-weight model series

lajr · 2025-08-05T16:38:14 1754411894

This format, or similar formats, seem to be the standard now, I was just reading the "Lessons from Building Manus"[1] post and they discuss the Hermes Format[2] which seems similar in terms of being pseudo-xml.

My initial thought was how hacky the whole thing feels, but then the fact that it works and gives rise to complex behaviour (like coercing specific tool selection in the Manus post) is quite simple and elegant.

Also as an aside, it is good that it appears that each standard tag is a single token in the OpenAI repo.

[1] https://manus.im/blog/Context-Engineering-for-AI-Agents-Less... [2] https://github.com/NousResearch/Hermes-Function-Calling

irthomasthomas · 2025-08-05T18:52:11 1754419931

Prediction: GPT-5 will use a consortium of models for parallel reasoning, possibly including their oss versions. Each using different 'channels' from the harmony spec.

I have a branch of llm-consortium where I was noodling with giving each member model a role. Only problem is it's expensive to evaluate these ideas so I put it on hold. But maybe now with oss models being cheap I can try and it on those.

nxobject · 2025-08-05T20:07:39 1754424459

Computer science's favorite move: we've reached the limits of a scaling law meant to benefit single-threaded processes, so let's go parallel...

BoredPositron · 2025-08-05T22:53:09 1754434389

we are scaling in one direction for 2 years now...

Imustaskforhelp · 2025-08-05T18:59:34 1754420374

What are your thoughts on some other model like qwen using something like this?

Pardon me but are you thinking that this method is superior than mixture of experts? What are your thoughts?

irthomasthomas · 2025-08-05T19:07:06 1754420826

I tested a consortium of qwens on the brainfuck test and it solved it, while the single models fail.

MOEs are a single model. An 'expert' is a subset of layers chosen by a router model for each token. This makes them run faster. A consortium is a type of parallel reasoning that uses multiple of the same or different models to generate parallel response and find the best one.

All models have a jagged frontier with weird skill gaps. A consortium can bridge those gaps and increase performance on the frontier.

onlyrealcuzzo · 2025-08-05T19:53:17 1754423597

Has anyone compared a consortium of leading edge 3B-20B models compared to the most powerful models?

I'd love to see how they performed.

irthomasthomas · 2025-08-05T20:22:43 1754425363

Do you have a favourite benchmark? I may just have the budget for testing some 3b models

mindwok · 2025-08-05T20:49:50 1754426990

This is what Grok 4 Heavy does with apparent success.

irthomasthomas · 2025-08-05T20:57:53 1754427473

They may have been inspired by it. It was shared by karpathy... https://x.com/karpathy/status/1870692546969735361

I wish someone would extract the Grok Heavy prompts to confirm, but I guess those jailbreakers don't have the $200 sub.

dr_dshiv · 2025-08-05T18:30:59 1754418659

Yesterday I gave a presentation on the role of harmony in AI — as a matter of philosophical interest. I’d previously written a large literature review on the concept of harmony (here: https://www.sciencedirect.com/science/article/pii/S240587262...). If you are curious about the slides, here: Bit.ly/ozora2025

I assume they are using the concept of harmony to refer to the consistent response format? Or is it their intention for an open weights release?

accrual · 2025-08-05T19:55:02 1754423702

> The format enables the model to output to multiple different channels for chain of thought, and tool calling preambles along with regular responses

That's pretty cool and seems like a logical next step to structure AI outputs. We started out with a stream of plaintext. In the future perhaps we'll have complex typed output.

Humans also emit many channels of information simutaneously. Our speech, tone of voice, body language, our appearance - it all has an impact on how our information is received by another.

obviyus · 2025-08-05T17:06:19 1754413579

Links seem to be working now:

- https://openai.com/index/introducing-gpt-oss/

- https://cdn.openai.com/pdf/419b6906-9da6-406c-a19d-1bb078ac7...

citizensinan · 2025-08-05T20:16:29 1754424989

Same here - all those links are either broken or asking for auth. Classic case of announcing something before the infrastructure is ready.

This kind of coordination failure is surprisingly common with AI releases lately. Remember when everyone was trying to access GPT-4 on launch day? Or when Anthropic's Claude had those random outages during their big announcements?

Makes you wonder if they're rushing to counter Google's Genie 3 news and got caught with their pants down during the GitHub outage. The timing seems too coincidental.

At least when it does go live, having truly open weights models will be huge for the community. Just wish they'd test their deployment pipeline before hitting 'publish' on the blog post.

deckar01 · 2025-08-05T16:38:06 1754411886

gpt-oss models are reportedly being hosted on huggingface.

https://www.bleepingcomputer.com/news/artificial-intelligenc...

bbor · 2025-08-05T17:04:16 1754413456

(as of 3 days ago)

qsort · 2025-08-05T16:34:42 1754411682

pelican when

MarcelOlsz · 2025-08-05T16:39:05 1754411945

What's pelican?

babelfish · 2025-08-05T16:39:38 1754411978

@simonw asks every new foundation model to generate an SVG of a pelican riding a bicycle as a part of his review post

echelon · 2025-08-05T16:44:18 1754412258

The foundation model companies should just learn that case and call it a day.

pythonaut_16 · 2025-08-05T17:54:11 1754416451

Yes, they should definitely Goodhardt the Pelican Test so we can... just have to invent a new test?

Spivak · 2025-08-05T18:59:58 1754420398

Yes but then you can use the pelican test in all your marketing where you say that this is the <apple slide deck voice> most capable model. ever. And then ignore the new test except as a footnote in some long dry boring evaluation.

schmidtleonard · 2025-08-05T16:57:58 1754413078

He spotted a pelican in a presentation the other week, so they're on to him and he's on to them.

unglaublich · 2025-08-05T17:42:11 1754415731

Benchmark-driven development, like Dieselgate in automotive.

righthand · 2025-08-05T18:56:30 1754420190

I hope this ends in well poisoning to where all data about pelicans is associated with a bicycle in some way to which you can't get any model to give you correct information about pelicans or bicycles but you can get a pelican riding a bicycle.

HaZeust · 2025-08-05T17:45:24 1754415924

wen pelican.... WEN BICYCLE

Scene_Cast2 · 2025-08-05T18:35:04 1754418904

I wonder how much performance is left on the table due to it not being zero-copy.

jfoster · 2025-08-05T16:25:14 1754411114

The page links to: https://gpt-oss.com/ and https://openai.com/open-models

... but these links aren't active yet. I presume they will be imminently, and I guess that means that OpenAI are releasing an open weights GPT model today?

gsibble · 2025-08-05T19:45:43 1754423143

It's weird to me that ChatGPT would release a local model that you can't plug directly into their client.....kind of defeats the purpose.

Also creates a walled garden on purpose.

throwaway314155 · 2025-08-05T16:22:13 1754410933

what's this for?

koakuma-chan · 2025-08-05T16:39:32 1754411972

Basically, LLMs are trained with a specific conversation format, and if your input does not follow that format, the LLM will perform poorly. We usually don't have to worry about this because their API automatically puts our input into the proper format, but I guess now that they open sourced a model, they are also releasing the corresponding format.

babelfish · 2025-08-05T16:22:28 1754410948

read the README

FergusArgyll · 2025-08-05T16:25:27 1754411127

None of their links work?

- https://gpt-oss.com/ Auth required?

- https://openai.com/open-models/ seems empty?

- https://cookbook.openai.com/topic/gpt-oss 404

- https://openai.com/index/gpt-oss-model-card/ empty page?

Am I holding the internet wrong?

jfoster · 2025-08-05T16:26:46 1754411206

I think they're currently doing the release. I am guessing those will all be online soon.

minimaxir · 2025-08-05T16:46:47 1754412407

The new transformers release describes the model: https://github.com/huggingface/transformers/releases/tag/v4....

> GPT OSS is a hugely anticipated open-weights release by OpenAI, designed for powerful reasoning, agentic tasks, and versatile developer use cases. It comprises two models: a big one with 117B parameters (gpt-oss-120b), and a smaller one with 21B parameters (gpt-oss-20b). Both are mixture-of-experts (MoEs) and use a 4-bit quantization scheme (MXFP4), enabling fast inference (thanks to fewer active parameters, see details below) while keeping resource usage low. The large model fits on a single H100 GPU, while the small one runs within 16GB of memory and is perfect for consumer hardware and on-device applications.

trenchpilgrim · 2025-08-05T16:35:03 1754411703

Presumably they use GitHub and their release process is delayed by the current GitHub outage.

skhameneh · 2025-08-05T16:54:16 1754412856

Apparently the issue was resolved, but there's no indication there was an outage in the last 24 hours when looking at status... https://www.githubstatus.com/

Not a fan of this presentation of communication.

skhameneh · 2025-08-05T18:06:33 1754417193

The status page now reflects an issue, at time of writing it had been resolved for almost an hour and there was no indication of an issue.

bbor · 2025-08-05T17:03:40 1754413420

IDK I think this is on purpose: https://nitter.net/sama/status/1952759361417466016#m

  we have a lot of new stuff for you over the next few days!
  something big-but-small today.
  and then a big upgrade later this week.

EDIT: nevermind, I spoke too soon! I guess this was referring to GPT 5 later this week. https://openai.com/open-models/ is live

echelon · 2025-08-05T16:42:44 1754412164

Cosmically bad timing.

stronglikedan · 2025-08-05T18:46:12 1754419572

> Am I holding the internet wrong?

Likely, considering every single one opens right up for me.

stillpointlab · 2025-08-05T16:33:05 1754411585

Also https://cookbook.openai.com/articles/openai-harmony is referenced 3 times in the README but it is 404

Imustaskforhelp · 2025-08-05T18:56:46 1754420206

the link does work now for what its worth

wilg · 2025-08-05T18:03:45 1754417025

Every OpenAI announcement has threads of people complaining that the links don't work yet as if you can trivially deploy 10 different interconnected websites completely instantly.

guluarte · 2025-08-05T17:10:33 1754413833

https://ollama.com/library/gpt-oss

FergusArgyll · 2025-08-05T16:26:27 1754411187

Does seem like we're gonna get open weights models today tho

paxys · 2025-08-05T16:47:07 1754412427

I'm guessing someone published the github repo too early.

echelon · 2025-08-05T16:50:29 1754412629

GitHub is having an outage.

OpenAI might have tried coordinating the press release of their open model to counter Google Genie 3 news but got stuck in the middle of the outage.

Bluestein · 2025-08-05T18:49:34 1754419774

GitHub got hugged to death by OpenAI :)

rvz · 2025-08-05T16:36:32 1754411792

> Am I holding the internet wrong?

The GitHub outage is delaying them on their release.