Before Stable Diffusion, nobody released weights at all. Meta et al. only started sharing their models with the world when they realized how fast a developer ecosystem was building around the best models.

Without Stability, all of AI would still be closed and opaque.



>Before Stable Diffusion, nobody released weights at all.

That's not true. There were plenty of models with weights released by every major player before Stability.

>Without Stability, all of AI would still be closed and opaque.

Most GANs (practically the spiritual predecessors to diffusion models), for example, were available with weights. Hugging Face existed and has realistically done more to keep AI open. And again, the specific Stability release we are talking about here is not open.

Stability is great, but you are rewriting history, and doing it on the release where it makes the least sense to do so.


Nah. Dunno where this is coming from, but, infamously, the big players released no AI models for years. Rewind 18 months and all you had was GPT-3 that no one seemed to care about and Disco Diffusion-y type stuff.


You are looking at a very short slice of recent history. It has not been like that at all.


I'm all ears. I was "in the room" from 2019 on. Can't name one art model you could run on your GPU from a FAANG or OpenAI before SD, and can't name one LLM with public access before ChatGPT, much less one with weights available until LLaMA 1.

But please, do share.


OpenAI - GPT-2 (1.5B) - 2019 - https://openai.com/research/gpt-2-1-5b-release

Google - T5 - Feb 2020 - https://blog.research.google/2020/02/exploring-transfer-lear...

Both of these were and still are used heavily for ongoing research, and T5 has been found to be decently useful when fine-tuned.

Weights were available for both.
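
(For what it's worth, both are still a one-line download from the public Hugging Face hub today. A minimal sketch, assuming the `transformers` library is installed; "gpt2" and "t5-base" are the hub ids for these checkpoints:)

    # Sketch only: both 2019/2020-era checkpoints remain publicly downloadable.
    # Assumes `pip install transformers torch`.
    from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

    # OpenAI's GPT-2 (2019 release)
    gpt2_tok = AutoTokenizer.from_pretrained("gpt2")
    gpt2 = AutoModelForCausalLM.from_pretrained("gpt2")
    ids = gpt2_tok("Open weights meant", return_tensors="pt").input_ids
    out = gpt2.generate(ids, max_new_tokens=20)
    print(gpt2_tok.decode(out[0], skip_special_tokens=True))

    # Google's T5 (2020 release), often fine-tuned for downstream tasks
    t5_tok = AutoTokenizer.from_pretrained("t5-base")
    t5 = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
    ids = t5_tok("translate English to German: The weights are public.",
                 return_tensors="pt").input_ids
    print(t5_tok.decode(t5.generate(ids, max_new_tokens=20)[0],
                        skip_special_tokens=True))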



> Can't name one art model you could run on your GPU from a FAANG or OpenAI before SD

Google published dozens to promote TensorFlow:

https://experiments.withgoogle.com/font-map

https://experiments.withgoogle.com/sketch-rnn-demo

https://experiments.withgoogle.com/curator-table

https://experiments.withgoogle.com/nsynth-super

https://experiments.withgoogle.com/t-sne-map

The list goes on. Many are source-available with weights too.

> can't name one LLM with public access before ChatGPT, much less weights available till LLaMA 1.

Do any of these ring a bell?

- DistilBERT/MobileBERT/DeBERTa/RoBERTa/ALBERT

- FNet

- GPT-2/GPT-Neo/GPT-J

- Marian

- MBart

- M2M-100

- NLLB

- Electra

- T5/LongT5/Flan-T5

- XLNet

- Reformer

- ProphetNet

- Pegasus

That's not comprehensive but may be enough to jog your memory.


I understand your point.

The gap in communication is that we don't mean _literally_ no one _ever_ open-sourced models. I agree, that would be absurd. [1]

Companies, quite infamously and as is well understood, _did_ hold back their "real" generative models, even from paid access.

Take a stab at a literal definition:

- post-GPT-2 LLMs (e.g. PaLM, PaLM 2)

- art models like DALL-E, Imagen, Parti

Loosely, we had Disco Diffusion for art and GPT-3 for LLMs, then DALL-E, then Midjourney. That was over an _entire year_, and the floodgates on the private ones didn't open until after SD/ChatGPT.

[1] Thank you for the lengths you went to to highlight the best examples over a considered span of time; I would have just said something snarky :)

[2] I did not realize Flan-T5 was open-sourced a month before ChatGPT; that's fascinating. Beyond that, IMHO, we're stretching a bit: the BERTs aren't recognizable as LLMs.


All good. I've also been working on LLMs since 2019-ish, so I wanted to toss a hat in the ring for the underrepresented transformer models. They were cool (i.e., dumb), fast, and worked better than they had any right to. In a lot of ways they are the ancestors of ChatGPT and Llama, so it's important to at least bring them into the discussion.


> Can't name one art model you could run on your GPU from a FAANG or OpenAI before SD

CLIP could be used as an image generator, slowly.
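
(For anyone curious how that worked: treat the pixels themselves as learnable parameters and do gradient ascent on CLIP's text-image similarity. A bare-bones sketch, assuming `transformers` and `torch`; the actual pipelines of that era, VQGAN+CLIP and friends, added a generator and augmentations, which is why this raw version is so slow and noisy:)

    # Sketch of "CLIP as a (slow) image generator": optimize raw pixels to
    # maximize similarity with a text prompt.
    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    text = proc(text=["a watercolor painting of a fox"], return_tensors="pt")
    with torch.no_grad():
        t = model.get_text_features(**text)
        t = t / t.norm(dim=-1, keepdim=True)

    img = torch.rand(1, 3, 224, 224, requires_grad=True)  # the "canvas"
    opt = torch.optim.Adam([img], lr=0.05)

    for step in range(300):  # hundreds or thousands of steps -- "slowly"
        i = model.get_image_features(pixel_values=img.clamp(0, 1))
        i = i / i.norm(dim=-1, keepdim=True)
        loss = -(i * t).sum()  # negative cosine similarity
        opt.zero_grad()
        loss.backward()
        opt.step()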

> and can't name one LLM with public access before ChatGPT, much less weights available till LLaMA 1

InstructGPT was available on the OpenAI Playground for months before ChatGPT and was basically as capable as GPT-3; people were really missing out. Don't know of any good public-weights models from then, though.
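
(The Playground sat on top of the same public API, so it was scriptable too. A hedged sketch in the pre-1.0 openai-python style; "text-davinci-002" is the InstructGPT-series model name from memory:)

    # Sketch only: the legacy completions endpoint, as called circa 2022
    # with openai-python < 1.0.
    import openai

    openai.api_key = "sk-..."  # your API key
    resp = openai.Completion.create(
        model="text-davinci-002",  # InstructGPT-era model on the Playground
        prompt="Explain RLHF in one sentence.",
        max_tokens=60,
    )
    print(resp["choices"][0]["text"])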



In the image generation space, weights were never released for Imagen and DALL-E, but yes, you can find weights for more specialized generative models like StyleGAN (2, 3, etc.). Stable Diffusion was arguably one of the most influential open model releases, and I think the substantial investment in Stability AI is evidence of that.


There were open reproductions of DALL-E 1, like ruDALL-E.


GPT-2, GPT-J, XLNet, BERT, Longformer, and T5 were all freely available before Stable Diffusion was even a press release.


Stable Diffusion 1 contains a model OpenAI released: the CLIP encoder, trained on text/image pairs at OpenAI.

https://huggingface.co/runwayml/stable-diffusion-v1-5

https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/m...

Uploaded to Hugging Face Jan 2021

https://huggingface.co/openai/clip-vit-large-patch14
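
(Easy to verify: the text encoder SD 1.x conditions on is the text tower from that very repo, loadable standalone. A quick sketch, assuming `transformers`:)

    # The conditioning model inside SD 1.x, loaded on its own from the
    # same repo linked above. Assumes `pip install transformers torch`.
    from transformers import CLIPTextModel, CLIPTokenizer

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

    tokens = tokenizer("a photo of an astronaut riding a horse",
                       padding="max_length", max_length=77, return_tensors="pt")
    emb = text_encoder(**tokens).last_hidden_state
    print(emb.shape)  # torch.Size([1, 77, 768]) -- the UNet's text conditioning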



