Before Stable Diffusion, nobody released weights at all. Meta et al. only started sharing their models with the world when they realized how fast a developer ecosystem was building around the best models.

Without Stability, all of AI would still be closed and opaque.



>Before Stable Diffusion, nobody released weights at all.

That's not true. There were plenty of models with weights released by every major player before Stability.

>Without Stability, all of AI would still be closed and opaque.

Most GANs (practically the spiritual predecessors to diffusion models), for example, were available with weights. Hugging Face existed and has realistically done more to keep AI open. And again, the specific Stability release we are talking about here is not open.

Stability is great, but you are rewriting history, and doing it on the release where it makes the least sense to do so.


Nah. Dunno where this is coming from, but, infamously, the big players released no AI models for years. Rewind 18 months and all you had was GPT-3 that no one seemed to care about and Disco Diffusion-y type stuff.


You are looking at a very short slice of recent history. It has not been like that at all.


I'm all ears. I was "in the room" from 2019 on. Can't name one art model you could run on your GPU from a FAANG or OpenAI before SD, and can't name one LLM with public access before ChatGPT, much less one with weights available until LLaMA 1.

But please, do share.


OpenAI - GPT-2 (1.5B) - 2019 - https://openai.com/research/gpt-2-1-5b-release

Google - T5 - Feb 2020 - https://blog.research.google/2020/02/exploring-transfer-lear...

Both of these were and still are used heavily for ongoing research, and T5 has been found to be decently useful when fine-tuned.

Weights were available for both.
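
(For what it's worth, both are still a one-line download from the public Hugging Face hub today. A minimal sketch, assuming the `transformers` library is installed; "gpt2" and "t5-base" are the hub ids for these checkpoints:)

    # Sketch only: both 2019/2020-era checkpoints remain publicly downloadable.
    # Assumes `pip install transformers torch`.
    from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

    # OpenAI's GPT-2 (2019 release)
    gpt2_tok = AutoTokenizer.from_pretrained("gpt2")
    gpt2 = AutoModelForCausalLM.from_pretrained("gpt2")
    ids = gpt2_tok("Open weights meant", return_tensors="pt").input_ids
    out = gpt2.generate(ids, max_new_tokens=20)
    print(gpt2_tok.decode(out[0], skip_special_tokens=True))

    # Google's T5 (2020 release), often fine-tuned for downstream tasks
    t5_tok = AutoTokenizer.from_pretrained("t5-base")
    t5 = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
    ids = t5_tok("translate English to German: The weights are public.",
                 return_tensors="pt").input_ids
    print(t5_tok.decode(t5.generate(ids, max_new_tokens=20)[0],
                        skip_special_tokens=True))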



> Can't name one art model you could run on your GPU from a FAANG or OpenAI before SD

Google published dozens to promote TensorFlow:

https://experiments.withgoogle.com/font-map

https://experiments.withgoogle.com/sketch-rnn-demo

https://experiments.withgoogle.com/curator-table

https://experiments.withgoogle.com/nsynth-super

https://experiments.withgoogle.com/t-sne-map

The list goes on. Many are source-available with weights too.

> can't name one LLM with public access before ChatGPT, much less weights available till LLaMA 1.

Do any of these ring a bell?

- DistilBERT/MobileBERT/DeBERTa/RoBERTa/ALBERT

- FNet

- GPT-2/GPT-Neo/GPT-J

- Marian

- MBart

- M2M-100

- NLLB

- Electra

- T5/LongT5/Flan-T5

- XLNet

- Reformer

- ProphetNet

- Pegasus

That's not comprehensive but may be enough to jog your memory.


I understand your point.

The gap in communication is that we don't mean _literally_ no one _ever_ open-sourced models. I agree, that would be absurd. [1]

Companies, quite infamously and as is well understood, _did_ hold back their "real" generative models, even from paid access.

Take a stab at a literal definition:

- post-GPT-2 LLMs (e.g. PaLM, PaLM 2)

- art models like DALL-E, Imagen, Parti

Loosely, we had Disco Diffusion for art and GPT-3 for LLMs, then DALL-E, then Midjourney. That was over an _entire year_, and the floodgates on the private ones didn't open until after SD/ChatGPT.

[1] Thank you for the lengths you went to to highlight the best examples over a considered span of time; I would have just said something snarky :)

[2] I did not realize Flan-T5 was open-sourced a month before ChatGPT; that's fascinating. Beyond that, IMHO, we're stretching a bit: the BERTs aren't recognizable as LLMs.


All good. I've also been working on LLMs since 2019-ish, so I wanted to toss a hat in the ring for the underrepresented transformer models. They were cool (i.e., dumb), fast, and worked better than they had any right to. In a lot of ways they are the ancestors of ChatGPT and Llama, so it's important to at least bring them into the discussion.


> Can't name one art model you could run on your GPU from a FAANG or OpenAI before SD

CLIP could be used as an image generator, slowly.
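
(For anyone curious how that worked: treat the pixels themselves as learnable parameters and do gradient ascent on CLIP's text-image similarity. A bare-bones sketch, assuming `transformers` and `torch`; the actual pipelines of that era, VQGAN+CLIP and friends, added a generator and augmentations, which is why this raw version is so slow and noisy:)

    # Sketch of "CLIP as a (slow) image generator": optimize raw pixels to
    # maximize similarity with a text prompt.
    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    text = proc(text=["a watercolor painting of a fox"], return_tensors="pt")
    with torch.no_grad():
        t = model.get_text_features(**text)
        t = t / t.norm(dim=-1, keepdim=True)

    img = torch.rand(1, 3, 224, 224, requires_grad=True)  # the "canvas"
    opt = torch.optim.Adam([img], lr=0.05)

    for step in range(300):  # hundreds or thousands of steps -- "slowly"
        i = model.get_image_features(pixel_values=img.clamp(0, 1))
        i = i / i.norm(dim=-1, keepdim=True)
        loss = -(i * t).sum()  # negative cosine similarity
        opt.zero_grad()
        loss.backward()
        opt.step()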

> and can't name one LLM with public access before ChatGPT, much less weights available till LLaMA 1

InstructGPT was available on the OpenAI Playground for months before ChatGPT and was basically as capable as GPT-3; people were really missing out. Don't know of any good public-weights models from then, though.
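
(The Playground sat on top of the same public API, so it was scriptable too. A hedged sketch in the pre-1.0 openai-python style; "text-davinci-002" is the InstructGPT-series model name from memory:)

    # Sketch only: the legacy completions endpoint, as called circa 2022
    # with openai-python < 1.0.
    import openai

    openai.api_key = "sk-..."  # your API key
    resp = openai.Completion.create(
        model="text-davinci-002",  # InstructGPT-era model on the Playground
        prompt="Explain RLHF in one sentence.",
        max_tokens=60,
    )
    print(resp["choices"][0]["text"])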



In the image generation space, weights were never released for Imagen and DALL-E, but yes, you can find weights for more specialized generative models like StyleGAN (2, 3, etc.). Stable Diffusion was arguably one of the most influential open model releases, and I think the substantial investment in Stability AI is evidence of that.


There were open reproductions of DALL-E 1, like ruDALL-E.


GPT-2, GPT-J, XLNet, BERT, Longformer, and T5 were all freely available before Stable Diffusion was even a press release.


Stable Diffusion 1 contains a model OpenAI released: the CLIP encoder, trained on text/image pairs at OpenAI.

https://huggingface.co/runwayml/stable-diffusion-v1-5

https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/m...

Uploaded to Hugging Face Jan 2021

https://huggingface.co/openai/clip-vit-large-patch14
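
(Easy to verify: the text encoder SD 1.x conditions on is the text tower from that very repo, loadable standalone. A quick sketch, assuming `transformers`:)

    # The conditioning model inside SD 1.x, loaded on its own from the
    # same repo linked above. Assumes `pip install transformers torch`.
    from transformers import CLIPTextModel, CLIPTokenizer

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

    tokens = tokenizer("a photo of an astronaut riding a horse",
                       padding="max_length", max_length=77, return_tensors="pt")
    emb = text_encoder(**tokens).last_hidden_state
    print(emb.shape)  # torch.Size([1, 77, 768]) -- the UNet's text conditioning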



