There isn't, though you can run it via WASM. I tried that a while back with a port of the w2c2 transpiler (https://github.com/euclaise/w2c9/), but something like wazero is a more obvious choice.
This is not exactly propaganda in the typical sense, but it is clearly the case that people successfully edit Wikipedia to further their objectives. As an example, the Wikipedia page for Meta-analysis (not even that obscure a topic) currently contains content that plausibly seems intended to promote Suhail Doi's methods, and it appears to have been like this for a number of years. It cites 5 of his papers, more than anyone else's, the most-cited of which has 297 citations. It has a subsection devoted to his method of meta-analysis, despite it being a rather obscure and rarely used method. Additional subsections, also focused on somewhat obscure areas, have been added over time, and frankly those additions are sketchy in similar ways.
In general, it is not uncommon to come across slanted content. Is it completely, 100% clear that Doi came along and maliciously added his own papers? Not quite, but good propaganda wouldn't be clear either; it would actually look far less suspicious than this.
Yes, but the claim is about "unlimited context length." I doubt attention over each segment can be as good at recall as attention over the full input context.
A lot of embedding models are built on top of T5's encoder; this offers a new option.
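For a concrete picture, here's a minimal sketch of that pattern, assuming the Hugging Face transformers API. The t5-base checkpoint and mean pooling are illustrative choices on my part, not what any particular embedding model necessarily ships:

```python
# Sketch: sentence embeddings from T5's encoder alone, no decoder involved.
# Assumes Hugging Face transformers; model name and pooling are illustrative.
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("t5-base")
encoder = T5EncoderModel.from_pretrained("t5-base")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state  # (batch, seq, dim)
    # Mean-pool over non-padding tokens to get one vector per input.
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

print(embed(["an example sentence", "another one"]).shape)  # (2, 768) for t5-base
```

Embedding models like Sentence-T5 and GTR are built along broadly these lines, with their own pooling schemes and training objectives on top.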
The modularity of the enc-dec approach is useful: you can insert additional models in between (e.g. a diffusion model), use different encoders for different modalities, etc.
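As a rough illustration of that modularity, here's a hedged sketch (again assuming the Hugging Face transformers API): run the encoder by itself, transform its hidden states at the seam, then hand the result to the decoder. The identity pass-through below is a placeholder for whatever model you'd actually insert:

```python
# Sketch of the enc-dec seam: encode, intervene on hidden states, decode.
# Assumes Hugging Face transformers; the "adapter" step is a placeholder.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

inputs = tokenizer("translate English to German: Hello.", return_tensors="pt")

# Step 1: run the encoder on its own.
with torch.no_grad():
    enc = model.get_encoder()(**inputs)

# Step 2: this is where an inserted model would sit (a diffusion model,
# a projection from a different modality's encoder, ...). Identity for now:
adapted = BaseModelOutput(last_hidden_state=enc.last_hidden_state)

# Step 3: decode from the (possibly modified) encoder states.
out = model.generate(encoder_outputs=adapted,
                     attention_mask=inputs["attention_mask"],
                     max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Because the decoder only sees `encoder_outputs`, anything that produces hidden states of the right shape can feed it, which is the modularity being described.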
There's a new 7B version trained on more tokens and with a longer context, and there's now a 14B version that competes with Llama 34B on some benchmarks.
To be fair, it's a stupid distraction from discussing the model. If every thread just turned into politics, it would not make for good discussion. People can start threads about the specific ideologies that language models have, and those can be discussed there (and have been). Bringing it up every time a model is discussed feels off-topic and legitimate to flag (I didn't flag it).
Edit: but now I see the thread has basically been ruined; it's going to be about politics instead of anything new and interesting about the model. Congrats, everyone.
Alignment is also a distraction; it's OpenAI marketing and something people who don't understand ML talk about, not a serious topic.
Like I said, discussing model politics has a place, but bringing it up every time a model is mentioned is distracting and prevents adult discussion. It would be as if, every time a company came up, the thread got spammed with discussion of the worst thing that company has ever done instead of discussing it in context.
The condescension is unfounded and unnecessary. Discussion of the usefulness of a model or its interface also includes these topics. If a model's refusal to discuss a topic comes from anything other than the topic simply being absent from the training data, that's highly interesting from multiple perspectives.
For example, ChatGPT's practical usability is regularly hobbled by alignment concerns, notably more so in 3.5 than in 4. It's a worthy topic, not a distraction, and characterizing it as something other than 'adult discussion' is nothing more than an egocentric encoding of your specific interests into a ranking of importance you impose on others. A little humility goes a long way.
We’re here to be curious and that includes addressing misconceptions, oversights and incorrect assumptions. That all still counts as adult discussion.
Imagine if a Chinese company releases a model that kicks the state of the art's ass and everyone starts using it because it works so well. Now the censorship has leaked into all the systems that use it.
Any model produced by a North American or European company, even X, may be trained in a way that reflects the company's idea of political correctness (some lean left, some lean right), but the set of topics censored by such a model will still be far smaller than for a model created by a Chinese or Russian company. This is because, for a company to survive under a totalitarian government, it must bend to and satisfy every filtering request from that government.