"Good enough" open weights models were "almost there" since 2022.
I distrust the notion. The bar of "good enough" seems to be bolted to "like today's frontier models", and frontier model performance only ever goes up.
Methane has good energy density, doesn't demand cryogenics or diffuse through steel, burns very cleanly, and can be used in modified gasoline ICEs - without even sacrificing the gasoline fuel capability.
Without cryogenics, methane has such low energy density that a low-pressure fuel tank would still have to be as big as a bus for your compact methane-powered vehicle to go as far as you could on a few gallons of gasoline.
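For rough scale, here's a back-of-envelope check. All figures are approximate assumptions (roughly 32 MJ/L for gasoline, roughly 36 MJ/m^3 for methane at atmospheric pressure, ideal-gas scaling with pressure), so treat it as a sketch of the volumetric penalty, not a spec sheet:

```python
# Back-of-envelope (round numbers, ideal-gas approximation, all figures assumed):
# how big does a low-pressure methane tank get vs. a small gasoline tank?

GASOLINE_MJ_PER_L = 32.0       # approx. volumetric lower heating value of gasoline
METHANE_MJ_PER_L_1ATM = 0.036  # approx. ~36 MJ/m^3 for methane at ~1 atm
TARGET_GASOLINE_LITRES = 11.0  # "a few gallons" ~ 3 US gal

target_energy_mj = GASOLINE_MJ_PER_L * TARGET_GASOLINE_LITRES  # ~350 MJ on board

for pressure_bar in (1, 10, 250):  # 250 bar = typical CNG tank, needs a heavy pressure vessel
    # ideal-gas shortcut: volumetric energy density scales roughly linearly with pressure
    methane_mj_per_l = METHANE_MJ_PER_L_1ATM * pressure_bar
    tank_litres = target_energy_mj / methane_mj_per_l
    print(f"{pressure_bar:>4} bar: ~{tank_litres:,.0f} L tank for the same energy")

# Roughly ~10,000 L at 1 bar, ~1,000 L at 10 bar, ~40 L at 250 bar --
# only serious compression gets you back to car-sized tanks.
```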
In general, "green hydrogen" makes the most sense if used as a chemical feedstock that replace natural gas in industrial processes - not to replace fossil fuels or be burned for heat.
On paper, hydrogen has good energy density, but taking advantage of that in practice is notoriously hard. And for things that demand energy-dense fuels, there are many less finicky alternatives.
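To put rough numbers on the "good on paper" part (all values here are approximate assumptions): hydrogen wins decisively per kilogram and loses decisively per litre of fuel as actually stored, and the storage is exactly where the finicky part lives.

```python
# Approximate, assumed figures: gravimetric vs. volumetric energy density as stored.

fuels = {
    # name: (MJ per kg, MJ per litre as actually stored)
    "gasoline":           (44.0, 32.0),
    "hydrogen @ 700 bar": (120.0, 4.8),   # plus a heavy carbon-fibre pressure vessel
    "liquid hydrogen":    (120.0, 8.5),   # plus cryogenics at ~20 K and boil-off losses
}

for name, (per_kg, per_litre) in fuels.items():
    print(f"{name:<20} {per_kg:>6.1f} MJ/kg   {per_litre:>5.1f} MJ/L")
```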
(Not GP) There was a well-recognized reproducibility problem in the ML field before LLM-mania, and that's considering published papers with proper peer review. The current state of affairs is in some ways even less rigorous than that, and some people in the field feel free to overextend their conclusions into other fields like neuroscience.
We're in the "mad science" regime because the current speed of progress means adding rigor would sacrifice velocity. Preprints are the lifeblood of the field because preprints can be put out there earlier and start contributing earlier.
Anthropic, much as you hate them, has some of the best mechanistic interpretability researchers and AI wranglers across the entire industry. When they find things, they find things. Your "not scientifically rigorous" is just a flimsy excuse to dismiss the findings that make you deeply uncomfortable.
Strange that they raised money at all with an idea like this.
It's a bad idea that can't work well. Not while the field is advancing the way it is.
Manufacturing silicon is a long pipeline - and in the world of AI, one year of capability gap isn't something you can afford. You build a SOTA model into your chips, and by the time you get those chips, it's outperformed at its tasks by open weights models half its size.
Now, if AI progress somehow ground to a screeching halt, with model upgrades coming out every 4 years instead of every 4 months? Maybe it'd be viable. As is, it's a waste of silicon.
The prototype is: silicon with a Llama 3.1 8B etched into it. Today's 4B models already outperform it.
A token rate in five digits is a major technical flex, but does anyone really need to run a very dumb model at this speed?
The only things that come to mind that could reap a benefit are asymmetric exotics like VLA action policies and voice stages for V2V models. Both are "small fast low-latency model backed by a large smart model" setups, and both depend on model-to-model comms, which this doesn't demonstrate.
In a way, it's an I/O accelerator rather than an inference engine. At best.
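For what it's worth, the "small fast low-latency model backed by a large smart model" pattern is mostly control flow. A minimal sketch might look like the following; every function here is a hypothetical stub, not any vendor's API, and the escalation step is precisely the model-to-model hand-off the chip doesn't demonstrate.

```python
# Minimal sketch of a fast-model / smart-model cascade. Both model calls are
# hypothetical stubs: one stands in for a high-token-rate local (or etched) model,
# the other for a slower, more capable backing model.

from dataclasses import dataclass

@dataclass
class Draft:
    text: str
    confidence: float  # 0..1, self-reported or from a separate verifier

def fast_small_model(prompt: str) -> Draft:
    # stand-in for the low-latency model that keeps the interaction snappy
    return Draft(text=f"[quick answer to: {prompt}]", confidence=0.62)

def slow_large_model(prompt: str, draft: str) -> str:
    # stand-in for the big model the small one defers to on hard turns
    return f"[careful answer to: {prompt}, revising draft: {draft}]"

def answer(prompt: str, threshold: float = 0.8) -> str:
    draft = fast_small_model(prompt)             # milliseconds
    if draft.confidence >= threshold:
        return draft.text                        # small model is trusted on easy turns
    return slow_large_model(prompt, draft.text)  # escalate: the model-to-model comms path

if __name__ == "__main__":
    print(answer("schedule a meeting for tomorrow at 3pm"))
```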
Even if this first generation is not useful, the learning and architecture decisions made in this generation will be. You really can't think of any value in having a chip that can run LLMs locally, at high speed, for 1/10 of the energy budget and (presumably) significantly lower cost than a GPU?
If you look at any development in computing, ASICs are the next step. It seems almost inevitable. Yes, it will always trail behind state of the art. But value will come quickly in a few generations.
Maybe they're betting on model improvements plateauing, and that having a fairly stabilized, capable model that is orders of magnitude faster than running on GPUs could be valuable in the future?
The "small model with unique custom domain knowledge" approach has a very low capability ceiling.
Model intelligence is, in many ways, a function of model size. A small model tuned for a given domain is still crippled by being small.
Some things don't benefit from general intelligence much. Sometimes a dumb narrow specialist really is all you need for your tasks. But building that small specialized model isn't easy or cheap.
Engineering isn't free, models tend to grow obsolete as the price/capability frontier advances, and AI specialists are less of a commodity than AI inference is. I'm inclined to bet against approaches like this on principle.
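To make the "engineering isn't free" point concrete, here's a rough break-even sketch. Every number below is a made-up assumption (build cost, upkeep, per-request prices), purely to show the shape of the trade-off, not real pricing.

```python
# Made-up numbers: when does building and maintaining a small specialist beat
# just paying for commodity inference on a bigger general model?

SPECIALIST_BUILD_COST = 150_000      # assumed: data curation + fine-tuning + evals + eng time
SPECIALIST_YEARLY_UPKEEP = 50_000    # assumed: re-tuning as the frontier moves, monitoring
SPECIALIST_COST_PER_1K_REQ = 0.10    # assumed: self-hosted small-model serving

COMMODITY_COST_PER_1K_REQ = 1.00     # assumed: API pricing for a capable general model;
                                     # historically this number only goes down

def yearly_cost_specialist(requests_per_year: int) -> float:
    return SPECIALIST_YEARLY_UPKEEP + SPECIALIST_COST_PER_1K_REQ * requests_per_year / 1000

def yearly_cost_commodity(requests_per_year: int) -> float:
    return COMMODITY_COST_PER_1K_REQ * requests_per_year / 1000

for volume in (1_000_000, 10_000_000, 100_000_000):
    s = SPECIALIST_BUILD_COST + yearly_cost_specialist(volume)  # year one, incl. build
    c = yearly_cost_commodity(volume)
    print(f"{volume:>12,} req/yr: specialist ${s:,.0f} vs commodity ${c:,.0f}")

# Under these assumptions the specialist only wins at very high, stable volumes --
# and every drop in commodity pricing moves the break-even further out.
```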
> Engineering isn't free, models tend to grow obsolete as the price/capability frontier advances, and AI specialists are less of a commodity than AI inference is. I'm inclined to bet against approaches like this on principle.
This does not sound like it will simplify the training and data side, unless these or subsequent models can somehow be efficiently used for that.
However, this development may lead to (open source) hardware and distributed-system compilation, EDA tooling, bus system design, etc. getting more of the attention and funding they deserve.
In turn, new hardware may lead to more training and data competition instead of the current NVIDIA model training monopoly market.
So I think you're correct for ~5 years.
A fine-tuned 1.7B model is probably still too crippled to do anything useful. But around 8B the capabilities really start to change. I'm also extremely unemployed right now, so I can provide the engineering.
That is not exactly true. The brain does a lot of things that are not "pattern recognition".
Simpler, more mundane (not exactly, still incredibly complicated) stuff like homeostasis or motor control, for example.
Additionally, our ability to plan ahead and simulate future scenarios often relies on mechanisms such as memory consolidation, which are not part of the whole pattern recognition thing.
The brain is a complex, layered, multi-purpose structure that does a lot of things.