> With each new release, they add evidence that they were correct.
Is it though? If the goal is human-level AI, or hell, even rat-level AI, the evidence is pretty convincing that you should be able to train and deploy it without requiring enough energy to sail a loaded container ship across the Pacific Ocean. Our brains draw about 20 watts, remember. This suggests to me that no, in fact, scale will not get us "there".
I don’t actually know if this is true, but the intuition I have is that this huge expenditure of energy is just the result of speeding through evolution. Our neural structures have evolved for hundreds of millions of years. The aggregate energy cost of that evolution has been enormous, but the result is a compact, hyper-efficient brain. Who’s to say that on the other end of this we’re not going to end up with the same in silicon?
Even if that were the case, it seems wasteful/pointless to have to go through all of biological evolution every time one wants to label images or generate some text.
That 20 watts is just to run the network. Our brain has had a billion years to work out the details of the architecture and encode a lot of basic stuff as instinct (and it still sucks at a lot of things). You should be counting that energy cost as well - we didn't get from nerve nets to frontal lobes overnight.
Earth is about 4.5B years old, life is about 3.7B years old, multicellular life (including life with neural nets) is about 600 million years old. I don't think the span from microbe to multicellular organism counts in brain evolution.
Do you want artificial intelligence or do you want energy efficiency? Personally, I think this work is about proving that we can create the former. Making it small and efficient comes later. That has been true of many advances in technology, and I see no reason why it should not apply here. I find it hard to believe that present energy consumption is evidence that we cannot create human or rat-level AI.
Training an AI in 2020 is best thought of as a capital investment. Like digging a mine or building a wind farm, the initial investment is very large but the operating costs are much lower, and in the long run you expect to get a lot more money out - a lot more value out - than you put in.
Training GPT-3 cost $5m; running it costs about $0.04 per page of output.
If it is scaled up by 1000x as the article proposes, does that mean it will cost $40/page of output? Or does the additional cost just go into training the model?
If it's 100x from increased investment and 10x from short-term efficiency gains, yeah you'd expect $4/page. Model compression or some other tech might make it more efficient in the long-run.
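A quick back-of-envelope on those numbers (a sketch only - the $0.04/page figure is the one quoted above, and the 100x-investment / 10x-efficiency split is the hypothetical from the comment above, not a real projection):

    # Back-of-envelope on the per-page cost of a 1000x scale-up.
    # Figures are the ones quoted above; the 100x/10x split is hypothetical.
    cost_per_page = 0.04      # dollars per page for today's GPT-3 (quoted above)
    scale_up = 1_000          # proposed overall scale-up
    efficiency_gain = 10      # assumed short-term efficiency improvement

    naive = cost_per_page * scale_up                        # $40/page if cost scales 1:1
    adjusted = cost_per_page * scale_up / efficiency_gain   # $4/page with 10x efficiency

    print(f"naive:    ${naive:.2f}/page")
    print(f"adjusted: ${adjusted:.2f}/page")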
> ...the evidence is pretty convincing that you should be able to train and deploy it without requiring enough energy to sail a loaded container ship across the Pacific Ocean.
Yes, and airplanes use much more energy to fly than a bird does. What's that got to do with the airline industry?
Who cares how much power it needs? Plug it into a hydroelectric dam. A superhuman AI would surely provide a higher ROI than the terawatt-hours used for smelting aluminium.
You're missing my point. If it's possible to achieve general AI with incredibly minimal computational requirements, then this implies that current methods which rely on some sort of teraflop arms race to achieve better results are based on a fundamentally flawed model.
General intelligence in its biological form was achieved through hundreds of millions of years of evolution, which required the "evaluation" of trillions upon trillions of instantiations of nervous systems. The total energy consumption of all those individual organisms was many, many orders of magnitude more than all of the energy that has ever been produced by the entirety of humanity.
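For a sense of scale, a very rough Fermi sketch of that claim; all figures are assumed round numbers (the whole biosphere's ~100 TW photosynthetic energy budget as a crude stand-in, ~600 million years of nervous-system evolution, and ~200 years of industrial energy use at roughly today's rate):

    # Rough Fermi estimate: biosphere energy budget over nervous-system evolution
    # vs. cumulative human energy production. All numbers are crude assumptions.
    SECONDS_PER_YEAR = 3.15e7

    biosphere_power_w = 1e14        # ~100 TW captured by photosynthesis (rough)
    evolution_years = 6e8           # ~600 million years since early nervous systems
    biosphere_energy_j = biosphere_power_w * evolution_years * SECONDS_PER_YEAR

    human_energy_per_year_j = 6e20  # ~600 EJ/year of primary energy today (rough)
    industrial_years = 200          # two centuries at today's rate, as an upper bound
    human_energy_j = human_energy_per_year_j * industrial_years

    print(f"biosphere: ~{biosphere_energy_j:.0e} J")                   # ~2e30 J
    print(f"humanity:  ~{human_energy_j:.0e} J")                       # ~1e23 J
    print(f"ratio:     ~{biosphere_energy_j / human_energy_j:.0e}x")   # ~1e7

Even with these crude numbers the gap comes out to around seven orders of magnitude.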
I'm well aware of the distinction between training a model and running it. Look, GPT-3 has 175 billion parameters. Modern low-power CPUs will get you about 2 GFLOPS/watt [1]. So even if all GPT-3 did was add its parameters together, it would take multiple seconds on an equivalently powered CPU to do something our brains do easily in real time. It's not an issue of processing power; an 8086 from 40 years ago easily runs circles around us in terms of raw computational power. Rather, it's that our brains are wired in a fundamentally different way than all existing neural networks, and because of that, this line of research will never lead to AGI, not even if you threw unlimited computing power at it.
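The arithmetic behind "multiple seconds", taking the 175B parameter count, the ~2 GFLOPS/watt figure from [1], and a brain-like 20 W power budget as given:

    # How long would a single pass over GPT-3's parameters take on hardware
    # limited to a brain-like 20 W budget at ~2 GFLOPS/watt (figure from [1])?
    params = 175e9              # GPT-3 parameter count
    flops_per_watt = 2e9        # modern low-power CPU, per [1]
    power_budget_w = 20         # roughly what a human brain draws

    available_flops = flops_per_watt * power_budget_w   # 4e10 = 40 GFLOPS
    seconds_per_pass = params / available_flops          # ~4.4 s just to touch each parameter once

    print(f"~{seconds_per_pass:.1f} s per pass at {available_flops / 1e9:.0f} GFLOPS")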
Birds are wired in a fundamentally different way than all our existing computers, thus we will never have fly-by-wire, not even if we throw unlimited computing power at it.
Actually that's a great example. For centuries men labored (and died) trying to build ornithopters--machines that flap their wings like birds--under the mistaken impression that this was the secret to flight. Finally, after hundreds of years of progressively larger and more powerful, but ultimately failing designs, the Wright brothers came along and showed us that flight is the result of wing shape and pressure differentials, and has nothing whatsoever to do with flapping.
GPT-3 and whatever succeeds it are like late-stage ornithopters: very impressive feats of engineering, but not ultimately destined to lead us to where their creators hoped. We need the Wright brothers of AI to come and show us the way.
> Is it though? If the goal is human-level AI, or hell, even rat-level AI, the evidence is pretty convincing that you should be able to train and deploy it without requiring enough energy to sail a loaded container ship across the Pacific Ocean. Our brains draw about 20 watts, remember. This suggests to me that no, in fact, scale will not get us "there".
https://www.forbes.com/sites/robtoews/2020/06/17/deep-learni...