
The Scaling Era: An Oral History of AI, 2019–2025

When comparing RLHF versus non-RLHF models, RLHF is equivalent to increasing the model size 100 times in terms of the resulting increase in human evaluators' preference ratings.
Dwarkesh Patel • The Scaling Era: An Oral History of AI, 2019–2025
Then GPT-3 comes down—the crucial test. Going from GPT-2 to GPT-3 is one of the biggest scale-ups in all of neural network history. If scaling was bogus, then the GPT-3 paper would be super unimpressive.
Dwarkesh Patel • The Scaling Era: An Oral History of AI, 2019–2025
…a laptop, the compute used for Google DeepMind's Gemini Ultra, released in 2023, would be the size of New York City.46 In this period, the compute used to train each frontier model doubled every six months—four times faster than Moore's law predicts.47
Dwarkesh Patel • The Scaling Era: An Oral History of AI, 2019–2025
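A rough back-of-the-envelope check on that comparison (a sketch, taking Moore's law as a doubling roughly every two years, which is the baseline the "four times faster" framing implies):

\[
\underbrace{2^{12/6}}_{\text{frontier training compute}} = 4\times \text{ per year}, \qquad \underbrace{2^{12/24}}_{\text{Moore's law}} \approx 1.4\times \text{ per year}.
\]

A six-month doubling time is one quarter of a twenty-four-month doubling time, so frontier compute completes four doublings in the time Moore's law would predict one.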
…don’t want people to come away thinking that models aren’t going to get much better. The jumps we’ve seen so far are huge. Even if those continue on a smaller scale, we’re still in for extremely smart, very reliable agents over the next couple of orders of magnitude. We have a lot more jumps coming. Even if those jumps are smaller, relatively speaking…
Dwarkesh Patel • The Scaling Era: An Oral History of AI, 2019–2025
Specific abilities are very hard to predict. Back when I was working on GPT-2 and GPT-3, we were asking, “When does arithmetic come into place? When do models learn to code?” Sometimes it’s very abrupt. It’s like how you can predict statistical averages of the weather, but the weather on one particular day is very hard to predict. One of the first…
Dwarkesh Patel • The Scaling Era: An Oral History of AI, 2019–2025
It doesn’t seem particularly compelling. One source of evidence is work by Suzana Herculano-Houzel, a neuroscientist who has dissolved the brains of many creatures to determine how many neurons are present. She’s found a lot of interesting scaling laws. She has a paper discussing the human brain as a scaled-up primate brain.60 Across a wide variety…
Dwarkesh Patel • The Scaling Era: An Oral History of AI, 2019–2025
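For context on the scaling laws being referenced (a hedged sketch of the general form, not a formula from the book): Herculano-Houzel's comparative counts are usually summarized as power laws relating neuron number N to brain mass M, and her primate result is that the exponent is close to one, so the human brain sits on the same line as other primates rather than being an outlier:

\[
N \propto M^{\alpha}, \qquad \alpha_{\text{primates}} \approx 1 \quad (\text{roughly constant neuron density}),
\]

whereas in rodents the fitted exponent is well below one (neurons grow larger as the brain grows), which is why, on her numbers, a rodent brain with a human neuron count would have to be far heavier than an actual human brain.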
You talk about the enormous power that superintelligence and the government will have. It’s pretty plausible that in the alternative world, one AI company will have that power. Say OpenAI has a six-month lead. You’re talking about the most powerful weapon ever. So you’re making a radical bet on a private company CEO as the benevolent dictator.