
Elon Musk agrees that we've exhausted AI training data | TechCrunch


There is a potentially important source of variance for all of this: we’re running out of internet data. That could mean that, very soon, the naive approach to pretraining larger language models on more scraped data could start hitting serious bottlenecks.
Frontier models are already trained on much of the internet. Llama 3, for example, was…
SITUATIONAL AWARENESS - The Decade Ahead • I. From GPT-4 to AGI: Counting the OOMs
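
To make the "running out of data" concern concrete, here is a minimal back-of-envelope sketch. All three figures are assumptions for illustration, not numbers from the clipped sources: a stock of roughly 300T tokens of public human-generated text (an Epoch AI-style estimate), roughly 15T tokens for Llama 3's pretraining set, and roughly one order of magnitude more data per model generation.

```python
# Back-of-envelope: how many more generations of naive data scaling
# fit inside the stock of public text? All constants are assumptions.

STOCK_TOKENS = 300e12    # assumed stock of public human text (tokens)
LLAMA3_TOKENS = 15e12    # assumed Llama 3 pretraining set (tokens)
GROWTH_PER_GEN = 10      # assumed ~1 OOM more data per generation

tokens = LLAMA3_TOKENS
generations = 0
while tokens < STOCK_TOKENS:
    tokens *= GROWTH_PER_GEN
    generations += 1

print(f"Under these assumptions, naive scaling exhausts the stock "
      f"in about {generations} generation(s).")
```

Under these assumed numbers the loop terminates after about two generations, i.e. only one to two more orders of magnitude of scraped data remain, which is the bottleneck the excerpt above is pointing at.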
AI needs to be able to deal not only with specific situations for which there is an enormous amount of cheaply obtained relevant data, but also with problems that are novel and with variations that have not been seen before.