
Elon Musk agrees that we've exhausted AI training data | TechCrunch

While LLMs continue to devour web-scraped data, they’ll increasingly consume their own digital progeny as AI-generated content continues to flood the internet. This recursive loop, experimentally confirmed, erodes the true data landscape. Rare events vanish first. Models churn out likely sequences from the original pool while injecting their own...
Azeem Azhar • 🔮 Open-source AI surge; UBI surprises; AI eats itself; Murdoch’s empire drama & the internet’s Balkanisation ++ #484
There Are No New Ideas in AI… Only New Datasets
blog.jxmo.io
There is a potentially important source of variance for all of this: we’re running out of internet data. That could mean that, very soon, the naive approach to pretraining larger language models on more scraped data could start hitting serious bottlenecks.
Frontier models are already trained on much of the internet. Llama 3, for example, was...