How AI-generated text is poisoning the internet
While LLMs continue to devour web-scraped data, they’ll increasingly consume their own digital progeny as AI-generated content continues to flood the internet. This recursive loop, experimentally confirmed, erodes the true data landscape. Rare events vanish first. Models churn out likely sequences from the original pool while injecting their own... See more
Azeem Azhar • 🔮 Open-source AI surge; UBI surprises; AI eats itself; Murdoch’s empire drama & the internet’s Balkanisation ++ #484
The problem may even get worse. Generative AI is producing vast amounts of questionable content that contaminates the datasets on which future AIs will be trained.