After making 500+ LoRAs, here is the secret

The difference between "pretty good" and "great" data is hundreds of millions of dollars in wasted parameters. Basically every "great" model today can be 10x'd with better data curation.
How much this matters is genuinely shocking, and everyone is ignoring it.
The "Quality over Quantity" Insight
The old era of Deep Learning was about "Big Data"—dumping the entire internet into a model. The new era is about "Smart Data."
The Finding: A small dataset of 1,000 high-quality, textbook-like examples often outperforms a dataset of 100,000 messy web-scraped examples. This is famously known as the "Textbooks Are All You Need" result.
When I worked in machine learning every day, I found a close analogue of this to be very true. A slightly better architecture tended to matter much less than getting better data. Or, at least, once I hit on a reasonable model architecture, I tended to do much better by cleaning and gathering more data than by optimizing the architecture.
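In practice, "cleaning and gathering more data" often starts with a few cheap heuristics: deduplicate, drop fragments, drop boilerplate-heavy text. Here is a minimal sketch of that idea in Python; the thresholds and the `curate` function are illustrative assumptions, not a fixed recipe from the thread.

```python
# Hypothetical sketch of simple dataset-curation heuristics.
# The thresholds (min_len, max_len, alpha ratio) are illustrative assumptions.

def curate(examples, min_len=50, max_len=2000):
    """Deduplicate and filter raw text examples by cheap quality heuristics."""
    seen = set()
    kept = []
    for text in examples:
        norm = " ".join(text.split()).lower()  # normalize whitespace and case for dedup
        if norm in seen:
            continue  # drop exact duplicates
        seen.add(norm)
        if not (min_len <= len(norm) <= max_len):
            continue  # drop fragments and walls of text
        letters = sum(c.isalpha() for c in norm)
        if letters / len(norm) < 0.6:
            continue  # drop markup/log-like lines with few actual words
        kept.append(text)
    return kept

raw = [
    "A clear, textbook-style explanation of gradient descent.",
    "A clear, textbook-style explanation of gradient descent.",  # duplicate
    "<<<>>> ### 404 !!!",  # junk / too short
    "ok",  # too short
]
print(curate(raw))  # only the first example survives
```

Each filter is trivial on its own, but stacked together they tend to remove exactly the "messy web-scraped" tail that dilutes a fine-tuning set.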
Nate Meyvis
Practitioners agree that the vast majority of time in building a machine learning pipeline is spent on feature engineering and data cleaning. Yet, despite its importance, the topic is rarely discussed on its own.