Sublime
An inspiration engine for ideas

Vector databases for AI memory just got replaced by MP4 files.
Store millions of text chunks in MP4 files instead of expensive vector databases with lightning-fast semantic search.
No database needed. 100% open source. https://t.co/iuIS5ElHsM
I built a RAG Routing Agent that knows exactly where to look.
It automatically routes queries to the right knowledge base and falls back to web search when needed.
100% open source code with a step-by-step tutorial. https://t.co/EnUwAm0wEf
Shubham Saboo (x.com)
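
The routing idea in the post above (pick a knowledge base per query, fall back to web search otherwise) can be sketched roughly as follows. This is not the linked project's code: the knowledge-base names, their keyword sets, the `route` function, and the keyword-overlap scoring are all hypothetical stand-ins for whatever classifier or LLM call a real agent would use.

```python
import re

# Hypothetical knowledge bases, each described by a few keywords (assumption).
KNOWLEDGE_BASES = {
    "product_docs": {"install", "api", "configuration", "sdk", "error"},
    "billing_faq": {"invoice", "refund", "payment", "subscription", "price"},
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(query, min_overlap=1):
    """Return the best-matching knowledge base name, or 'web_search'.

    Scoring is plain keyword overlap, a stand-in for a real router.
    """
    tokens = tokenize(query)
    best_name, best_score = "web_search", 0
    for name, keywords in KNOWLEDGE_BASES.items():
        score = len(tokens & keywords)
        if score > best_score:
            best_name, best_score = name, score
    # Fall back to web search when no knowledge base matches well enough.
    return best_name if best_score >= min_overlap else "web_search"
```

For example, `route("How do I get a refund on my invoice?")` selects `billing_faq`, while a query that matches no keywords falls through to `web_search`.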
a technique i'm increasingly believing is going to be really useful in large-scale synth data pipelines is the use of graph algorithms to do semantic deduplication
submitted a paper about this a while back, never got around to cleaning it up for arxiv, but it's a neat trick https://t.co/XNVRTWxCSq
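
One way graph-based deduplication can work, as a minimal sketch: treat each sample as a node, connect pairs whose similarity exceeds a threshold, and keep one sample per connected component. The post does not say which graph algorithm the paper uses, so connected components is an assumption here, and token-set Jaccard similarity stands in for the embedding similarity a semantic pipeline would use. The `dedupe` function and its `threshold` parameter are hypothetical.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (stand-in for embedding similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dedupe(texts, threshold=0.6):
    """Keep the first text from each connected component of the similarity graph."""
    parent = list(range(len(texts)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    token_sets = [tokens(t) for t in texts]
    for i, j in combinations(range(len(texts)), 2):
        if jaccard(token_sets[i], token_sets[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Point the larger root at the smaller one, so each
                # component's root is its earliest text.
                parent[max(ri, rj)] = min(ri, rj)
    return [t for i, t in enumerate(texts) if find(i) == i]
```

Because components are transitive, two samples can be merged even when they are only linked through an intermediate near-duplicate, which pairwise thresholding alone would miss. The all-pairs loop is quadratic; a large pipeline would need an approximate nearest-neighbor step to build the edges.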

The results in this paper look good. I want to believe.
https://t.co/Vg3ing9t0g https://t.co/HaECdszUE6

Introducing MaPO, a memory-efficient technique for aligning T2I diffusion models on preference data 🔥
We eliminate the need to have a reference model when performing alignment fine-tuning.
Code, models, datasets, and paper are up...
@samselikoff I am using this in prod in multiple places and it has been working as expected https://t.co/xFFp4LoDmO
Abdul wahab (x.com)