Sublime
An inspiration engine for ideas
Kimi is the real deal. Unless it's really Sonnet in a trench coat, this is the best agentic open-source model I've tested - BY A MILE.
Here's a slice* of a 4 HOUR run (~1 second per minute) with not much more than 'keep going' from me every 90 minutes or so.
The task involved editing multipl...
Hrishi, x.com
Good post from @balajis on the "verification gap".
You could see it as there being two modes in creation. Borrowing GAN terminology:
1) generation and
2) discrimination.
e.g. painting - you make a brush stroke (1) and then you look for a while to see if you improved the paintin...
Andrej Karpathy, x.com
Moravec's paradox in LLM evals
I was reacting to this new benchmark of frontier math where LLMs only solve 2%. It was introduced because LLMs are increasingly crushing existing math benchmarks. The interesting issue is that even though by many accounts (/evals), LLMs are inching well into top expert territory (e.g. in m...
Andrej Karpathy, x.com
I built a ChatGPT app that lets you chat with any codebase!
99% of projects just copy/paste Langchain tutorials. This goes well beyond that.
Here's how I built it:
Mark Tenenholtz, x.com