Composer: Building a fast frontier model with RL · Cursor

I cannot emphasize enough how much I prefer this "tutorial doc + build-it-yourself" coding workflow to the typical "ugh" feeling of reviewing huge agent PRs. You can try it right now and see for yourself: 1) Instead of having Claude Code make a PR, ask it to output a Markdown doc with a deep, detailed guide for a dev to...

Geoffrey Litt x.com

Shipping at Inference-Speed | Peter Steinberger

Peter Steinberger steipete.me

HOW IS THIS ALPHA EVEN PUBLIC? 10x SEARCH DEPTH VIA GRPO The intuition has always been that scaling agentic search is a compute problem. It’s not. It’s a "stability-of-objective" problem. Most 8B models suffer from "horizon collapse" - they are mathematically "anxious" to terminate the search loop because their training...

return of the research era ꙮ

x.com