Sublime
An inspiration engine for ideas
I read the DeepSeek-R1 paper the day it came out, and I don’t think GRPO is the key to its success. Instead, here’s what truly matters (ranked by importance):
1. Iterative RL and SFT
2. A hybrid reward model—mixing rule-based RM and neural RM for deterministic tasks
3. High-quality synthetic data, with human post-processing only when necessary
4. ...
original content.
Sunil Gupta • Driving Digital Strategy: A Guide to Reimagining Your Business
1 billion strategic-investment portfolio
Sunil Gupta • Driving Digital Strategy: A Guide to Reimagining Your Business
without human intervention.
Sunil Gupta • Driving Digital Strategy: A Guide to Reimagining Your Business
HannahSun's Portfolio
hannahhsun.design