What Actually Matters (And What Doesn’t) for DeepSeek
DeepSeek Coder comprises a series of code language models trained from scratch on 87% code and 13% natural language (in English and Chinese), with each model pre-trained on 2T tokens. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on a repo-level code corpus with a window size of 16K …
DeepSeek Coder
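
As a quick orientation, here is a minimal sketch of running one of these checkpoints through Hugging Face transformers; the checkpoint id, the trust_remote_code flag, and the generation settings are illustrative assumptions, not the project's official snippet.

```python
# Minimal sketch: code completion with a DeepSeek Coder base model.
# The checkpoint id and settings below are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-base"  # assumed HF checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Repo-level pre-training with a 16K window means long, multi-file
# prompts are in-distribution for completion and infilling.
prompt = "# write a quicksort in python\ndef quicksort(arr):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```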

Over the past two and a half years we've seen the rise of LLMs, but Yann LeCun, one of the field's great contributors, believes that LLMs are actually old news, that we're now just making them marginally better, and he's much more focused on other things.
Firstly, he thinks these models need to understand the physical world. Right now, LLMs are …


Looking at the just-released o1, Jim Fan's predictions about Q* from last year turn out to have been basically accurate!
Jim drew an analogy between Q* and AlphaGo, guessing that Q* may work like AlphaGo, improving steadily through self-play against its own earlier versions, and that even the architecture may be similar.
AlphaGo's architecture has four core components (sketched in code after this list):
1. Policy neural network (Policy NN, the learning part): picks the next move most likely to win
2. Value neural network (Value NN, the learning part): evaluates the current board position
3. Monte Carlo tree search (MCTS, the search part): simulates many possible continuations from the current position, much like a human reading out a game (if I play at A, where might my opponent respond, and how would I answer each reply …)
4. Win/loss adjudication: determines who has won according to the rules of Go.
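
To make the interplay of these components concrete, here is a minimal Python sketch of policy/value-guided MCTS. The toy game, the network stubs, and the hyperparameters are assumptions for illustration, not AlphaGo's actual implementation (terminal-state handling and rollouts are omitted).

```python
# Minimal sketch of AlphaGo-style MCTS guided by policy/value networks.
# Everything here (game, network stubs, constants) is a toy assumption.
import math
import random

class Node:
    def __init__(self, prior):
        self.prior = prior        # P(s, a): prior from the policy NN
        self.visit_count = 0      # N(s, a)
        self.value_sum = 0.0      # W(s, a)
        self.children = {}        # action -> Node

    def q(self):
        # Mean action value Q(s, a) = W / N
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def policy_net(state):
    # Stub for the policy NN: uniform prior over 3 toy moves.
    return {a: 1.0 / 3 for a in range(3)}

def value_net(state):
    # Stub for the value NN: random evaluation in [-1, 1].
    return random.uniform(-1.0, 1.0)

def apply_move(state, action):
    # Toy game dynamics: just record the move sequence.
    return state + (action,)

def select_child(node, c_puct=1.0):
    # PUCT rule: trade off exploitation (Q) against exploration
    # (prior, scaled down as a child accumulates visits).
    total = math.sqrt(sum(c.visit_count for c in node.children.values()) + 1)
    def score(item):
        _, child = item
        return child.q() + c_puct * child.prior * total / (1 + child.visit_count)
    return max(node.children.items(), key=score)

def mcts(root_state, num_simulations=100):
    root = Node(prior=1.0)
    for _ in range(num_simulations):
        node, state, path = root, root_state, [root]
        # 1. Selection: walk down the tree with PUCT until a leaf.
        while node.children:
            action, node = select_child(node)
            state = apply_move(state, action)
            path.append(node)
        # 2. Expansion: the policy NN proposes priors for new children.
        for action, prior in policy_net(state).items():
            node.children[action] = Node(prior)
        # 3. Evaluation: the value NN scores the leaf, replacing a
        #    full rollout to the end of the game.
        value = value_net(state)
        # 4. Backup: propagate the evaluation up the visited path,
        #    flipping sign each ply to alternate player perspectives.
        for n in reversed(path):
            n.visit_count += 1
            n.value_sum += value
            value = -value
    # Play the most-visited root move, as AlphaGo does.
    return max(root.children.items(), key=lambda kv: kv[1].visit_count)[0]

print(mcts(root_state=()))  # picks one of the 3 toy moves
```

Self-play training would then pit this search-backed player against earlier copies of itself, using the game outcomes to improve the policy and value networks.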
What makes this architecture remarkable is that the entire training process …