Sublime
An inspiration engine for ideas




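I read through the DeepSeek paper and jotted down some notes and thoughts.
Will DeepSeek's lower training costs reduce demand for compute? History answers that: GPT-4-level token prices fell 100x to 1000x within a year and a half, and nobody raised similar concerns (see chart).
Compute deflation in AI is inevitable. The long-term pattern is a speedup of six orders of magnitude per decade: three orders of magnitude from software/algorithms and three from hardware.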
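To put those figures on a common scale, here is a minimal sketch that converts each total speedup into an equivalent constant yearly multiplier. It assumes smooth compounding, which the note does not claim, and the helper name `annual_factor` is made up for illustration.

```python
def annual_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)


# Six orders of magnitude per decade: 3 from software/algorithms, 3 from hardware.
software = annual_factor(10 ** 3, 10)
hardware = annual_factor(10 ** 3, 10)
combined = annual_factor(10 ** (3 + 3), 10)

# GPT-4-level token price falling 100x to 1000x within 1.5 years.
price_low = annual_factor(100, 1.5)
price_high = annual_factor(1000, 1.5)

print(f"software alone: {software:.2f}x per year")
print(f"hardware alone: {hardware:.2f}x per year")
print(f"combined:       {combined:.2f}x per year")
print(f"token price:    {price_low:.1f}x to {price_high:.1f}x cheaper per year")
```

Under that assumption, each source contributes roughly 2x per year and the two together roughly 4x per year, while the recent token price drop works out to roughly 20x to 100x per year, well above the long-run pace.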
------------------------------
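Let's start with the DeepSeek R1 paper itself.
DeepSeek itself does not use test-time scaling to gain stronger reasoning the way ChatGPT o1 does; its approach falls under post-training, and it appears to take a different path from o1 (o1 deliberately expands the "thinking steps" or "search process" at inference time and then picks the best result, so its inference cost is several orders of magnitude higher than 4o's), only...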
i think a lot about how user interfaces simply help us understand what the computer is doing
now, the computers can understand us
@sarthakgh @DavidSHolz Build companies to buy your freedom and do what you love, not to satisfy VCs. 🫶🏽
As paradoxical as it sounds, this is true. The jhanas start by feeling like a path to a special state, but eventually reveal themselves as a path that’s always been about coming home.
Like falling in love, a radical acceptance can kick off with an infatuated ecstasy and simmer