Sublime
An inspiration engine for ideas


Our model is also pretty good at doing useless but fun stuff!
@OpenAI https://t.co/R8iZ95tZF9

loved this answer from @rabois on identifying exceptional talent early. few have done it better over the past 25 yrs! https://t.co/uxSoXlfsbl
Am I the only one who views almost all modern TV as worthless slop? I've retvrned to movies, and older ones — would much rather 2hrs of very high quality content I'll think about for years than 12 * 40min of slop I'll have forgotten after a month
Flo Crivello (x.com)



I read the DeepSeek paper and jotted down some notes and thoughts.
Will DeepSeek lowering training costs reduce demand for compute? History says no: GPT-4-level token prices fell 100–1000x within a year and a half, and nobody raised a similar worry then (see chart).
Compute-cost deflation in AI is inevitable. The long-run theme is six orders of magnitude of acceleration per decade: three orders of magnitude from software/algorithms, three from hardware.
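A quick sanity check on the figures above (illustrative arithmetic only, not from the source): converting the quoted total reduction factors into implied per-year factors.

```python
def annualized_factor(total_factor: float, years: float) -> float:
    """Per-year multiplier implied by a total factor spread over `years`."""
    return total_factor ** (1 / years)

# GPT-4-level token prices: 100x-1000x cheaper over ~1.5 years
low = annualized_factor(100, 1.5)     # ~21.5x per year
high = annualized_factor(1000, 1.5)   # ~100x per year

# Ten-year trend: six orders of magnitude total
# (three from software/algorithms, three from hardware)
decade = annualized_factor(10**6, 10)  # ~4x per year

print(round(low, 1), round(high, 1), round(decade, 2))
```

So the claimed decade-long trend works out to roughly a 4x cost reduction per year, which the recent 100–1000x-in-18-months price drop comfortably exceeds.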
------------------------------
Let's start with the DeepSeek R1 paper itself.
DeepSeek does not do test-time scaling the way ChatGPT o1 does to gain stronger reasoning; it belongs to the post-training category, which looks like a different route from o1 (at inference time, o1 deliberately expands its "thinking steps" or "search process" and then picks the best result, spending orders of magnitude more inference compute than GPT-4o). It only…