![Thumbnail of www-x-com-fearless-rising-status-1884460152226734391](https://s3.amazonaws.com/public-storage-prod.startupy.com/media/curation/tweet/1884460152226734391/image0.jpg)
When you feel calm and regulated and in that state of ventral vagal, something really cool happens.
Self-criticism fades.
Why? Because compassion and self-judgment can't coexist.
Stephen Porges (the theorist behind Polyvagal Theory) says that when you're in a state of fight or flight, it's impossible to enter the portal of compassion, because that...
![Thumbnail of www-x-com-karminski3-status-1887641350390358352](https://s3.amazonaws.com/public-storage-prod.startupy.com/media/curation/tweet/1887641350390358352/image0.jpg)
Came across OpenHealth, an AI healthcare framework. It lets you upload your various hospital test reports and have AI parse them into a report, and then, based on the personal data you've uploaded, consult it and build a health plan.
Supported input sources: blood test results, physical exam data, family medical history, and current symptoms.
Supported models: Llama, DeepSeek-V3, GPT, Claude, Gemini
[1/2] https://t.co/OsGHyyCxPQ
Andrej Karpathy, after reading the R1 paper, offers some explanations of reinforcement learning and reiterates that RLHF and RL are not the same thing.
Whether in children or in deep learning, learning comes in two main types:
1) imitation learning (watch and repeat, i.e. pretraining and supervised fine-tuning), and
2) trial-and-error learning (reinforcement learning).
My favorite simple example is AlphaGo:
1) learning by imitating expert players,
2) winning games through reinforcement learning.
Almost every stunning result in deep learning, and the source of all the *magic*, has always been the second type.
2 is significantly more powerful, and it's what surprises you.
2 is when the paddle in Breakout learns to hit the ball in behind the bricks.
2 is when AlphaGo beats even Lee Sedol.
2 is the "aha moment" when DeepSeek (or o1, etc.) discovers that it works well to re-evaluate assumptions, backtrack, try something else, and so on.
These are the solving strategies you see in this model's chain of thought...
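The contrast Karpathy draws can be sketched on a toy three-armed bandit (an illustrative example, not from the thread): imitation learning nudges a policy toward whatever an expert demonstrates via a cross-entropy-style update, while trial-and-error learning (here a minimal REINFORCE-style policy gradient) samples actions and reinforces whichever one actually earns reward.

```python
import math
import random

random.seed(0)

# Toy bandit with 3 actions; only action 2 pays out (the "expert" below
# demonstrates action 1, which pays nothing -- a deliberate mismatch).
REWARDS = [0.0, 0.0, 1.0]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def imitation_step(logits, expert_action, lr=0.5):
    """Imitation learning: cross-entropy gradient step toward the expert's action."""
    probs = softmax(logits)
    return [l + lr * ((1.0 if a == expert_action else 0.0) - p)
            for a, (l, p) in enumerate(zip(logits, probs))]

def reinforce_step(logits, lr=0.5):
    """Trial and error: sample an action, scale the log-prob gradient by reward."""
    probs = softmax(logits)
    action = random.choices(range(len(probs)), weights=probs)[0]
    reward = REWARDS[action]
    return [l + lr * reward * ((1.0 if a == action else 0.0) - p)
            for a, (l, p) in enumerate(zip(logits, probs))]

imit = [0.0, 0.0, 0.0]
rl = [0.0, 0.0, 0.0]
for _ in range(200):
    imit = imitation_step(imit, expert_action=1)
    rl = reinforce_step(rl)

best_imit = max(range(3), key=lambda a: softmax(imit)[a])
best_rl = max(range(3), key=lambda a: softmax(rl)[a])
print(best_imit, best_rl)  # imitation copies the expert (1); RL finds the reward (2)
```

The point of the mismatch: imitation can only be as good as its expert, while the trial-and-error learner discovers the rewarded action on its own, which is the "magic" the thread attributes to type 2.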
@mkonnikova I once worked with an editor who had pre-printed cards which he sent out to botherers. They read: “The literary editor thanks you for your letter, the contents of which have been noted”. I always thought this was a perfect response.