What are the current mainstream approaches to applying deep reinforcement learning in robotics? - Zhihu


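For locomotion, the external environment can mostly be treated as a rigid body, and its physical properties can largely be ignored. That frees up effort for building an accurate physical model of the robot itself and for designing a more sophisticated physics engine. This is why RL is better suited to locomotion.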

渣大米 • Article

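The whole sim-to-real process is shown in Figure 4 and consists of four steps:

(1) Identify the robot's physical parameters and build a rigid-body kinematics/dynamics model of the robot;

(2) Collect real joint-motor actuation data and train an Actuator Net;

(3) In simulation, use the Actuator Net to model the joint motors and, together with the rigid-body kinematics/dynamics model from step 1, run reinforcement learning;

(4) Deploy the policy trained in step 3 on the real robot.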
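Step (2) can be made concrete with a minimal sketch. The excerpt does not describe the Actuator Net's inputs or architecture, so everything here is an assumption for illustration: the model takes a short history of joint position errors and velocities and predicts torque, a linear least-squares fit stands in for the neural network, and the "logged" data is synthetic, generated from a hypothetical PD law with gains `kp` and `kd`.

```python
import numpy as np

def make_features(pos_err, vel, history=3):
    """Stack the last `history` position errors and velocities for each time step."""
    rows = []
    for t in range(history - 1, len(pos_err)):
        window = slice(t - history + 1, t + 1)
        rows.append(np.concatenate([pos_err[window], vel[window]]))
    return np.asarray(rows)

def fit_actuator_model(pos_err, vel, torque, history=3):
    """Least-squares stand-in for training an Actuator Net (assumption: linear model)."""
    X = make_features(pos_err, vel, history)
    X = np.hstack([X, np.ones((len(X), 1))])  # bias column
    y = torque[history - 1:]                  # torque aligned with the end of each window
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict_torque(w, pos_err_hist, vel_hist):
    """Predict joint torque from one history window; this would replace the motor model in simulation."""
    x = np.concatenate([pos_err_hist, vel_hist, [1.0]])
    return float(x @ w)

# Synthetic stand-in for logged motor data (hypothetical gains and noise level).
rng = np.random.default_rng(0)
n = 500
pos_err = rng.normal(0.0, 0.1, n)   # joint position error [rad]
vel = rng.normal(0.0, 1.0, n)       # joint velocity [rad/s]
kp, kd = 40.0, 1.5
torque = kp * pos_err - kd * vel + rng.normal(0.0, 0.01, n)

w = fit_actuator_model(pos_err, vel, torque)
tau = predict_torque(w, pos_err[-3:], vel[-3:])
```

In step (3), the simulator would call something like `predict_torque` at every control step in place of an idealized motor model, so that the policy is trained against actuator behavior learned from real data.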

渣大米 • Article

小米技术 • Article