The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data

by Nathan Lambert

updated 10mo ago

  • The Alignment Problem

    by Brian Christian

    1 highlight