ThistleknotB to

LocalLLaMA@poweruser.forum · English · 1 year ago

The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data

https://www.interconnects.ai/p/q-star


Willing_BreadfruitB
1 year ago
Yann LeCun tweeted about what this is today: token prediction with planning, far below the prompt level.
- ThistleknotOPB
  1 year ago
  https://twitter.com/ylecun/status/1728126868342145481?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet