ThistleknotB to LocalLLaMA@poweruser.forum · English · 2 years ago

The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data


https://www.interconnects.ai/p/q-star

  • Willing_BreadfruitB · 2 years ago

    Yann LeCun tweeted about what this is today: token prediction with planning, far below the prompt level.

    • ThistleknotOPB · 2 years ago

      https://twitter.com/ylecun/status/1728126868342145481?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet

  • perlthoughtsB · 2 years ago

    It sounds like the open-source Synthia, OpenChat, and Zephyr, lol. The whitepapers. lolol.
