LocalLLaMA@poweruser.forum · English · 3 years ago

Why didn't gpt4 work at first and how did they "fix it"?


According to this tweet,

when gpt4 first finished training it didn’t actually work very well and the whole team thought it’s over, scaling is dead…until greg went into a cave for weeks and somehow magically made it work

So GPT-4 was kind of broken at first. Then Greg spent a few weeks trying to fix it, and it somehow worked.

So why did it not work at first, and how did they fix it?
I think this is an important question for the OSS community.


dogesatorB
3 years ago
Predicting the loss is very different from predicting real-world abilities. They are able to do the former, not the latter.

Predicting the future loss once you’re already 10% into training is fairly trivial. Predicting the actual abilities though is not.
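
To make the loss half of that concrete, here is a rough sketch. It assumes the training loss follows a clean power law in the step count, which is a simplification. The curve, the constants, and the function names are all made up for illustration and have nothing to do with how OpenAI actually did it. It fits the first 10% of a synthetic run and extrapolates to the end.

```python
import math
import random

def fit_power_law(steps, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(step). Returns (a, b)."""
    xs = [math.log(s) for s in steps]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

def predict_loss(a, b, step):
    """Evaluate the fitted power law loss = a * step^(-b)."""
    return a * step ** (-b)

# Synthetic loss curve (hypothetical constants, not real training data):
# a power law with about 1% multiplicative noise.
TOTAL_STEPS = 10_000
TRUE_A, TRUE_B = 8.0, 0.15
rng = random.Random(0)
steps = list(range(1, TOTAL_STEPS + 1))
losses = [TRUE_A * s ** (-TRUE_B) * math.exp(rng.gauss(0.0, 0.01)) for s in steps]

# Fit using only the first 10% of the run, then extrapolate to the last step.
early = TOTAL_STEPS // 10
a, b = fit_power_law(steps[:early], losses[:early])
predicted_final = predict_loss(a, b, TOTAL_STEPS)

# Noise-free value of the underlying curve at the final step, for comparison.
actual_final = TRUE_A * TOTAL_STEPS ** (-TRUE_B)

print(f"fitted exponent b = {b:.4f}")
print(f"predicted final loss = {predicted_final:.4f}")
print(f"underlying final loss = {actual_final:.4f}")
```

The extrapolation only produces a loss number. Nothing in the fit says whether the finished model can do arithmetic or follow instructions, and that is the part that is hard to predict.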