Longjumping-Bake-557B to

LocalLLaMA@poweruser.forumEnglish · 2 years ago

Why is no one releasing 70b models?

There has been a lot of movement around and below the 13b parameter bracket in the last few months, but it’s wild to think the best 70b models are still llama2-based. Why is that?

We now have 13b models, like the 8-bit bartowski/Orca-2-13b-exl2, approaching or even surpassing the best 70b models.

Chat

WaterPeckerB
2 years ago
Who pays for all the training behind the models we see knocking about? I don’t mean the ones released by the big companies. Who has the resources to train a 70b model? One of the guys below said 1.7 million GPU hours, for example. That’s pretty friggin expensive, no?
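
For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes the GPUs are rented at a flat hourly price. The 1.7 million figure is the one quoted in this thread; the hourly rates are hypothetical placeholders, not real quotes, and `training_cost_usd` is just an illustrative helper name.

```python
def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rough rental cost: GPU hours multiplied by a flat hourly price."""
    if gpu_hours < 0 or usd_per_gpu_hour < 0:
        raise ValueError("inputs must be non-negative")
    return gpu_hours * usd_per_gpu_hour


GPU_HOURS = 1_700_000  # figure quoted in the thread

# Hypothetical hourly rates (placeholders only, swap in a real price).
for rate in (1.0, 2.0, 4.0):
    cost = training_cost_usd(GPU_HOURS, rate)
    print(f"${rate:.2f} per GPU hour -> ${cost:,.0f}")
```

Whatever rate you plug in, the bill lands in the millions of dollars for a single run, before counting failed runs, data work, or staff.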