• vatsadevB · 10 months ago

    Well, the model was trained on RefinedWeb, which is 3.5T tokens, so a little below Chinchilla-optimal for 180B. Also, the models in the Falcon series seem increasingly undertrained as they scale:

    • The 1B model was good, and still holds up after several newer generations
    • The 7B was capable pre Llama 2
    • The 40B and 180B were never as strong
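
    A quick back-of-the-envelope check of the Chinchilla point above, assuming the common ~20 tokens-per-parameter rule of thumb (the exact scaling-law fit varies):

    ```python
    # Rough Chinchilla check: ~20 training tokens per model parameter
    # (rule-of-thumb approximation, not the exact fitted scaling law)
    def chinchilla_optimal_tokens(params: float, tokens_per_param: float = 20.0) -> float:
        """Approximate compute-optimal token count for a given parameter count."""
        return params * tokens_per_param

    optimal = chinchilla_optimal_tokens(180e9)  # 180B parameters
    trained = 3.5e12                            # RefinedWeb's reported 3.5T tokens
    print(f"optimal ≈ {optimal/1e12:.1f}T, trained = {trained/1e12:.1f}T")
    ```

    For 180B parameters that gives ~3.6T tokens, so 3.5T is indeed just shy of optimal.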