It’s a new foundation model, so some teething pains are to be expected. Yi is heavily based on llama2 (directly copied, for the most part), but there are just enough differences in the training parameters that default llama2 settings don’t get good results. KCPP (KoboldCpp) has already addressed the rope scaling, and I’m sure it’s only a matter of time before the other issues are hashed out.
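For anyone curious what the rope difference actually is, here’s a minimal sketch of how the RoPE base (rope_theta) changes the position frequencies. The values are from memory of Yi’s config (roughly 5e6 vs llama2’s 10,000) and a 128-dim head, not pulled from KCPP’s source, so treat them as illustrative:

```python
import numpy as np

def rope_inv_freq(head_dim: int, base: float) -> np.ndarray:
    """Standard RoPE inverse frequencies: 1 / base^(2i/d) for each dim pair."""
    return 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))

# llama2-style base vs. the much larger base Yi appears to ship with.
llama2_freqs = rope_inv_freq(128, 10_000.0)
yi_freqs = rope_inv_freq(128, 5_000_000.0)

# The lowest frequencies differ by orders of magnitude, which is why loading
# Yi with default llama2 rope settings degrades long-range attention.
print(llama2_freqs[-1], yi_freqs[-1])
```

That mismatch in the low-frequency bands is the kind of thing the rope-scaling fix corrects; the remaining quirks (tokenizer, EOS handling, etc.) are presumably the same sort of config-level differences.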