Hi everybody!
Inspired by a recent thread mentioning Goliath's insane abilities, I decided to merge four SFT Yi models into two separate 55B Yi models, one with 200K context and one with 32K.
Try them out and let me know!
What are the eval results?
Did you do post-merge retraining? Without at least some, the results are going to be poor…
Very cool. How did you merge them?