Goliath-120B - quants and future plans

AlpinDale · 2 years ago

multiverse_fan · 2 years ago

According to its model card, Goliath was created by merging layers of Xwin and Euryale.

The layer ranges used are as follows:
- range 0, 16 Xwin 
- range 8, 24 Euryale 
- range 17, 32 Xwin 
- range 25, 40 Euryale 
- range 33, 48 Xwin 
- range 41, 56 Euryale 
- range 49, 64 Xwin 
- range 57, 72 Euryale 
- range 65, 80 Xwin
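
To make the arithmetic behind those ranges concrete, here is a rough sketch that counts the layers in the merged stack. It assumes each range is half-open (start included, end excluded) and that each 70B source model has 80 decoder layers, which the upper bound of 80 in the list suggests but the post does not state. The names `SLICES`, `slice_len`, and `BASE_LAYERS` are made up for this illustration.

```python
from collections import Counter

# Slices exactly as listed above: (start, end, source model).
SLICES = [
    (0, 16, "Xwin"),
    (8, 24, "Euryale"),
    (17, 32, "Xwin"),
    (25, 40, "Euryale"),
    (33, 48, "Xwin"),
    (41, 56, "Euryale"),
    (49, 64, "Xwin"),
    (57, 72, "Euryale"),
    (65, 80, "Xwin"),
]

# Assumption: each 70B source model has 80 decoder layers.
BASE_LAYERS = 80


def slice_len(start: int, end: int) -> int:
    """Number of layers in a slice, assuming a half-open range [start, end)."""
    return end - start


# Total depth of the merged stack.
total_layers = sum(slice_len(s, e) for s, e, _ in SLICES)

# How many layers each source model contributes.
per_model = Counter()
for s, e, name in SLICES:
    per_model[name] += slice_len(s, e)

# Depth of the merge relative to one source model.
depth_ratio = total_layers / BASE_LAYERS

print(f"total layers: {total_layers}")
print(f"per model: {dict(per_model)}")
print(f"depth vs. one 70B model: {depth_ratio:.2f}x")
```

Under those assumptions the stack comes out to 137 layers (76 from Xwin, 61 from Euryale), about 1.71 times the depth of a single 70B model. Most of the parameters sit in the decoder layers, so that ratio is roughly where the 120B figure comes from, and getting back to 70B by dropping layers alone would mean removing about 57 of them.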

I’m not sure how the model would be reduced to 70B unless it’s by removing layers. Is that what “shearing” is? I don’t understand what gets pruned in that process. Is it whole layers?