The title, pretty much.

I’m wondering whether a 70b model quantized to 4bit would perform better than a 7b/13b/34b model at fp16. Would be great to get some insights from the community.

  • Herr_DrosselmeyerB
    10 months ago

    As a rule of thumb, yes: a higher-parameter model at low quant beats a lower-parameter model at high quant (or no quant). Take it with a grain of salt, though, as you may still prefer a lower-parameter model that’s better tuned for your particular task.
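
    One practical angle on the tradeoff is memory: quantization shrinks the weights, so a 70b model at 4-bit can fit in roughly the same VRAM budget as a 34b model at fp16. Here's a back-of-envelope sketch (weights only — it ignores KV cache, activations, and quantization format overhead, so treat the numbers as rough lower bounds):

    ```python
    # Approximate weight storage: bytes = parameters * bits_per_weight / 8
    def weight_gb(params_billion: float, bits: int) -> float:
        """Rough weight footprint in GB (using 1 GB = 1e9 bytes)."""
        return params_billion * bits / 8

    for name, params, bits in [
        ("70b @ 4-bit", 70, 4),
        ("34b @ fp16 ", 34, 16),
        ("13b @ fp16 ", 13, 16),
        ("7b  @ fp16 ", 7, 16),
    ]:
        print(f"{name}: ~{weight_gb(params, bits):.0f} GB")
    # 70b @ 4-bit: ~35 GB
    # 34b @ fp16 : ~68 GB
    # 13b @ fp16 : ~26 GB
    # 7b  @ fp16 : ~14 GB
    ```

    So the 4-bit 70b sits between the fp16 13b and 34b in raw weight size, which is why the rule of thumb usually favors it: you get the larger model's capabilities at a footprint your hardware may actually accommodate.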