The title, pretty much.
I’m wondering whether a 70B model quantized to 4-bit would perform better than a 7B/13B/34B model at fp16. Would be great to get some insights from the community.
44GB of GPU VRAM? What GPU has 44GB other than stupidly expensive ones? Are average folks running $25K GPUs at home? Or are the people running these working for companies with lots of money and building small GPU servers for it?
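For context on where figures like 44GB come from: weight memory is roughly params × bits-per-weight / 8, and then the KV cache, quantization metadata, and framework overhead add more on top. Here's a back-of-the-envelope sketch (plain Python, illustrative numbers only, not measurements):

```python
# Rough VRAM estimate for the model weights alone. Real usage is higher:
# KV cache, activations, and runtime overhead all add on top of this.

def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

for name, params, bits in [
    ("7B  fp16 ", 7, 16),
    ("13B fp16 ", 13, 16),
    ("34B fp16 ", 34, 16),
    ("70B 4-bit", 70, 4),
]:
    print(f"{name}: ~{weight_vram_gb(params, bits):.0f} GB of weights")

# 7B  fp16 : ~14 GB of weights
# 13B fp16 : ~26 GB of weights
# 34B fp16 : ~68 GB of weights
# 70B 4-bit: ~35 GB of weights
```

So a 70B model at 4-bit is ~35GB of weights before overhead, which is how the practical requirement lands in the low-to-mid 40s of GB. Notably, by this arithmetic 34B at fp16 needs more memory than 70B at 4-bit.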