Where and how to run Goliath 120b GGUF with good performance?

abandonedexplorer · 3 years ago

Where and how to run Goliath 120b GGUF with good performance?

Worldly-Mistake-8147 · 3 years ago

I’m sorry for a little side-track, but how much context you able to squeeze into your 3 GPUs with Goliath’s 4bit quant?
I’m considering to add another 3090 to my own doble-GPU setup just to run this model.

panchovix · 3 years ago

I tested 4K and it worked fine at 4.5bpw. Max will be prob about 6k. I didn’t use 8bit cache

Now 4.5bpw is kinda overkill, 4.12~ bpw is like 4bit 128g gptq, and that would let you use a lot more context.

Dead_Internet_Theory · 3 years ago

That is awesome. What kind of platform do you use for that 3 GPUs setup?

panchovix · 3 years ago

Basically for now an X670E mobo + Ryzen 7 7800X3D.

But if you want the full speed, a server mobo with ton of PCI-E lanes will work better.