First time testing a local text model, so I don’t know much yet. I’ve seen people with 8GB cards complaining that text generation is very slow, so I don’t have high hopes, but still… I think I need to change some configuration: when generating text, my SSD is at 100% usage reading 1–2 GB/s, while my GPU doesn’t reach 15% usage.
Using RTX 2060 6GB, 16GB RAM.
This is the model I am testing (mythomax-l2-13b.Q8_0.gguf): https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF/tree/main
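
The SSD-at-100% symptom usually means the model file does not fit in RAM plus VRAM, so llama.cpp memory-maps it and keeps re-reading weights from disk while the GPU sits mostly idle. Below is a minimal sketch of offloading some layers to the GPU, assuming the llama-cpp-python bindings; the n_gpu_layers count is an illustrative guess for 6 GB of VRAM, not a tuned value.

```python
# Sketch: load the GGUF with part of the layers offloaded to the GPU.
# Requires a CUDA-enabled build of llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="mythomax-l2-13b.Q8_0.gguf",
    n_gpu_layers=15,  # illustrative for 6 GB VRAM; raise until VRAM is nearly full
    n_ctx=2048,       # context window
)

out = llm("Hello,", max_tokens=32)
print(out["choices"][0]["text"])
```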

  • Civil_Ranger4687B · 10 months ago

    Never use the Q8_0 versions of GGUFs unless most or all of the model can comfortably fit into your VRAM. The Q6_K version is much smaller and almost the same quality.

    For your setup, I would use mythomax-l2-13b.Q4_K_M.gguf.
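
    For scale, the files in TheBloke's repo are roughly 7.9 GB for Q4_K_M versus 13.8 GB for Q8_0, so the smaller quant actually fits in 16 GB of RAM with room to offload layers to a 6 GB card. A sketch of fetching it with huggingface_hub (an assumed workflow, not from the thread):

    ```python
    # Sketch: download the recommended quant from the Hugging Face Hub.
    # pip install huggingface_hub
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="TheBloke/MythoMax-L2-13B-GGUF",
        filename="mythomax-l2-13b.Q4_K_M.gguf",
    )
    print(path)  # local cache path to pass to Llama(model_path=...)
    ```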

    • OverallBit9OPB · 10 months ago

      In my tests Q4 was giving me about the same tokens per second as Q5, so I decided to use Q5. This is my first time testing text generation locally with models; thank you very much for explaining, I am getting used to it now and understanding what the settings do.
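
      A rough way to back up that kind of comparison is to time a fixed completion with each quant. A sketch, again assuming llama-cpp-python; the Q5_K_M filename is a guess at which Q5 variant was used:

      ```python
      # Sketch: rough tokens-per-second comparison between two quants.
      import time
      from llama_cpp import Llama

      def tokens_per_second(model_path: str) -> float:
          llm = Llama(model_path=model_path, n_gpu_layers=15, verbose=False)
          start = time.perf_counter()
          out = llm("The quick brown fox", max_tokens=128)
          elapsed = time.perf_counter() - start
          return out["usage"]["completion_tokens"] / elapsed

      for path in ("mythomax-l2-13b.Q4_K_M.gguf", "mythomax-l2-13b.Q5_K_M.gguf"):
          print(path, f"{tokens_per_second(path):.1f} tok/s")
      ```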

      • Civil_Ranger4687B · 10 months ago

        Yeah, there’s so much to learn; I’m still figuring a lot out too.

        Good tip for settings: Play around mostly with temperature, top-p, and min-p.
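
        For concreteness, here is how those three samplers are exposed by llama-cpp-python; the values are illustrative starting points, and min_p requires a reasonably recent build (it was added to llama.cpp in late 2023):

        ```python
        # Sketch: the three sampling knobs mentioned above.
        from llama_cpp import Llama

        llm = Llama(model_path="mythomax-l2-13b.Q5_K_M.gguf", n_gpu_layers=15, verbose=False)

        out = llm(
            "Write one sentence about dragons.",
            max_tokens=64,
            temperature=0.8,  # higher = more varied token choices
            top_p=0.95,       # sample from the smallest token set covering 95% probability
            min_p=0.05,       # drop tokens below 5% of the top token's probability
        )
        print(out["choices"][0]["text"])
        ```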