So in the last few weeks I have been experimenting with LLMs on my personal laptop (as I’m rarely at home), but I’m gonna have my PC with me in a few days. When running models (MythoMax 13b, mostly Q6_K and Q5_K_M GGUF) I can definitely feel my laptop not liking it: slowdowns, crashes, service terminations and timeouts.

Now, the situation is this: I have unexpectedly gotten some money, which I want to invest in PC parts.
My PC currently has 16GB of DDR5 RAM and a GTX 1070 with 8GB VRAM.
The idea now is to buy a 96GB RAM kit (2x48) and Frankenstein the whole PC together with an additional Nvidia Quadro P2200 (5GB VRAM).

Would the whole “machine” suffice to run models like MythoMax 13b, Deepseek Coder 33b and CodeLlama 34b (all GGUF)?

Specs after: 112GB DDR5, 8GB VRAM and 5GB VRAM, CPU is a Ryzen 5 7500F

And the question I should have asked first: can the GTX 1070 and P2200 setup even work? Like, would text gen webui even detect both cards?

Sorry if that’s a dumb question.

  • a_beautiful_rhindB
    10 months ago

    13GB does not make for much, especially when part of it is used for graphics and both cards are old Pascal architecture.

    By all means, just put the card in and see where it gets you on 13b.
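
To put rough numbers on "13GB does not make for much", here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the bits-per-weight figures are approximate averages for these GGUF quant types, the 2GB reserve for context and the desktop is a guess, and the helper names `model_size_gb` and `fraction_on_gpu` are made up for this example. Real usage also depends on context length and how evenly the layers split across two cards.

```python
# Approximate average bits per weight for some GGUF quant types.
# These are rough, assumed figures, not exact values for any specific file.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q6_K": 6.59}


def model_size_gb(params_billion: float, quant: str) -> float:
    """Rough size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fraction_on_gpu(size_gb: float, vram_gb: float, reserve_gb: float = 2.0) -> float:
    """Share of the weights that could sit in VRAM.

    reserve_gb is an assumed allowance for the KV cache and desktop graphics.
    """
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(usable_gb / size_gb, 1.0)


# GTX 1070 (8GB) plus Quadro P2200 (5GB), as described in the post.
TOTAL_VRAM_GB = 8 + 5

candidates = [
    ("MythoMax 13b", 13, "Q5_K_M"),
    ("MythoMax 13b", 13, "Q6_K"),
    ("Deepseek Coder 33b", 33, "Q4_K_M"),
    ("CodeLlama 34b", 34, "Q4_K_M"),
]

for name, params_billion, quant in candidates:
    size = model_size_gb(params_billion, quant)
    share = fraction_on_gpu(size, TOTAL_VRAM_GB)
    print(f"{name} {quant}: ~{size:.1f} GB weights, ~{share:.0%} could fit in VRAM")
```

Under these assumptions the 13b quants should roughly fit across both cards, while the 33b/34b models would have a large share of their layers left in system RAM and running on the CPU, which is where most of the slowdown would come from.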