I am going to build an LLM server very soon, targeting 34B models (specifically phind-codellama-34b-v2 as a Q4 GGUF, or in GPTQ/AWQ form).

I am stuck between these two setups:

  1. 12400 + DDR5-6000 CL30 + 4060 Ti 16GB (GGUF; split the workload between CPU and GPU)
  2. 3090 (GPTQ/AWQ model fully loaded on the GPU)

I'm not sure the speed bump of the 3090 is worth the hefty price increase. Does anyone have benchmarks/data comparing these two setups?

BTW: Alder Lake CPUs run DDR5 in gear 2 (while AM5 runs DDR5 in gear 1; AM4 is DDR4-only). AFAIK gear 1 offers lower latency. Would this give AM5 a big advantage when it comes to LLMs?

  • fallingdowndizzyvrB
    1 year ago

    There’s really no comparison. The 4060s, even the Ti, have crap for memory bandwidth: 288GB/s in the case of the Ti, versus roughly 936GB/s on a 3090. DDR5 is also not fast enough to make much difference, so whatever part of the model spills over to the CPU only slows things down further. That combo is not going to be speedy. It in no way compares to a 3090.
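
    To put rough numbers on the bandwidth argument: token generation is mostly memory-bound, since each new token has to read more or less all of the weights once. The sketch below is back-of-envelope only. The 20 GB model size and the 14 GB / 6 GB GPU/CPU split are assumptions for illustration, the bandwidth figures are theoretical peaks (96 GB/s assumes dual-channel DDR5-6000), and `tokens_per_second_upper_bound` is a made-up helper, not part of any library.

```python
def tokens_per_second_upper_bound(parts):
    """Bandwidth-only ceiling on generation speed.

    parts: list of (weights_in_gb, bandwidth_in_gb_per_s) tuples, one per
    memory pool the weights are spread across. Assumes every weight is
    read once per token and ignores compute, PCIe transfers and KV cache.
    """
    seconds_per_token = sum(size_gb / bw for size_gb, bw in parts)
    return 1.0 / seconds_per_token


# Assumed figures, not measurements.
MODEL_GB = 20.0        # assumed size of a 34B model at ~4-bit quantization
GPU_SHARE_GB = 14.0    # assumed portion that fits in 16 GB VRAM after overhead
CPU_SHARE_GB = MODEL_GB - GPU_SHARE_GB

BW_4060_TI = 288.0                      # GB/s, theoretical peak
BW_3090 = 936.0                         # GB/s, theoretical peak
BW_DDR5_6000 = 6000e6 * 8 * 2 / 1e9     # dual channel, 8 bytes per transfer

# Setup 1: weights split between the 4060 Ti and system RAM.
split_tps = tokens_per_second_upper_bound(
    [(GPU_SHARE_GB, BW_4060_TI), (CPU_SHARE_GB, BW_DDR5_6000)]
)

# Setup 2: all weights in the 3090's VRAM.
full_tps = tokens_per_second_upper_bound([(MODEL_GB, BW_3090)])

print(f"4060 Ti + DDR5 split: <= {split_tps:.1f} tok/s")
print(f"3090 fully loaded:    <= {full_tps:.1f} tok/s")
```

    Under these assumptions the ceilings come out at about 9 tok/s for the split setup and about 47 tok/s for the 3090. Real throughput will be lower on both, but the ratio gives a feel for the gap.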