Comparing 4060 Ti 16GB + DDR5 6000 vs 3090 24GB: looking for 34B model benchmarks

regunakyle · 2 years ago

Comparing 4060 Ti 16GB + DDR5 6000 vs 3090 24GB: looking for 34B model benchmarks

fallingdowndizzyvr · 2 years ago

There’s really no comparison. The 4060s, even the Ti, have crap for memory bandwidth. 288GB/s in the case of the Ti. DDR5 is also not fast enough to make much difference. So that combo is not going to be speedy. It in no way compares to a 3090.
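To put rough numbers on the bandwidth argument, here is a back-of-envelope sketch. It assumes token generation is limited purely by memory bandwidth, so every new token has to read all the weights once. The 19 GB model size (a 34B model at about 4.5 bits per weight), the usable VRAM figures, and the function name are my assumptions for illustration. 936 GB/s is the 3090's spec bandwidth, and 96 GB/s is the theoretical peak of dual-channel DDR5-6000.

```python
# Back-of-envelope ceiling on generation speed, assuming decoding is limited
# purely by memory bandwidth: each new token reads every weight once.

def tokens_per_second_ceiling(model_gb, usable_vram_gb, gpu_gb_s, ram_gb_s):
    """Upper bound on tokens/s when weights that do not fit in VRAM sit in system RAM."""
    on_gpu_gb = min(model_gb, usable_vram_gb)
    in_ram_gb = model_gb - on_gpu_gb
    seconds_per_token = on_gpu_gb / gpu_gb_s + in_ram_gb / ram_gb_s
    return 1.0 / seconds_per_token

MODEL_GB = 19.0         # assumption: 34B weights at roughly 4.5 bits each
DDR5_6000_GB_S = 96.0   # dual channel: 6000 MT/s * 8 bytes * 2 channels

# Usable VRAM figures are assumptions that leave about 2 GB for context and buffers.
ceiling_4060ti = tokens_per_second_ceiling(MODEL_GB, 14.0, 288.0, DDR5_6000_GB_S)
ceiling_3090 = tokens_per_second_ceiling(MODEL_GB, 22.0, 936.0, DDR5_6000_GB_S)

print(f"4060 Ti 16GB + DDR5-6000: at most {ceiling_4060ti:.1f} T/s")
print(f"3090 24GB:                at most {ceiling_3090:.1f} T/s")
```

Under those assumptions the ceilings come out around 10 T/s for the 4060 Ti combo and around 49 T/s for the 3090. Real speeds will be lower on both, but the ratio is the point.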

mcmoose1900 · 2 years ago

Here’s a 7B llama.cpp bench on a 3090 and 7800X3D, with CL28 DDR5 6000 RAM.

All layers offloaded to GPU:

Generation:5.94s (11.6ms/T), Total:5.95s (86.05T/s)

And here it is with just 2/35 layers offloaded to CPU:

Generation:7.59s (14.8ms/T), Total:7.75s (66.10T/s)

As you can see, the moment you offload even a little bit to CPU, you are going to hit performance hard. More than a few layers and the hit is very severe.
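A rough way to read those two runs, as a sketch only: assume all 35 layers cost the same and that per-token time is just the sum of per-layer times. That ignores transfer overhead, the embedding and output layers, and anything else outside the layers. The variable names and the projection helper are hypothetical; the only measured inputs are the 11.6 ms/T and 14.8 ms/T figures above.

```python
# Estimate the per-layer cost on GPU vs CPU from the two runs above,
# assuming uniform layers and purely additive per-layer time.

LAYERS = 35
CPU_LAYERS_IN_SPLIT_RUN = 2
full_gpu_ms = 11.6   # ms per token, all layers on GPU (measured above)
split_ms = 14.8      # ms per token, 2 of 35 layers on CPU (measured above)

gpu_layer_ms = full_gpu_ms / LAYERS
gpu_part_ms = (LAYERS - CPU_LAYERS_IN_SPLIT_RUN) * gpu_layer_ms
cpu_layer_ms = (split_ms - gpu_part_ms) / CPU_LAYERS_IN_SPLIT_RUN

def projected_ms(n_cpu_layers):
    """Projected ms per token with n_cpu_layers on CPU, under the same assumptions."""
    return (LAYERS - n_cpu_layers) * gpu_layer_ms + n_cpu_layers * cpu_layer_ms

print(f"slowdown with 2 CPU layers: {split_ms / full_gpu_ms:.2f}x")
print(f"one CPU layer costs about {cpu_layer_ms / gpu_layer_ms:.1f} GPU layers")
for n in (0, 2, 10, 20, 35):
    ms = projected_ms(n)
    print(f"{n:2d} CPU layers: {ms:5.1f} ms/T, {1000 / ms:5.1f} T/s")
```

By this estimate each CPU layer costs about as much as six GPU layers on this machine, which is why the hit grows so quickly as more layers move off the GPU. Treat the projection as an extrapolation from two data points, not a benchmark.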

Here is exllamav2 for reference, though the time also includes prompt processing, so generation is actually faster than indicated:

3.91 seconds, 512 tokens, 130.83 tokens/second (includes prompt eval.)

regunakyle · 2 years ago

Thanks for your data! Can you run the test again with the Phind CodeLlama 34B model?

candre23 · 2 years ago

The 3090 will outperform the 4060 several times over. It’s not even a competition - it’s a slaughter.

As soon as you have to offload even a single layer to system memory (regardless of the speed), you cut your performance by an order of magnitude. I don’t care if you have screaming fast DDR5 in 8 channels and a pair of the beefiest Xeons money can buy, your performance will fall off a cliff the minute you start offloading. If a 3090 is within your budget, that is the unambiguous answer.