I've been in need of a dedicated training rig lately, so I've been looking at GPUs. For context, I'm already running training on a 16GB 3080 Ti laptop and inference on a 16GB 4060 Ti. Both are really just fine for 13B models.

When I'm looking at cards, though, it appears I could buy nearly four more 16GB 4060 Ti cards for the price of a single 24GB 4090.

I understand the 4090 is potentially 2-3x faster based on benchmarks, but does that actually translate to better Llama speeds? Would it even be viable to go with two 4060 Tis instead?
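To make the two-card question concrete, here's roughly the setup I'm picturing: a minimal sketch assuming the Hugging Face transformers + accelerate + bitsandbytes stack, where `device_map="auto"` shards the layers across both GPUs (the model name is just an example):

```python
# Minimal sketch: sharding a 13B model across two 16GB cards for inference.
# Assumes the Hugging Face transformers + accelerate + bitsandbytes stack;
# the model name below is just an example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-13b-hf"  # example model, swap in your own

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights: ~6.5GB for 13B params
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # accelerate splits the layers across both GPUs
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

My understanding is that `device_map="auto"` splits layers pipeline-style, so the cards mostly take turns rather than working in parallel; two 4060 Tis would buy me VRAM more than speed, which is part of why I'm asking.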

Currently I'm standardized on 16GB/13B/4-bit, but I'd love to push beyond that, have more VRAM for training, etc. What are my options?
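For reference, my current 16GB/13B/4-bit training recipe looks roughly like this (a QLoRA-style sketch; I'm assuming the peft + bitsandbytes + transformers stack, and the model name and hyperparameters are just illustrative):

```python
# Rough sketch of a 16GB/13B/4-bit training setup (QLoRA-style).
# Assumes peft + bitsandbytes + transformers; hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-13b-hf"  # example base model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for the frozen base
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # enables gradient checkpointing etc.

lora_config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # attention projections only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of the 13B trains
```

The frozen 4-bit base plus small LoRA adapters is what keeps a 13B trainable in 16GB at all; pushing to bigger models or longer contexts is where the VRAM ceiling starts to bite.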