https://www.amazon.se/-/en/NVIDIA-Tesla-V100-16GB-Express/dp/B076P84525 price in my country: 81,000 SEK (about 7,758 USD)
My current setup:
NVIDIA GeForce RTX 4050 Laptop GPU
CUDA cores: 2560
Memory data rate: 16.00 Gbps
My laptop GPU works fine for most ML and DL tasks. I am currently finetuning a GPT-2 model on some data that I scraped, and it has worked surprisingly well on my current setup, so it's not like I'm complaining.
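For context, this is roughly the kind of run I mean (a minimal sketch assuming Hugging Face transformers/datasets; "scraped.txt" is just a stand-in for my data):

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token                      # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "scraped.txt" is a placeholder for the scraped data
ds = load_dataset("text", data_files={"train": "scraped.txt"})["train"]
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=512),
            batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="gpt2-scrape",
    per_device_train_batch_size=2,                 # small batch so it fits in laptop VRAM
    gradient_accumulation_steps=8,
    fp16=True,                                     # mixed precision to save memory
    num_train_epochs=1,
)
Trainer(model=model, args=args, train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False)).train()
```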
I do, however, own a desktop PC with an old GTX 980, and I was thinking of replacing that with the V100.
So my question to this community is: for those of you who have bought your own super-duper GPU, was it worth it? And what were your experiences and realizations when you started tinkering with it?
Note: Please refrain from giving me snarky comments about using cloud GPUs. I am not interested in that (and I am in fact already using one for another ML task that doesn't involve finetuning). I am interested in hearing hardware hobbyists' opinions on this matter.
-
You want VRAM, like lots of folks have mentioned; there are some non-obvious things here: you can make smaller VRAM work with a reduced batch size or non-AdamW optimizers, but you trade off both speed and quality to do so.
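A rough sketch of those knobs in plain PyTorch (toy model and made-up sizes, just to show the micro-batch / gradient-accumulation / optimizer trade-off):

```python
import torch

# Toy model just to illustrate the VRAM knobs; swap in your real model/dataloader.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(768, 768).to(device)

# SGD keeps no per-parameter moment buffers, so it uses far less memory than AdamW,
# but you typically pay for that in convergence speed/quality.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

micro_batch = 2      # small per-step batch so activations fit in limited VRAM
accum_steps = 8      # accumulate gradients to mimic an effective batch of 16

optimizer.zero_grad()
for _ in range(accum_steps):
    x = torch.randn(micro_batch, 768, device=device)
    loss = model(x).pow(2).mean()
    (loss / accum_steps).backward()   # scale so the summed gradient matches one big batch
optimizer.step()
```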
-
You can split training across multiple GPUs; I use 2x 3060 12 GB, though a real 24 GB card would be better.
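For reference, a minimal DistributedDataParallel sketch with a toy model (launch with `torchrun --nproc_per_node=2`; swap in your real model and data):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")              # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(768, 768).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])  # gradients get all-reduced across the cards
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                          # stand-in training loop
        x = torch.randn(4, 768, device="cuda")
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```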
-
I don’t recommend a V100; you’d miss out on the bfloat16 datatype (it’s a Volta card, and hardware bf16 support only arrived with Ampere).
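If you want to check what you're working with, a quick PyTorch sketch:

```python
import torch

# V100 is Volta; hardware bf16 arrived with Ampere (3090, A100, ...).
print(torch.cuda.get_device_name(0))
print("bf16 supported:", torch.cuda.is_bf16_supported())

# Typical mixed-precision block; on a V100 you'd fall back to fp16 + a GradScaler instead.
layer = torch.nn.Linear(768, 768).cuda()
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = layer(torch.randn(8, 768, device="cuda"))
print(y.dtype)   # torch.bfloat16 when the card supports it
```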
-
I’d love a V100, but they go for stupid prices where 3090s and a whole host of other cards make more sense. I think even the RTX 8000 is cheaper, has more RAM, and is newer.
A V100 16GB is like $700 on eBay. An RTX 3090 24GB can be had for a similar amount.
Exactly, which has me wondering why the 3090 24GB isn’t mentioned more on this sub. Isn’t that actually the best option? Multiple of those.