How to upgrade to the next VRAM breakpoints, and is it worth it?

DominicanGreg · 1 year ago

How to upgrade to the next VRAM breakpoints, and is it worth it?

fediverser · 1 year ago

This post is an automated archive from a submission made on /r/LocalLLaMA, powered by Fediverser software running on alien.top. Responses to this submission will not be seen by the original author until they claim ownership of their alien.top account. Please consider reaching out to them let them know about this post and help them migrate to Lemmy.

Lemmy users: you are still very much encouraged to participate in the discussion. There are still many other subscribers on !localllama@poweruser.forum that can benefit from your contribution and join in the conversation.

Reddit users: you can also join the fediverse right away by getting by visiting https://portal.alien.top. If you are looking for a Reddit alternative made for and by an independent community, check out Fediverser.

wind_dude · 1 year ago

Or 2 a6000s. But yea $$$ matters.

AutomataManifold · 1 year ago

I think its work remembering that while the really big models take a lot of VRAM, they also quantize down to smaller sizes, so the numbers are slightly misleading.

fallingdowndizzyvr · 1 year ago

The easiest thing to do is to get a Mac Studio. It also happens to be the best value. 3x4090s at $1600 each is $4800. That’s just for the cards. Adding a machine to put those cards into will cost another few hundred dollars. Just the cost of 3x4090s put you into Mac Ultra 128GB range. Adding the machine to put those cards into puts you in Mac Ultra 192GB range. With those 3x4090s you only have 72GB of RAM. Both those Mac options give you much more RAM.

a_beautiful_rhind · 1 year ago

I’m not getting a super huge jump with the bigger models yet. Just a mild bump. I got a P100 to load the low 100s and have exllama work. That’s 64g of FP16 using vram.

For bigger I can use FP32 and put back the 2 more P40s. That’s 120g of vram. Also 6 vidya cards :P

It required building for this type of system from the start. I’m not made of money either, I just upgrade it over time.

-Automaticity · 1 year ago

If Nvidia isn’t upgrading GPU’s past 24GB for the RTX 50 series then that will probably factor into the open source community keeping models below 40b parameters. I don’t know the exact cutoff point. A lot of people with 12gb VRAM can run 13b models but you could also run 7b 8-bit with 16k context size. It will get increasingly difficult to run larger contexts with larger models.

Some larger open models are being released but there won’t be much community there to train on a bunch of datasets to the huge models to nail the ideal finetune.

nero10578 · 1 year ago

You don’t NEED 3090/4090s. A 3x Tesla P40 setup still streams at reading speed running 120b models.

nostriluu · 1 year ago

I have two questions:

what’s this going to look like in six months, with new Intel, AMD, ARM/RISC UMA, hybrid designs well supported and 7200mt+ DDR5 common?

Are the high memory models that much better? My impression is you get a lot of reliable utility out of good smaller models, from there it’s diminishing returns.

I had a honking system with two 3090s, but it felt a bit boondoggle-ish, I sold it and my current plan is to get something like a 4060ti-16gb and also use OpenAI’s API, so I can wait to see what develops, rather than spending it all now while it’s still early days. I can see how someone who is really developing LLMs would want more, but as a “consumer” this seems reasonable.

Even for the “just get a Mac studio,” it seems like the M3 can use more VRAM and is more optimized, so worth it to wait until the M3 Ultra comes out, unless you can get a bargain bin previous model.

DominicanGreg · 1 year ago

Parts wise, a threadripper + ASUS Pro WS WRX80E-SAGE SE WiFi II is already a 2k price floor.

each 4090 is 2-2.3k

each 3090 is 1-1.5k

so building a machine from scratch will run you easily 8-10k off 4090’s and 6-8k off 3090’s. If you already have some GPUS or parts you would still problaly need 2 or more extra gpu’s plus the space and power to run them.

to my specific situation i would have to grab the treadripper, mobo, a case, ram, 2 more cards im looking at potentially 5-7k worth of damage. OR… pay 8.6k for a mac pro m2 and get an entire extra machine to play with.

There’s definitely an entire Mac Pro M3 series on the way considering they just released the laptops, it’s only a matter of time for them to shoot out the announcements. So i would definitely feel a bit peeved if i bought the M2 tower only for a month or two later apple to release the m3 versions.