I'm considering purchasing a 3090, primarily for use with Code Llama. Is it a good investment? I haven't been able to find any relevant videos on YouTube and would like to understand more about its inference speed.
With a 3090 and sufficient system RAM, you can run 70B models, but they'll be slow: about 1.5 tokens/second, plus quite a bit of time for prompt ingestion. It's doable but not fun.
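A rough sketch of why a 70B model spills out of a single 24 GB card and ends up partly in system RAM (the 4.5 effective bits/weight is my assumption for a typical 4-bit quant with overhead; exact numbers vary by quant format):

```python
# Back-of-envelope: 70B weights at ~4.5 effective bits/weight vs. 24 GB VRAM.
params = 70e9
bits_per_weight = 4.5                       # assumed; varies by quant format
weights_gb = params * bits_per_weight / 8 / 1e9
vram_gb = 24
spill_gb = weights_gb - vram_gb             # layers that must live in system RAM
print(f"weights: {weights_gb:.1f} GB, spilled to system RAM: {spill_gb:.1f} GB")
```

Those spilled layers run on the CPU/PCIe path each token, which is why generation drops to the ~1.5 t/s range.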
one is not enough
With a dedicated 3090 (another card driving the OS/display), a 34B at 5 bpw just fits and runs very fast, around 10-20 t/s. The quality is good for my application, but I'm not coding.
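The "just fits" arithmetic, as a minimal sketch (assuming the full 24 GB is free because the display runs on another card; the leftover has to cover KV cache and activations, so usable context is limited):

```python
# Sizing a 34B model at 5 bits per weight against a dedicated 24 GB 3090.
params = 34e9
bpw = 5.0
weights_gb = params * bpw / 8 / 1e9   # raw weight storage
headroom_gb = 24 - weights_gb         # left over for KV cache / activations
print(f"weights: {weights_gb:.2f} GB, headroom: {headroom_gb:.2f} GB")
```

With everything resident in VRAM there's no CPU offload, which is why generation stays in the 10-20 t/s range rather than ~1.5.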