Currently I have 12+24GB of VRAM across two GPUs and I get Out Of Memory errors all the time when I try to fine-tune 33B models. 13B works fine, but the results aren't very good, so I'd like to try 33B. I wonder if it's worth replacing my 12GB GPU with a 24GB one. Thanks!
Start with LoRA rank=1, 4-bit, flash-attention-2, context 256, batch size=1, and scale up until you reach your maximum. QLoRA on a 33B definitely works on just 24GB; it worked back a few months ago.
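Roughly, that starting point looks like this. A minimal sketch, assuming transformers, peft, bitsandbytes, and flash-attn are installed; the model name is a placeholder for whatever 33B-class checkpoint you're tuning:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "huggyllama/llama-30b"  # placeholder 33B-class model

# 4-bit NF4 quantization (QLoRA-style)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # needs a recent transformers
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=1,                                # start at rank 1, raise until OOM
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Batch size 1 goes in your TrainingArguments; context 256 is enforced
# at tokenization time, e.g. tokenizer(text, truncation=True, max_length=256).
```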
I have some issues with flash attention. With 48GB I can go up to rank 512 with batch size 1 and max length 768. My last run was max length 1024, batch size 2, gradient accumulation 32, rank 128, and it gives pretty nice results.
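For reference, that last run translates to something like the following. A sketch only; the output path is a placeholder and the exact trainer wiring depends on your setup:

```python
from transformers import TrainingArguments
from peft import LoraConfig

lora_config = LoraConfig(r=128, lora_alpha=16, task_type="CAUSAL_LM")

training_args = TrainingArguments(
    output_dir="qlora-33b-out",       # placeholder path
    per_device_train_batch_size=2,    # batch size 2
    gradient_accumulation_steps=32,   # "gradient accumulation 32" above
    bf16=True,
)
# Max length 1024 is applied when tokenizing, e.g.:
# tokenizer(batch["text"], truncation=True, max_length=1024)
```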
Thanks for sharing!
It should work on a single 24GB GPU as well, with either QLoRA or alpaca_lora_4bit. You won't get big batches or big context, but it's good enough.
Thanks! I have some problems loading GPTQ models with the transformers loader.
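In case it helps, the plain transformers loading path for a GPTQ checkpoint looks roughly like this. A sketch, assuming optimum and auto-gptq are installed and a reasonably recent transformers; the repo name is just an example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/guanaco-33B-GPTQ"  # example GPTQ repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
# transformers detects the GPTQ quantization config in the repo and
# dispatches to auto-gptq kernels under the hood
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```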