Currently I have 12+24GB VRAM and I get Out Of Memory all the time when try to fine tune 33B models. 13B is fine, but the outcome is not very good so I would like to try 33B. I wonder if it’s worthy to replace my 12GB GPU with a 24GB one. Thanks!

  • kpodkanowiczB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    i have some issues with flash attention and with 48gb i can go up to 512 rank with batch size 1 and max len 768. My last run was 1024 max len, batch 2, gradient 32, rank 128 and gives pretty nice results