I have a dataset with a context length of 6k. I want to fine-tune a Mistral model with 4-bit quantization, but I ran into an error on a GPU with 24 GB of VRAM.

Is there a rule of thumb for how much VRAM is needed for a context length this large?
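For context, here's the crude back-of-the-envelope estimate I've been working from. All the constants are assumptions, not measurements (Mistral-7B-sized model, QLoRA-style 4-bit weights, gradient checkpointing so activation memory grows roughly linearly with sequence length); the real number depends heavily on batch size, the attention implementation, and the training framework.

```python
def estimate_vram_gb(seq_len, batch_size=1, n_layers=32,
                     hidden=4096, n_params_b=7.2):
    """Very rough VRAM estimate (GB) for QLoRA fine-tuning.

    Assumptions (guesses, not measurements):
    - 4-bit quantized base weights: ~0.5 bytes per parameter
    - small LoRA adapters + their optimizer states: ~0.5 GB flat
    - fp16 activations with gradient checkpointing, so memory
      scales roughly linearly with sequence length
    """
    weights_gb = n_params_b * 0.5                       # 4-bit base model
    lora_gb = 0.5                                       # adapters + optimizer
    acts_gb = batch_size * seq_len * hidden * n_layers * 2 / 1e9
    return weights_gb + lora_gb + acts_gb

# Rough estimate for a 6k context at batch size 1:
print(round(estimate_vram_gb(6144), 1))
```

By this sketch a 6k context should fit in 24 GB with batch size 1, so if it doesn't, the overhead probably comes from somewhere the formula ignores (no gradient checkpointing, padding to a longer length, eval batches, etc.).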

Thanks!