I’m using a Colab notebook with a T4 GPU to have Llama 2 summarize every table in a PDF. It works for the first 10 or so tables, and then I hit the dreaded CUDA out-of-memory error.
Each successive summarization call seems to leave memory allocated on the GPU. Is there a way to free the allocations from the previous call so usage doesn’t keep building up?
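Is the right fix something like dropping references to the per-call tensors and then calling `gc.collect()` and `torch.cuda.empty_cache()` between tables? Here’s a minimal sketch of what I have in mind (`summarize_table`, `model`, and `tokenizer` are placeholders, not my exact code):

```python
import gc
import torch

# Placeholder helper: summarize one table's text with an already-loaded
# Llama 2 model + tokenizer from Hugging Face transformers.
def summarize_table(model, tokenizer, table_text):
    prompt = f"Summarize the following table:\n{table_text}\nSummary:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # inference_mode prevents autograd from retaining activations,
    # which is one common source of creeping GPU usage
    with torch.inference_mode():
        output_ids = model.generate(**inputs, max_new_tokens=256)

    summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    # Drop the GPU-resident tensors before the next call, then
    # release PyTorch's cached blocks back to the driver
    del inputs, output_ids
    gc.collect()
    torch.cuda.empty_cache()
    return summary
```

My understanding is that `empty_cache()` only returns blocks from PyTorch’s caching allocator, so the bigger win would be making sure no Python references to the previous call’s tensors survive into the next iteration — but I’m not sure if that fully explains the buildup I’m seeing.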
(On a related note, long context doesn’t help much here without flash-attention, since standard attention’s memory use grows quickly with sequence length.)