I don't understand Mistral and context size, honestly.

anti-lucas-throwaway · 1 year ago

So I did some research, and after a while in the rabbit hole I think that sliding window attention is not implemented in ExLlama (or ExLlamaV2) yet, and it is not in the AMD ROCm fork of Flash Attention yet either.

I think that means it’s just unsupported right now. Very unfortunate, but I guess I’ll have to wait. Waiting for support is the price I pay for saving 900 euros by buying a 7900 XTX instead of a 4090. I’m fine with that.
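
For anyone else who lands here: as far as I understand it, sliding window attention just means each token only attends to itself and the few tokens right before it, instead of the whole context. Here is a minimal sketch of that mask in plain Python. The function name and the tiny sizes are mine, purely for illustration; this is not how ExLlama or Flash Attention would implement it.

```python
def sliding_window_mask(seq_len, window):
    """Build a causal sliding-window attention mask (illustrative helper).

    mask[i][j] is True when query position i may attend to key position j,
    i.e. j is at most `window - 1` positions behind i and never ahead of it.
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Toy sizes so the pattern is visible; real models use far larger windows.
mask = sliding_window_mask(seq_len=5, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each printed row is one query position, and the band of `x` marks slides to the right as the position increases. A backend without this support would presumably either attend over the full causal context or refuse the model, which is the gap I ran into.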