QuIP#: SOTA 2-bit quantization method, now implemented in text-generation-webui (experimental)
github.com
oobabooga4B to LocalLLaMA@poweruser.forum · English · 1 year ago
a_beautiful_rhind · 1 year ago
From the issue about this in the exllamav2 repo, QuIP# was using more memory and was slower than exl. How much context can you fit?