oobabooga4 to LocalLLaMA@poweruser.forum · English · 2 years ago
QuIP#: SOTA 2-bit quantization method, now implemented in text-generation-webui (experimental) (github.com)
a_beautiful_rhind · English · 2 years ago
From the issue about this in the exllamav2 repo, QuIP# was using more memory and was slower than exl. How much context can you fit?