oobabooga4 to LocalLLaMA@poweruser.forum · English · 11 months ago
QuIP#: SOTA 2-bit quantization method, now implemented in text-generation-webui (experimental)
github.com
a_beautiful_rhind · English · 11 months ago
From the issue about this in the exllamav2 repo, QuIP# was using more memory and was slower than exl. How much context can you fit?