oobabooga4B to LocalLLaMA@poweruser.forum · English · 1 year ago
QuIP#: SOTA 2-bit quantization method, now implemented in text-generation-webui (experimental)
Link: github.com · 6 comments
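For readers unfamiliar with the idea: 2-bit quantization stores each weight using only four possible code values. The sketch below is a deliberately naive uniform scheme just to illustrate the compression/round-off trade-off; QuIP# itself uses incoherence processing and lattice codebooks, not this. The function name `quantize_2bit` is made up for illustration.

```python
def quantize_2bit(weights):
    """Naive uniform 2-bit quantization: map each weight to one of 4 levels.

    Assumes the weights are not all identical (scale would be zero).
    Returns the 2-bit codes (ints 0..3) and the dequantized approximations.
    """
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 3  # 4 levels -> 3 intervals between them
    codes = [round((w - lo) / scale) for w in weights]   # each in {0, 1, 2, 3}
    dequant = [lo + c * scale for c in codes]            # reconstructed weights
    return codes, dequant

codes, approx = quantize_2bit([-1.5, -0.2, 0.4, 1.5])
# codes fit in 2 bits each; approx shows the rounding error introduced
```

The interesting part of methods like QuIP# is precisely how they avoid the large rounding error this naive scheme incurs.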
oobabooga4B to LocalLLaMA@poweruser.forum · English · 1 year ago, commenting on "Deepseek llm 67b Chat & Base":
I'm desensitized at this point. I wonder if this is yet another "Pretraining on the Test Set Is All You Need" marketing stunt or not, as most new models lately have been.
oobabooga4B to LocalLLaMA@poweruser.forum · English · 1 year ago
transformers library PR: GrammarConstrainedLogitsProcessor, compatible with llama.cpp GBNF
Link: github.com · 1 comment
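The underlying mechanism of grammar-constrained decoding: at each step, the set of tokens the grammar currently allows is computed, and every other token's logit is masked to negative infinity before sampling, so forbidden tokens get zero probability. A minimal self-contained sketch of that masking step over a toy vocabulary (this is the general technique, not the transformers PR's actual API):

```python
import math

def constrain_logits(logits, allowed_token_ids):
    """Mask logits so only grammar-allowed tokens can be sampled.

    Disallowed tokens get -inf, which zeroes their softmax probability.
    """
    return [l if i in allowed_token_ids else -math.inf
            for i, l in enumerate(logits)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 4-token vocabulary; suppose the grammar currently allows only tokens 1 and 3
logits = [2.0, 1.0, 0.5, 1.0]
probs = softmax(constrain_logits(logits, {1, 3}))
# probs for tokens 0 and 2 are exactly 0; sampling can never violate the grammar
```

A GBNF grammar file drives which token ids land in `allowed_token_ids` at each step; the masking itself is this simple.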
oobabooga4B to LocalLLaMA@poweruser.forum · English · 1 year ago, commenting on "TabbyAPI released! A pure LLM API for exllama v2.":
Gradio is a 70 MB requirement, FYI. It has become common to see people call text-generation-webui "bloated", when most of the installation size is in fact due to PyTorch and the CUDA runtime libraries.
https://preview.redd.it/pgfsdld7xw0c1.png?width=370&format=png&auto=webp&s=c50a14804350a1391d57d0feac8a32a5dcf36f68
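The size breakdown is easy to check in any environment. A small sketch (the helper name `package_size_mb` is made up here) that locates an installed package via `importlib` and sums its on-disk files:

```python
import importlib.util
import os

def package_size_mb(name):
    """Rough on-disk size of an installed package directory, in MB.

    Returns None for built-in modules or packages that are not installed.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    root = list(spec.submodule_search_locations)[0]
    total = 0
    for dirpath, _, filenames in os.walk(root):
        for f in filenames:
            total += os.path.getsize(os.path.join(dirpath, f))
    return total / 1e6

# e.g. package_size_mb("gradio") vs package_size_mb("torch") in a
# text-generation-webui environment would show where the space actually goes
```

Comparing `gradio` against `torch` and the bundled CUDA libraries this way is what the screenshot above illustrates.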