Looking for any model that can run with 20 GB VRAM. Thanks!
For some people “uncensored” means it hasn’t been lobotomized, but for others it means it can write porn.
Wonder what card you have that’s 20GB?
That’s actually so funny, the 2 times I’ve asked this before, I get downvoted to shit.
What are you looking for?
With a 3090, you can run any 13b model in 8 bit, group size 128, act order true, at decent speed.
Go-tos for the more spicy stuff would be Mythomax and Tie fighter.
Best experience I had was with TheBloke/Wizard-Vicuna-30B- Uncensored-GGML
Best 30B llm so far in general. Censorship kill’s capabilities
I haven’t found one that is universally best regardless of the benchmarks. Same story with vector embeddings, you’ll need to test a few out for your own use case.
The best one I’ve found for my projects though is https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca and the AWQ implementation https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-AWQ.