I’m blown away. See for yourself.
https://migel.substack.com/p/a-conversation-with-tess
Tess, welcome to the world!
The model is open source with a 200K context length.
Available at: https://huggingface.co/migtissera/Tess-M-v1.0
do you have to download 71GB to try it?! :-)
How many tokens in your substack example?
Do you have examples of using the model for fiction in the 16K–40K token range?
Almost the same syntax as Yi Capybara. Excellent.
I propose all Yi 34B 200K finetunes use Vicuna-ish prompt syntax, so they can ALL be merged into one hellish voltron model.
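For anyone unfamiliar, the Vicuna-style template being proposed looks roughly like the sketch below. The exact system-prompt wording and whitespace vary between finetunes (Tess and Capybara each have their own flavor), so treat this as an illustration, not the canonical template:

```python
# Sketch of a Vicuna-style prompt builder. The system-prompt text and
# exact separators vary per finetune -- check each model card before use.
def build_vicuna_prompt(system: str, turns: list[tuple[str, str]], user_msg: str) -> str:
    parts = [system]
    for user, assistant in turns:  # prior conversation turns
        parts.append(f"USER: {user}")
        parts.append(f"ASSISTANT: {assistant}")
    parts.append(f"USER: {user_msg}")
    parts.append("ASSISTANT:")  # leave open for the model to complete
    return "\n".join(parts)

prompt = build_vicuna_prompt(
    "SYSTEM: You are a helpful assistant.",
    [("Hi!", "Hello! How can I help?")],
    "Summarize Moby-Dick in one sentence.",
)
print(prompt)
```

Because the templates are this close, merged models tend to respond sanely to either parent's prompt format.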
The deed is done:
https://huggingface.co/brucethemoose/Capybara-Tess-Yi-34B-200K
Seems coherent in transformers, I’m gonna quant it to exl2 and test it out.
Just wanted to come back and let you know I started using this last night, and this is fantastic. I haven’t put it through much testing yet, but on initial use I’m very impressed by this model as a general-purpose AI assistant. It’s keeping to the Assistant’s more informal speech patterns while also answering questions well and keeping up with large context. Those are 3 checkboxes I’ve never been able to check at once. This praise won’t get much visibility since it’s an older thread, but I wanted to let you know at least.
More random feedback: you should put some combination of Yi, 34B, and/or 200K in the title.
No one tags anything on HF, so the only way to browse models is by title. I would have totally missed this in my Yi/34B searches if not for the Reddit post.
Quantized GGUF here: https://huggingface.co/TheBloke/Tess-Medium-200K-v1.0-GGUF
And GPTQ https://huggingface.co/TheBloke/Tess-Medium-200K-v1.0-GPTQ
What’s the VRAM usage? A context that big can use an enormous amount…
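For a ballpark: at long context the KV cache dominates. Assuming Yi-34B’s config (60 layers, GQA with 8 KV heads of head dim 128 — these numbers are my assumption, verify against the model’s config.json) and an fp16 cache, a full 200K context works out to roughly:

```python
# Back-of-envelope KV-cache size estimate for long contexts.
# Layer/head numbers below are assumed Yi-34B values (60 layers,
# 8 KV heads of head_dim 128 under GQA) -- verify against config.json.
def kv_cache_bytes(n_layers: int, ctx_len: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    # 2x for keys and values; one cache entry per layer per token.
    return 2 * n_layers * ctx_len * n_kv_heads * head_dim * bytes_per_elem

gib = kv_cache_bytes(60, 200_000, 8, 128) / 2**30
print(f"{gib:.1f} GiB")  # -> 45.8 GiB at fp16, before counting weights
```

So on the order of 45 GiB for the cache alone at fp16, on top of the (quantized) weights — which is why people run these at much shorter contexts or with quantized/8-bit caches.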
Just on another note, this place is just super hostile! I didn’t think it would be, considering it’s the LocalLLaMA sub-reddit and we are all here to support open source or freely available models.
This is harsher than the Twitter mob!
I’ll still release models, but sorry guys, not coming here again.