Thanks to the hard work of kingbri, Splice86, and turboderp, we have a new API loader for LLMs using the exllamav2 backend! It's in a very early alpha state, so if you want to test it, be aware that things may change.
TabbyAPI also works with SillyTavern! With some extra configuration, you can point SillyTavern at it as a backend.
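For anyone who wants to poke at it from a script instead of SillyTavern, here's a minimal sketch of a completion request. The port, endpoint path, and API-key header are my assumptions (based on it exposing an OpenAI-style API), so check the TabbyAPI README for the actual defaults:

```python
# Hypothetical request against TabbyAPI's OpenAI-style completions endpoint.
# Host, port, path, and the x-api-key header are assumptions; the real
# values come from TabbyAPI's config file.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/completions",
    headers={"x-api-key": "your-api-key"},  # key set in TabbyAPI's config
    json={
        "prompt": "Once upon a time",
        "max_tokens": 100,
        "temperature": 0.8,
    },
)
# Assumes an OpenAI-shaped response body.
print(resp.json()["choices"][0]["text"])
```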
As a reminder, exllamav2 recently added mirostat, tfs, and min-p sampling, so if you were only using exllama_hf/exllamav2_hf in ooba to get those samplers, the HF wrappers are no longer needed.
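For reference, here's roughly what enabling those samplers looks like through exllamav2's own Python API. The attribute names below match ExLlamaV2Sampler.Settings as I understand it at the time of writing; double-check against the repo if they've changed:

```python
# A minimal sketch of the new samplers via exllamav2's Python API.
# Attribute names are per exllamav2 at the time of this post.
from exllamav2.generator import ExLlamaV2Sampler

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.min_p = 0.05        # drop tokens below 5% of the top token's probability
settings.tfs = 0.95          # tail-free sampling cutoff
settings.mirostat = True     # enable mirostat
settings.mirostat_tau = 5.0  # target surprise
settings.mirostat_eta = 0.1  # learning rate
```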
Enjoy!
Nice. A lightweight loader. It will free us from Gradio.
Gradio is a 70 MB requirement, FYI. It has become common to see people calling text-generation-webui "bloated", when most of the installation size is in fact due to PyTorch and the CUDA runtime libraries.
I think there is room for everyone. Text Gen is a piece of art; it's the only thing in the whole space that always works and is reliable. However, if I'm building an agent and putting together a Docker build, I can't afford to pull in text-gen and everything that comes with it.