Today I tried a number of private (local) opensource #GenAI #LLM servers in Docker. I only run LLM servers in Docker. Without Docker I’m pretty sure my desktop would quickly become an angry bag of snakes in no time (Snakes, pythons, geddit? 🐍 😁 ). For context, I’m evaluating these LLM components to figure out what part they might play in my Backchat plugin project for Backstage from Spotify (https://via.vmw.com/backchat)

Here’s what I discovered:

* PrivateGPT has promise. It offers an OpenAI API compatible server, but it’s much to hard to configure and run in Docker containers at the moment and you must build these containers yourself. If it did run, it could be awesome as it offers a Retrieval Augmented Generation (ingest my docs) pipeline. The project’s docs were messy for Docker use. (https://github.com/imartinez/privateGPT)

* OpenVINO Model Server. Offers a pre-built docker container, but seems more suited to ML rather than LLM/Chat use cases. Also, It doesn’t offer and OpenAI API. Pretty much a non-starter for my use case but an impressive project. (https://docs.openvino.ai/2023.1/ovms_what_is_openvino_model_server.html)

* Ollama Web UI & Ollama. This server and client combination was super easy to get going under Docker. Images have been provided and with a little digging I soon found a `compose` stanza. The chat GUI is really easy to use and has probably the best model download feature I’ve ever seen. Just one problem - doesn’t seem to offer OpenAI API compatibility which limits it’s effectiveness for my use case. (https://github.com/ollama-webui/ollama-webui)

In the end I liked Ollama/Ollama Web UI a lot. If OpenAI API compatibility gets added, it could be my go-to all round LLM project of choice - but not yet.

Ollama Web UI in Backstage

Backchat architecture