(Also posted on r/ML) I know this is the LocalLlama forum, but there has been some interest in the OpenAI Assistants API, so here goes. And of course a similar API service could be implemented with open/local models, so it's not entirely irrelevant!

With the OpenAI Assistants API released last week, a natural next question is: how can we have several assistants work together on a task?

This was a perfect fit for the Langroid multi-agent framework (which already works with the OpenAI chat completions API as well as local/remote LLMs).

For those interested in the details of working with this API, I wanted to share how we implemented near-complete support for the Assistant features in the Langroid agent framework:

https://github.com/langroid/langroid/blob/main/langroid/agent/openai_assistant.py

We created an OpenAIAssistant class derived from ChatAgent. In Langroid you wrap a ChatAgent in a Task object to enable a multi-agent interaction loop, and now the same can be done with an OpenAIAssistant object.
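Here's a rough sketch of what that looks like (not verbatim from the repo; the exact config fields and Task parameters may differ slightly, so check the linked file and the notebook for the real API). The idea is that a Task wrapping an OpenAIAssistant can delegate to another Task, just like with ChatAgents:

```python
# Rough sketch, assuming OpenAIAssistantConfig fields mirror ChatAgentConfig
from langroid.agent.openai_assistant import OpenAIAssistant, OpenAIAssistantConfig
from langroid.agent.task import Task

# One assistant asks questions to fill in required fields,
# the other answers from a document.
extractor = OpenAIAssistant(
    OpenAIAssistantConfig(
        name="Extractor",
        system_message="Ask questions to fill in the required fields.",
    )
)
doc_agent = OpenAIAssistant(
    OpenAIAssistantConfig(
        name="DocAgent",
        system_message="Answer questions based on the provided document.",
    )
)

extractor_task = Task(extractor, llm_delegate=True, single_round=False)
doc_task = Task(doc_agent, llm_delegate=False, single_round=True)

# The Extractor task delegates its questions to the DocAgent task
extractor_task.add_sub_task(doc_task)
extractor_task.run()
```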

I made a Colab notebook which gradually builds up from simple examples to a two-agent system for structured information extraction from a document:

https://colab.research.google.com/drive/190Tk7t4AdY1P9F_NlZ33-YEoGnHweQQ0

Our implementation supports function-calling as well as the built-in tools: retrieval (RAG) and the code interpreter. For the code interpreter we capture the generated code and its logs and display them in the interaction.
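For context, here's a hedged sketch of the underlying Assistants API calls involved (this is the raw OpenAI client, not Langroid's code): you enable the built-in tools when creating the assistant, and the run steps are where the code-interpreter input and logs can be pulled out for display:

```python
from openai import OpenAI

client = OpenAI()

# Create an assistant with both built-in tools enabled
assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="Use the code interpreter or file retrieval as needed.",
    tools=[{"type": "code_interpreter"}, {"type": "retrieval"}],
)

# ... after creating a thread, adding a message, and running the assistant,
# the run steps expose the code-interpreter input and its logs:
def show_code_logs(thread_id: str, run_id: str) -> None:
    steps = client.beta.threads.runs.steps.list(thread_id=thread_id, run_id=run_id)
    for step in steps.data:
        if step.step_details.type != "tool_calls":
            continue
        for call in step.step_details.tool_calls:
            if call.type == "code_interpreter":
                print("code:\n", call.code_interpreter.input)
                for out in call.code_interpreter.outputs:
                    if out.type == "logs":
                        print("logs:\n", out.logs)
```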

We make threads and assistants persistent by caching their IDs, keyed on username + machine + org, so that a later session can resume a previous thread and assistant. This is perhaps a simplistic scheme; I'm sure there are better ideas here.
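To make the idea concrete, here's a minimal, hypothetical sketch of that kind of ID caching (a toy file-based helper I made up for illustration, not Langroid's actual implementation): the assistant/thread IDs are stored under a key derived from user + machine + org, and looked up on the next session instead of creating new objects.

```python
# Hypothetical helper: cache assistant/thread IDs keyed on user + machine + org
import getpass
import hashlib
import json
import platform
from pathlib import Path

CACHE_FILE = Path.home() / ".assistant_cache.json"  # illustrative location

def cache_key(org: str) -> str:
    raw = f"{getpass.getuser()}-{platform.node()}-{org}"
    return hashlib.sha256(raw.encode()).hexdigest()

def save_ids(org: str, assistant_id: str, thread_id: str) -> None:
    cache = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}
    cache[cache_key(org)] = {"assistant_id": assistant_id, "thread_id": thread_id}
    CACHE_FILE.write_text(json.dumps(cache))

def load_ids(org: str) -> dict | None:
    if not CACHE_FILE.exists():
        return None
    return json.loads(CACHE_FILE.read_text()).get(cache_key(org))
```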

A key feature that is currently disabled is caching of assistant responses: it's turned off because the API does not allow storing assistant responses in threads.

In any case, hope this is useful to some folks, as I’ve seen a lot of questions about this API in various forums.