Hey guys,

I’m running the quantized version of mistral-7B-instruct and it’s pretty fast and accurate for my use case. On my PC I’m generating approximately 4 tokens per second, which is good enough for what I need: one-sentence responses for my NPCs.

After fiddling around with oobabooga a bit I found out that you can perform API calls on localhost and print out the text, which is exactly what I need for this to work.
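
A call like that might look roughly like the sketch below. It assumes the server was started with its API enabled and exposes an OpenAI-style completions route at http://127.0.0.1:5000/v1/completions; the URL, the port, the sampling values, and the helper names (build_payload, extract_text, generate) are assumptions for illustration, so adjust them to whatever your setup actually serves.

```python
import json
import urllib.request

# Assumed endpoint: an OpenAI-style completions route on localhost.
# The host, port, and path depend on how the server was launched.
API_URL = "http://127.0.0.1:5000/v1/completions"


def build_payload(prompt, max_tokens=40):
    """Build the JSON body for a short, one-sentence NPC reply."""
    return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.7}


def extract_text(response):
    """Pull the generated text out of an OpenAI-style completion response."""
    return response["choices"][0]["text"].strip()


def generate(prompt, url=API_URL, timeout=60):
    """POST the prompt to the local server and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_text(json.load(reply))
```

With the server running, something like generate("The innkeeper greets the traveler and says:") should return the model's reply as a plain string that the game can display.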

The issue I’m running into is this: if I were to make a game with AI-generated content, how can I make it easy for players to run their own localhost server and make API calls from the game this way? I feel like setting all this up would be a nightmare for inexperienced users, and I don’t want to alienate non-tech-savvy players.