I’m using Windows 11 and OOBE. I use SillyTavern as well, but not always.
I’ve been playing with 20B models (which work great at 4096 context) and 70B ones (too slooow unless I drop the context to 2048, which makes them usable, but the low context sucks).
What else am I missing? I see there are some 34B models now for ExLlamaV2, but I’m having issues getting them to work well, either quality-wise (which PROFILE do I use??) or speed-wise (what context setting? This is not the 200k context version)…
For your recommended model, what are the best settings for it on a single-card system? (4090, 96GB of RAM, i9-13900K)
Any suggestions for the best experience are appreciated (for creative, RPG/Chat/Story usage).
Thank you.
I’m a huge fan of MLewd-ReMM-L2-Chat-20B at the moment. I use the 6-bit quant and have found it at times to be similar in quality to the roleplays I used to have with ChatGPT 3.5 before "Open"AI nerfed it into oblivion. Hardly ever have to reroll.
So far with the local models, I’ve just done storybook-format RPGing, without a game system, dice, rolls, etc., which I used to do with ChatGPT…
Do you have a prompt template that gamifies it and works well for you, and that you would be willing to share?
Not entirely. I mainly just use it for open-ended stories. However, if I’m doing a second-person text adventure, I will sometimes put information pertaining to the “game” in memory (or the system prompt, if in LM Studio), like this:
Scenario: Second person survival text adventure game set in the African savannah, hundreds of millions of years ago.
Inventory: Stone knife, dried meat, berries (4), half-filled water skin.
Current quest: Go hunting
In my experience so far, it does a really good job of remembering exactly what’s in my inventory, and if my character picks something up in-game, I add it to the memory.
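If editing that block by hand gets tedious, a small script can keep the state and regenerate the text to paste into memory. This is only a rough sketch of the idea: the `GameState` class and its method names are made up for illustration and are not part of SillyTavern, LM Studio, or any other tool mentioned here.

```python
from dataclasses import dataclass, field


@dataclass
class GameState:
    """Hypothetical container for the text-adventure state kept in memory."""
    scenario: str
    quest: str
    inventory: dict[str, int] = field(default_factory=dict)  # item name -> count

    def pick_up(self, item: str, count: int = 1) -> None:
        # Add a new item or increase the count of one already carried.
        self.inventory[item] = self.inventory.get(item, 0) + count

    def use(self, item: str, count: int = 1) -> None:
        # Decrease the count and drop the item entirely once it reaches zero.
        remaining = self.inventory.get(item, 0) - count
        if remaining < 0:
            raise ValueError(f"not enough {item}")
        if remaining == 0:
            self.inventory.pop(item, None)
        else:
            self.inventory[item] = remaining

    def memory_block(self) -> str:
        # Render the same three-line layout as the example above;
        # counts are shown in parentheses only when greater than one.
        items = ", ".join(
            name if n == 1 else f"{name} ({n})"
            for name, n in self.inventory.items()
        )
        return (
            f"Scenario: {self.scenario}\n"
            f"Inventory: {items}\n"
            f"Current quest: {self.quest}"
        )
```

After each turn where something changes, you would call `pick_up` or `use` and paste the fresh `memory_block()` output back into memory or the system prompt.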
If you wanted more game-like systems like stats and dice rolls, I suppose you could keep track of those externally, and your rolls could be done through a site like random.org and just tell the LLM your result.
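If you would rather roll locally than through a website, something like the following could stand in for it. It is a minimal sketch using Python's standard `random` module; the `roll` and `describe_check` helpers, the d20-plus-modifier rule, and the bracketed message format are all my own assumptions, not an established system.

```python
import random


def roll(dice: int = 1, sides: int = 20, modifier: int = 0, rng=random) -> int:
    """Roll `dice` dice with `sides` sides each and add a flat modifier."""
    return sum(rng.randint(1, sides) for _ in range(dice)) + modifier


def describe_check(stat: str, modifier: int, difficulty: int, rng=random) -> str:
    """Roll a d20 check and format the result as a line to send to the LLM."""
    total = roll(1, 20, modifier, rng)
    outcome = "success" if total >= difficulty else "failure"
    return f"[{stat} check: rolled {total} against difficulty {difficulty}: {outcome}]"


if __name__ == "__main__":
    # Example: a Strength check with a +2 modifier against difficulty 12.
    print(describe_check("Strength", 2, 12))
```

You would then paste the printed line into your next message so the model narrates the outcome instead of deciding it.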