I’m using Windows 11 and OOBE. I use SillyTavern as well, but not always.

I’ve been playing with 20b models (they work great at 4096 context) and 70b ones (too slooow unless I drop the context to 2048, which makes them usable, but the low context sucks).

What else am I missing? I see there are some 34b models now for ExLlamaV2, but I’m having issues getting them to work well, either quality-wise (which PROFILE do I use??) or speed-wise (what context setting? This is not the 200k context version)…

For your recommended model, what are the best settings on a single-card system? (4090, 96GB of RAM, i9-13900K)

Any suggestions for the best experience are appreciated (for creative, RPG/chat/story usage).

Thank you.

  • GyramuurB
    10 months ago

    Not entirely; I mainly just use it for open-ended stories. However, if I’m doing a second-person text adventure, I will sometimes put information pertaining to the “game” in memory (or in the system prompt, if in LM Studio), like this:

    Scenario: Second person survival text adventure game set in the African savannah, hundreds of millions of years ago.

    Inventory: Stone knife, dried meat, berries (4), half-filled water skin.

    Current quest: Go hunting

    In my experience so far, it does a really good job of remembering exactly what’s in my inventory, and if my character picks something up in-game, I add it to the memory.
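
    If editing that block by hand gets tedious, something like the following could regenerate it after each change. This is only a rough sketch: `GameMemory`, `pick_up`, `use`, and `render` are made-up names for illustration, not part of LM Studio or any other frontend, and the output format simply mirrors the example above. You would still paste the rendered text into memory or the system prompt yourself.

```python
from dataclasses import dataclass, field


@dataclass
class GameMemory:
    """Hypothetical container for the text kept in memory / the system prompt."""

    scenario: str
    inventory: dict = field(default_factory=dict)  # item name -> count
    quest: str = ""

    def pick_up(self, item: str, count: int = 1) -> None:
        """Add `count` of an item, creating the entry if needed."""
        self.inventory[item] = self.inventory.get(item, 0) + count

    def use(self, item: str, count: int = 1) -> None:
        """Remove `count` of an item; drop the entry when it reaches zero."""
        remaining = self.inventory.get(item, 0) - count
        if remaining < 0:
            raise ValueError(f"not enough {item!r} in inventory")
        if remaining == 0:
            self.inventory.pop(item, None)
        else:
            self.inventory[item] = remaining

    def render(self) -> str:
        """Build the block of text to paste into memory."""
        items = ", ".join(
            name if count == 1 else f"{name} ({count})"
            for name, count in self.inventory.items()
        )
        return (
            f"Scenario: {self.scenario}\n\n"
            f"Inventory: {items}.\n\n"
            f"Current quest: {self.quest}"
        )


if __name__ == "__main__":
    memory = GameMemory(
        scenario="Second person survival text adventure game.",
        inventory={"Stone knife": 1, "dried meat": 1, "berries": 4},
        quest="Go hunting",
    )
    memory.pick_up("flint")       # the character found something in-game
    memory.use("berries", 2)      # and ate a couple of berries
    print(memory.render())
```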

    If you wanted more game-like systems such as stats and dice rolls, I suppose you could keep track of those externally, do your rolls through a site like random.org, and just tell the LLM the result.
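
    As a rough idea of what "externally" could look like, here is a small sketch that rolls dice locally and formats a line to paste into the chat. Note that it swaps random.org for Python's built-in pseudo-random generator, and that `roll`, `describe_check`, the d20-plus-stat rule, and the bracketed output format are all assumptions made up for this example, not anything the LLM or a frontend requires.

```python
import random
import re

# Matches simple dice notation such as "d20", "2d6", or "3d8+2".
_DICE = re.compile(r"(\d*)d(\d+)([+-]\d+)?")


def roll(notation, rng=None):
    """Roll dice given in NdM+K notation and return the total."""
    match = _DICE.fullmatch(notation.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized dice notation: {notation!r}")
    count = int(match.group(1) or 1)
    sides = int(match.group(2))
    modifier = int(match.group(3) or 0)
    rng = rng or random  # fall back to the module-level generator
    return sum(rng.randint(1, sides) for _ in range(count)) + modifier


def describe_check(stat_name, stat_value, difficulty, rng=None):
    """Roll 1d20 + stat against a difficulty and format a line for the LLM."""
    result = roll("1d20", rng) + stat_value
    outcome = "success" if result >= difficulty else "failure"
    return f"[{stat_name} check: rolled {result} vs difficulty {difficulty}: {outcome}]"


if __name__ == "__main__":
    stats = {"Strength": 3, "Stealth": 1}  # hypothetical character sheet
    print(describe_check("Stealth", stats["Stealth"], difficulty=12))
```

    You would then paste the printed line into your next message and let the model narrate the outcome.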