  • Hmm.

    Well, there is the Target Length (tokens) setting in SillyTavern's Advanced Formatting tab.

    I’ve got it set to 200, as above, with the Response (tokens) setting set to 300.

    The “target” is the setting I’ve got at 200. The setting at 300 is merely a “cap” the response can’t go over.

    So I’d start by changing Target Length (tokens) to 100 and setting your Response (tokens) cap to, say, 150-175 to give it a bit of wiggle room.

    If that doesn’t work, try removing the “be verbose” part of what I wrote (if you’re using it), or edit that part to “Write multiple brief, fresh sentences, paragraphs, and phrases.”
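
    If it helps to see that distinction in API terms, here’s a rough sketch of how I think of it (a hypothetical Python call against koboldcpp’s /api/v1/generate endpoint; the prompt wording and numbers are just for illustration). The target only shapes the instructions in the prompt, while the cap is actually enforced by the backend:

    ```python
    import requests

    TARGET_TOKENS = 100  # soft goal: just a hint woven into the prompt text
    CAP_TOKENS = 160     # hard limit: enforced by the backend's sampler

    # The "target" is nothing more than instructions the model may follow.
    prompt = (
        "Write multiple brief, fresh sentences, paragraphs, and phrases. "
        f"Aim for roughly {TARGET_TOKENS} tokens.\n\nUser: Hi!\nAssistant:"
    )

    resp = requests.post(
        "http://localhost:5001/api/v1/generate",  # koboldcpp's default port
        json={"prompt": prompt, "max_length": CAP_TOKENS},  # the hard cap
    )
    print(resp.json()["results"][0]["text"])
    ```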


  • After seeing your comment, I tried the OpenHermes-2.5-neural-chat-7B-v3-1-7B-GGUF model you mentioned.

    Unfortunately, set up the way I am, it didn’t respond very well for me.

    To be frank, I don’t think the concept of that merge is very good.

    OpenHermes is fantastic. If I had to state its flaws, I’d say its prose is a bit dry and the dialogue tends to speak past you rather than clearly responding to you. Those are only issues for roleplay, really.

    From all I’ve read, neuralchat is much the same (though tbh I’ve not got neuralchat to work particularly well for me at all), so I’d expect any merge created from those two models to be a bit lacking in the roleplay department.

    That said, if you want a model for more professional purposes, it might be worth further testing.

    For roleplay, Misted-7B is leagues better, at least in my testing with my setup.


  • /u/reiniken has reminded me of one important point I didn’t touch on much.

    It’s important to replicate the style you want the AI to write in, both in the first message and in your own replies, to help the AI keep replicating the format.

    So if you follow my guide, write the narration in 3rd person and add some bracketed thoughts in the introduction message of your card.

    That’s why in my examples I write my own actions in 3rd person. You don’t have to; from my testing the AI can keep to the format without it, but I think writing your own narration in 3rd person helps the AI keep its narration in 3rd person too. If it sees your narration in 1st person, it could be tempted to write its narration in 1st person.
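
    To make that concrete, a card’s introduction message following this format might look something like this (a made-up example, not from one of my actual cards):

    ```
    Sarah glances up from her book as the door creaks open. [A visitor? At this hour?] She sets the book aside and offers a cautious smile. "Oh... hello. I didn't hear you come in."
    ```

    Your own replies would then keep the same shape: 3rd-person narration, bracketed thoughts, quoted dialogue.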


  • I use koboldcpp.

    You are probably right about it being a bug. At first I couldn’t get the model to work at all (it crashed koboldcpp when loading up), but that was just because I had a week-old version of koboldcpp; I needed to download the version that came out like 4 days ago (at that time), ha! Then it loaded up fine, but with that already-mentioned quirk. I guess it will get fixed in short order.

    Yeah, the future of local LLMs lies in the smaller models for sure!


  • I was very impressed by Rocket 3B in the brief time I tested it. Unfortunately I’ve only ever used Mistral-based 7Bs, so I can’t compare it to older 7Bs, but it wouldn’t surprise me if the benchmark results are accurate and it really is as good as the old 7Bs.

    I’m glad I tried it, as now I know to keep an eye on 3B progress. It might not be too long before 3Bs are performing at the level of current Mistral 7Bs!

    One weird thing, though: it crashed for me when I attempted to load in all the layers, even though I had the VRAM space, and when loading 32/35 layers it gave me the same inference speed as when I load 32/35 layers of a 7B.
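
    For anyone unfamiliar, that layer split is just koboldcpp’s --gpulayers flag; the launch looks something like this (the model filename is a placeholder):

    ```
    python koboldcpp.py rocket-3b.Q4_K_M.gguf --gpulayers 32
    ```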


  • I could only get pretty muddled responses from the model.

    Despite it seemingly having a simple prompt template, I suspect I didn’t enter all the data correctly into SillyTavern, as the outputs I was getting were similar to when I have the wrong template selected for a model.

    Shrugs

    If a model's creators want it to be successful, they should really pick a standard template (preferably ChatML) and clearly state that that's what they're using.
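
    For reference, ChatML wraps every turn in explicit role markers, which leaves very little ambiguity about how to fill in a frontend’s template fields:

    ```
    <|im_start|>system
    You are a helpful assistant.<|im_end|>
    <|im_start|>user
    Hello!<|im_end|>
    <|im_start|>assistant
    ```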