Joined 2 years ago
Cake day: October 27th, 2023
naptastic to LocalLLaMA@poweruser.forum • 1.3B with 68.29% Humaneval lol, don't behead me. Part of my project PIC (partner-in-crime) • English
1 · 2 years ago
That definitely works better. I wouldn’t trust it too far though. It just told me I can remove the first part of a file with one seek() and one truncate() call…
naptastic to LocalLLaMA@poweruser.forum • 1.3B with 68.29% Humaneval lol, don't behead me. Part of my project PIC (partner-in-crime) • English
1 · 2 years ago
Just selecting StarChat, it instantly became conversational. :+1:
naptastic to LocalLLaMA@poweruser.forum • 1.3B with 68.29% Humaneval lol, don't behead me. Part of my project PIC (partner-in-crime) • English
1 · 2 years ago
Ok, it finally downloaded and I’ve spent a few minutes with it. It keeps getting into endless pathways of jargon (e.g., “fair play make world communal environment tolerant embraces diversity embrace equity promote unity instill resilience proactive leadership”), and it just goes on like that, with no punctuation and no connecting words, until it reaches the token limit. What loader and settings work best with this model?

It’s important that we not disclose all our test questions, or models will continue to overfit and underlearn. Now, to answer your question:
When evaluating a code model, I look for questions with easy answers, then tweak them slightly to see if the model gives the easy answer or figures out that I need something else. I’ll give one example out of tens*:
Most of the models I’ve tested will give a correct answer to the wrong question: seek(1024) and truncate(). That keeps the first 1 KiB of the file and removes everything after it, which is the opposite of removing the first part.
(*I’m being deliberately vague about how many questions I have for the same reason I don’t share them. Also it’s a moving target.)
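To make the difference concrete, here is a minimal Python sketch. The helper names (`truncate_after`, `remove_first`, `make_sample`) are made up for illustration and are not from any model's output, and the "remove the first part" version is just one simple way to do it (read the remainder into memory and rewrite it), assuming the file fits in RAM.

```python
import os
import tempfile


def truncate_after(path, n):
    """The 'easy answer': seek(n) then truncate().

    Keeps the first n bytes and drops everything after them.
    """
    with open(path, "r+b") as f:
        f.seek(n)
        f.truncate()  # truncates at the current position, i.e. n


def remove_first(path, n):
    """Drop the first n bytes by moving the remainder to offset 0.

    Assumption: the remainder fits in memory. A real tool would copy
    in chunks or write to a temporary file and rename it.
    """
    with open(path, "r+b") as f:
        f.seek(n)
        rest = f.read()   # everything after the first n bytes
        f.seek(0)
        f.write(rest)     # overwrite from the start of the file
        f.truncate()      # cut off the stale tail left behind


def make_sample():
    """Create a temp file: 1 KiB of b'A' followed by 2 KiB of b'B'."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"A" * 1024 + b"B" * 2048)
    return path


if __name__ == "__main__":
    for func in (truncate_after, remove_first):
        path = make_sample()
        func(path, 1024)
        with open(path, "rb") as f:
            data = f.read()
        print(func.__name__, len(data), data[:1])
        os.remove(path)
```

The first helper leaves only the `A` bytes; the second leaves only the `B` bytes. Getting the second result takes more than one seek() and one truncate(), which is the point of the test question.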