I’ve been playing with a lot of models around 7B, but I’m now prototyping something that I think would be fine with a 1B model. The only one I’ve seen at this size is Phi-1.5, and I haven’t found a way to run it efficiently so far; llama.cpp still hasn’t implemented it, for instance.
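Plain Hugging Face Transformers should load it, roughly like the sketch below (the model id and the trust_remote_code flag are what I’d expect from the model card, so double-check it), but that’s ordinary PyTorch inference rather than the efficient quantized path I’m after:

```python
# Rough sketch: running Phi-1.5 with plain Hugging Face Transformers.
# Model id and trust_remote_code are assumptions based on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # fp16 on GPU roughly halves memory vs fp32; stay in fp32 on CPU
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    trust_remote_code=True,
).to(device)

prompt = "Explain briefly why correlation does not imply causation."
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```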
Does anyone have an idea of what to use?
DeepSeek-Coder has a 1.3B model that I believe outperforms 13B models. I’ll check back once I find a link.
Edit: found it https://evalplus.github.io/leaderboard.html
Thanks! But I’m not looking for one that does coding, more for one that’s good at detecting fallacies and reasoning. Phi-1.5 seems like a better fit for that.
I would still give it a try. It’s misleading to think these coding models are only good at coding; training on code has actually been shown to improve a model’s scores across multiple benchmarks.
RWKV 1.5B. It’s SOTA for its size, outperforms TinyLlama, and, being an RNN, uses no extra VRAM to hold its whole context length, even when running in the browser.
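Not the in-browser route, but if you just want to poke at it from Python, Transformers has RWKV support. A rough sketch (the 1.5B checkpoint id is an assumption on my part, so check the RWKV org on the Hub for the exact name):

```python
# Rough sketch: trying an RWKV checkpoint through Hugging Face Transformers.
# The model id below is an assumption; look up the exact ~1.5B checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/rwkv-4-1b5-pile"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "A straw man argument is"
inputs = tokenizer(prompt, return_tensors="pt")
# Because RWKV is an RNN, its recurrent state is fixed-size, so memory use
# doesn't grow with context length the way a Transformer KV cache does.
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```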
TinyLlama is 1.1B?
I mean, yeah, but it’s not done training AFAIK, and it’s not fine-tuned either.