So RWKV v5 7B is 60% trained now. I saw that the multilingual capabilities are better than Mistral now, and the English capabilities are close to Mistral, except for HellaSwag and ARC, where it's a little behind. All the benchmarks are on the RWKV Discord, and you can google the pros/cons of RWKV, though most of what's out there is about v4.

Thoughts?
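
If anyone wants to sanity-check the English numbers themselves instead of digging through the Discord, something like the sketch below with EleutherAI's lm-evaluation-harness is the usual way to run HellaSwag/ARC. The checkpoint name is just a placeholder and the exact Python API differs a bit between harness versions, so treat this as a starting point rather than the exact setup behind the posted numbers:

    # Rough sketch using EleutherAI's lm-evaluation-harness (v0.4-style Python API).
    # The checkpoint is a placeholder; point it at whichever RWKV or Mistral
    # checkpoint you actually want to compare.
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=RWKV/rwkv-4-169m-pile",
        tasks=["hellaswag", "arc_easy", "arc_challenge"],
        batch_size=8,
    )

    # Per-task accuracies end up under results["results"]
    for task, metrics in results["results"].items():
        print(task, metrics)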

    • ambient_temp_xenoB · 1 year ago

      Seems amazingly good. I might get a real use out of a Raspberry Pi after all.

    • MoffKalastB · 1 year ago

      Well, it seems a lot better at Slovenian than the LLaMAs or Mistral, especially for a 3B model, although it mostly just rambles about stuff that’s vaguely related to the prompt and makes lots of grammatical mistakes. The 7B one ought to be interesting once it’s done.

      • vatsadevOPB · 1 year ago

        It's trained on 100+ languages; the focus is multilingual.

        • alchemist1e9B · 1 year ago

          Will that make it a good translator? I remember seeing a 400+ language translation model somewhere, but it wasn't an LLM. I wonder what the best open-source, fast, high-quality many-language translation solutions might look like.

  • AaaaaaaaaeeeeeB · 1 year ago

    Would the amount of RAM used at the end of a 16k or 32k context be less than Mistral's?

    Is the t/s at the end the same as at the beginning?

    Looks like something to test in kobold.cpp later if nobody has done those tests yet.
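
    In case nobody has run them yet, here's a rough sketch of the kind of measurement loop that would do it, with Hugging Face transformers standing in for kobold.cpp (the checkpoint and context sizes are placeholders, so swap in whatever RWKV or Mistral build is actually being tested):

        import time
        import psutil
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        # Placeholder checkpoint; substitute the RWKV (or Mistral) model being tested.
        MODEL_ID = "RWKV/rwkv-4-169m-pile"

        tok = AutoTokenizer.from_pretrained(MODEL_ID)
        model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
        model.eval()

        def measure(prompt_len, gen_tokens=64):
            """Generate after a random prompt of `prompt_len` tokens; report t/s and process RSS."""
            ids = torch.randint(0, tok.vocab_size, (1, prompt_len))
            start = time.time()
            with torch.no_grad():
                model.generate(ids, max_new_tokens=gen_tokens, do_sample=False)
            elapsed = time.time() - start
            rss_gb = psutil.Process().memory_info().rss / 1e9
            return gen_tokens / elapsed, rss_gb

        for ctx in (512, 4096, 16384):  # arbitrary prompt lengths to compare
            tps, rss = measure(ctx)
            print(f"prompt={ctx:6d}  {tps:6.1f} tok/s  RSS={rss:.2f} GB")

    RSS includes the model weights, so the interesting part is whether the number grows with the prompt length, not its absolute value.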

    • AaaaaaaaaeeeeeB · 1 year ago

      RWKV-4 7B doesn't increase RAM usage at all with --nommap at 13k context in koboldcpp. Is that normal? Is there no KV cache and no extra RAM usage for context?

    • vatsadevOPB · 1 year ago

      That's the point of RWKV: you could have a 10 million token context length and it would use the same amount of memory as a 100-token context.
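
      The intuition, as a toy sketch (not the actual RWKV kernels, and the shapes are made up): a transformer's KV cache grows with every token it has seen, while an RNN-style model like RWKV folds all history into one fixed-size state, so memory stays flat no matter how long the context gets.

          import numpy as np

          D = 64  # toy hidden size, for illustration only

          # Transformer-style decoding: the KV cache stores every past token,
          # so memory after t tokens is O(t * D).
          kv_cache = []
          def transformer_step(x):
              kv_cache.append((x, x))  # keep this token's key and value around forever
              return sum(v for _, v in kv_cache) / len(kv_cache)  # stand-in for attention

          # RNN-style decoding (the RWKV idea): all history is folded into a
          # fixed-size state, so memory is O(D) at 100 tokens or 10 million.
          state = np.zeros(D)
          def rnn_step(x, decay=0.9):
              global state
              state = decay * state + (1 - decay) * x  # stand-in for the WKV recurrence
              return state

          for _ in range(1000):
              tok = np.random.randn(D)
              transformer_step(tok)
              rnn_step(tok)

          print("KV cache entries after 1000 tokens:", len(kv_cache))     # 1000, keeps growing
          print("Recurrent state shape after 1000 tokens:", state.shape)  # (64,), constant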