MoffKalast to LocalLLaMA@poweruser.forum • 40x or more speedup by selecting important neurons
10 months ago
I doubt it; most of their leverage comes from being the only supplier of the hardware required for pretraining foundation models. This doesn't really change that.
Well, it seems a lot better at Slovenian than Llama or Mistral, especially for a 3B model, although it mostly just rambles about stuff that's vaguely related to the prompt and makes lots of grammatical mistakes. The 7B one ought to be interesting once it's done.