Two sets of base models from China (Yuan 2.0-2B, 51B, 102B and XVERSE-7B, 13B, 65B)

Illustrious_Sand6784 · 2 years ago

Aaaaaaaaaeeeee · 2 years ago

DeepSeek 67B still beats XVERSE-65B on the benchmark scores.
The benchmarks indicate strong math and coding performance for these two model series.
Yuan has a unique optional attention mechanism (Localized Filtering-based Attention, going by the Yuan 2.0 paper title) that is supposed to enhance output quality.

fallingdowndizzyvr · 2 years ago

I’m really interested in having a 51B model. I would love something between 34B and 65/70B.

Dead_Internet_Theory · 2 years ago

>XVERSE 7B, 13B and 65B
Either they are ripping off Meta and not telling us about it, or there’s some reason why the ~30B parameter models are being ignored. It’s the perfect size for a 24GB card! Bummer.
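
For anyone wondering why ~30B is the sweet spot for 24GB, here is a rough back-of-the-envelope sketch. It assumes 4-bit quantized weights and counts the weights only; the KV cache, activations, and runtime overhead come on top and are not modeled. The function name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width.
    KV cache, activations, and framework overhead are NOT included.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Sizes mentioned in this thread, at an assumed 4 bits per weight.
for size in (13, 30, 51, 65):
    print(f"{size}B @ 4-bit: {weight_memory_gib(size, 4):.1f} GiB")
```

Under those assumptions a 30B model needs roughly 14 GiB for weights, which leaves headroom for context on a 24GB card, while 65B does not fit at all and 51B only just squeezes in before counting any context.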