- 1 Post
- 2 Comments
Joined 2 years ago
Cake day: November 10th, 2023
DataLearnerAI to LocalLLaMA@poweruser.forum • Yi-34B vs Yi-34B-200K on sequences <32K and <4K (English)
1 · 2 years ago
In most cases, models with extended context are optimized for long sequences. If your sequences are not very long, the regular model is usually the recommended choice.
Alibaba open-sourced a 72B model called Qwen-72B: Qwen/Qwen-72B · Hugging Face
It supports Chinese and English. The performance on MMLU is remarkable.