Anyone Tried Adding New Languages to Open Source AI Models? Need Advice!

nefarkederki · 2 years ago

WaterdanceAC · 2 years ago

I haven't worked on this personally, but I like to keep an eye out for projects like this. Some resources and thoughts: for a dataset, take a look at MADLAD-400 (https://huggingface.co/datasets/allenai/MADLAD-400). Also, the bilingual Arabic/English project Jais found that training the model with some coding ability proved helpful (https://www.cerebras.net/blog/jais-a-new-pinnacle-in-open-arabic-nlp). Good luck!
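
If it helps, here is a minimal sketch of one way to read the "some coding ability" point: blending target-language text, English text, and code into a single training stream. Everything in it is an assumption for illustration. The `mix_sources` helper, the source names, and the weights are hypothetical and are not the ratios or the method Jais used; the toy lists stand in for real corpora such as a MADLAD-400 language split.

```python
import random
from itertools import cycle


def mix_sources(sources, weights, n_samples, seed=0):
    """Draw n_samples documents, choosing a source for each draw by weight.

    sources: dict mapping a source name to a list of documents.
    weights: dict mapping the same names to relative sampling weights.
    Returns a list of (source_name, document) pairs.
    """
    rng = random.Random(seed)
    names = list(sources)
    # cycle() lets a small source repeat instead of running dry.
    iterators = {name: cycle(sources[name]) for name in names}
    probs = [weights[name] for name in names]

    mixed = []
    for _ in range(n_samples):
        name = rng.choices(names, weights=probs, k=1)[0]
        mixed.append((name, next(iterators[name])))
    return mixed


# Toy stand-ins for real corpora (hypothetical content).
sources = {
    "target_lang": ["target-language doc 1", "target-language doc 2"],
    "english": ["english doc 1", "english doc 2"],
    "code": ["def add(a, b):\n    return a + b"],
}

# Placeholder weights, chosen only for the example.
weights = {"target_lang": 0.6, "english": 0.3, "code": 0.1}

mixed = mix_sources(sources, weights, n_samples=1000)
counts = {name: sum(1 for src, _ in mixed if src == name) for name in sources}
print(counts)
```

The weights are the knob you would actually tune. A reasonable way to pick them is to try a few mixes and compare on a held-out set in the new language.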