XTTSv2 is released. I’d say it’s a big jump in quality.
- Better voice cloning
- Better audio
- Impressive prosody and expressiveness
- More languages added, I believe 16 in total
- Non-EN languages sound way better
- Streaming latency under 200 ms (on my 3090)
- Finetuning code
You can try it here: https://huggingface.co/spaces/coqui/xtts
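
If you want to check the streaming number on your own GPU, here is a rough sketch of how time-to-first-chunk could be measured. I'm assuming the "under 200 ms" figure means the delay before the first audio chunk arrives. `time_to_first_chunk` and `fake_stream` are hypothetical names made up for this example, and `fake_stream` is only a stand-in for whatever streaming generator your XTTS setup exposes; none of this is Coqui's own code.

```python
import time
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def time_to_first_chunk(chunks: Iterable[T]) -> Tuple[float, T, Iterator[T]]:
    """Return (latency in seconds, first chunk, iterator over the remaining chunks)."""
    it = iter(chunks)
    start = time.perf_counter()
    first = next(it)  # blocks until the generator yields its first chunk
    latency = time.perf_counter() - start
    return latency, first, it


def fake_stream(n_chunks: int = 3, delay_s: float = 0.05) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS generator (not a real XTTS call)."""
    for i in range(n_chunks):
        time.sleep(delay_s)  # pretend the model is synthesizing audio
        yield bytes([i]) * 4  # 4 dummy bytes in place of real audio samples


if __name__ == "__main__":
    latency, first, rest = time_to_first_chunk(fake_stream())
    print(f"first chunk after {latency * 1000:.0f} ms")
    for chunk in rest:
        pass  # play or buffer the remaining chunks here
```

Swap `fake_stream()` for the real streaming generator to get a number for your hardware. It's worth doing one warm-up run first, since the first call on a GPU usually includes one-off setup cost.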