Impressive! Really polyglot!
WER = word error rate
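For readers who have not met the metric: WER is usually computed as the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model's output, divided by the number of words in the reference. Below is a minimal sketch of that standard definition. The function name `word_error_rate` and the plain whitespace tokenization are my own assumptions; how MAI-Transcribe-1.5 or FLEURS scoring normalizes text (casing, punctuation) is not stated in the announcement.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace with no further
    normalization; real benchmarks typically normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words, so it returns 2/6, roughly 0.33. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.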
"... It transcribes a full hour of audio in under 15 seconds. Best-in-class on FLEURS accuracy. Leads the accuracy-speed Pareto frontier. ..."
"... The model handles 43 languages with a single system. It is optimized for diverse accents, dialects, and real-world acoustic conditions. ..."
MAI-Transcribe-1.5 (official website)