Monday, June 08, 2026

Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER and Up to 5x Faster Long-Audio Transcription

Impressive! Really polyglot!

WER = word error rate
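
For readers new to the metric: WER is usually computed as the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. Below is a minimal sketch of that standard definition. The function name `word_error_rate` and the simple whitespace tokenization are my own assumptions for illustration; this is not how Microsoft or Artificial Analysis score the model, and real benchmarks normally normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

By this definition, a 2.4% WER corresponds to roughly 2 to 3 word errors per 100 reference words.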

"... It transcribes a full hour of audio in under 15 seconds. Best-in-class on FLEURS accuracy. Leads the accuracy-speed Pareto frontier. ..."

"... The model handles 43 languages with a single system. It is optimized for diverse accents, dialects, and real-world acoustic conditions. ..."

Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription - MarkTechPost "Microsoft's Superintelligence team has shipped MAI-Transcribe-1.5, a production-focused speech-to-text model with expanded language coverage, domain-aware keyword biasing, and faster long-form inference."

MAI-Transcribe-1.5 (official website)
