Microsoft AI has announced two major developments in its mission to create specialized AI models: the launch of MAI-Voice-1, its first in-house speech generation model, and the public testing of MAI-1-preview, a large language model built on a mixture-of-experts architecture. These releases mark a significant step in Microsoft’s strategy to combine proprietary systems, partner collaborations, and open-source efforts to power Copilot and other AI products at scale.
MAI-Voice-1 is already deployed in Copilot Daily and Podcasts, and is now accessible via Copilot Labs for broader experimentation. The model is engineered for highly expressive, multi-speaker audio suited to interactive experiences such as storytelling, guided meditations, and conversational companions. According to Microsoft, the system can generate “a full minute of audio in under a second on a single GPU,” which the company says makes it one of the most efficient voice generation systems currently available. The company framed this as a key step toward its belief that “Voice is the interface of the future for AI companions.”
In parallel, Microsoft has opened testing of MAI-1-preview, a large-scale LLM trained using approximately 15,000 NVIDIA H100 GPUs. The model, structured as a mixture-of-experts, is available for evaluation on LMArena, with a phased rollout expected across Copilot’s text-based experiences in the coming months. “We have big ambitions for where we go next – model advancements, an exciting roadmap of compute, and the chance to reach billions of people through Microsoft’s products,” said Mustafa Suleyman, CEO of Microsoft AI.
The announcement also underscored Microsoft’s expanding AI infrastructure, with the MAI GB200 cluster now operational, providing the compute foundation for future large-scale model development. The company noted that API access is currently being offered to trusted testers, with feedback from early adopters helping to refine the models ahead of a wider release.
Positioning itself as a “lean, fast-moving lab,” Microsoft AI confirmed ongoing hiring efforts to scale its research and development agenda. With MAI-Voice-1 and MAI-1-preview, Microsoft aims to advance its speech and text generation capabilities and to reinforce its ambition to make AI companions more natural, powerful, and widely accessible.