At the 2026 Build developer conference, Microsoft announced a significant expansion of its in-house MAI model family, built by the Microsoft AI Superintelligence team. The company launched its first general-purpose reasoning model, MAI-Thinking-1, a code model for GitHub Copilot, MAI-Code-1, and updated versions of its speech, transcription, and image generation models, rounding out its end-to-end AI stack. The release marks Microsoft's accelerating push in foundation models, extending from speech and images into complex reasoning and developer productivity.

Microsoft said the MAI model family has kept growing over the past year. It released MAI-Voice-1 and MAI-1-preview, then MAI-Transcribe-1 and MAI-Image-2 earlier this year, followed by MAI-Image-2.5, which improved text rendering, stylized illustration, and commercial image quality. The latest release builds on that base by adding reasoning and coding models and upgrading the voice, transcription, and image lines at the same time, producing a more complete product portfolio.
MAI-Thinking-1 is the first reasoning model Microsoft has officially announced. It was trained from scratch by the Microsoft AI team and was not distilled from other models. Microsoft emphasized that the model was trained on clean, commercially licensed, enterprise-grade data and is designed to meet enterprise requirements for data compliance and commercial use. MAI-Thinking-1 is a mid-sized model with 35 billion active parameters and a 128K context window. It mainly targets complex multi-step instruction following, long-context reasoning, and code generation.
Although Microsoft did not disclose detailed benchmark data in the announcement, its blog cited independent evaluation results and said that in blind tests, reviewers preferred MAI-Thinking-1 overall to Anthropic's Claude Sonnet 4.6. Microsoft also stated that on the SWE-bench Pro coding benchmark, MAI-Thinking-1 performed comparably to Claude Opus 4.6, suggesting the model's potential for developers and complex engineering tasks. MAI-Thinking-1 is currently in private preview for select customers via Microsoft Foundry.
In image generation, Microsoft's previously released MAI-Image-2.5 and its "flash variant" are now available to developers through Microsoft Foundry. According to the latest Arena leaderboard data cited by Microsoft, MAI-Image-2.5 has surpassed Google's Nano Banana Pro in text-to-image generation and entered the top three on the list. The model has been integrated into PowerPoint and is gradually rolling out to OneDrive, bringing higher-quality image generation to the Office ecosystem.
In speech transcription, Microsoft released MAI-Transcribe-1 in April this year. It supports speech-to-text in 25 languages, chosen as the most commonly used according to Microsoft's own product usage data. The upgraded MAI-Transcribe-1.5, launched this time, reaches industry-leading speech recognition accuracy, according to Microsoft, and expands language coverage to 43. Microsoft plans to add streaming transcription to the model soon to serve real-time scenarios.
In speech synthesis, after announcing general availability of MAI-Voice-1 in April this year, Microsoft has now released MAI-Voice-2 and its lightning version. The new-generation speech model supports more than 15 additional languages and offers more voice styles, suiting a wider range of applications such as multilingual customer service, content dubbing, and smart assistants.
For developer coding scenarios, Microsoft also launched MAI-Code-1, an efficient reasoning model for code that is optimized for GitHub workloads. The model is available in GitHub Copilot and Visual Studio Code, where it supports everyday coding, refactoring, code completion, and similar tasks. Although Microsoft has not disclosed specific benchmark results for MAI-Code-1, the release is seen as an important signal: Microsoft no longer relies entirely on OpenAI and Anthropic for the models underlying GitHub Copilot and is gradually introducing its own.
On distribution, in addition to serving enterprises and developers through Microsoft Foundry, Microsoft announced that its MAI models will be available on third-party platforms such as Fireworks AI, Baseten, and OpenRouter. Fireworks AI has also been made generally available within Microsoft Foundry, giving enterprise customers more architecture and deployment options. By working with multiple platforms, Microsoft hopes to lower the barrier to access and speed up adoption of MAI models across different cloud and tool ecosystems.
Taken together, Microsoft is assembling a complete enterprise-grade AI capability matrix from MAI models spanning reasoning, coding, speech, transcription, and images. With the addition of MAI-Thinking-1 and MAI-Code-1, Microsoft has significantly strengthened its position in complex reasoning and developer productivity, and it now has a more competitive technical foundation for GitHub Copilot, the Office suite, and its collaboration platforms.