In the early morning of December 27, OpenAI announced,
Image source: OpenAI
According to previous media reports, OpenAI did not specify the "upstream provider" related to the problem, but its exclusive cloud provider Microsoft reported that there was a "power problem" in one of its data centers. The problem occurred at the same time as the OpenAI problem and affected North America. At the same time, there were also problems with Xbox cloud games. Just after 5 p.m. ET on December 26, Microsoft said it had "fully restored power" to the affected data center.
As of the end of this summer, ChatGPT has more than 200 million daily active users. Since its release, popular OpenAI products including ChatGPT have experienced multiple outages.
The most recent large-scale outage occurred on December 11, a few days after the release of Sora. All services under OpenAI, including ChatGPT, API, and Sora, experienced severe performance degradation or even complete unavailability from 3:16 pm to 7:38 pm Pacific time on December 11, lasting more than four hours. This outage was due to misconfiguration of the newly deployed telemetry service, which overloaded the control planes of hundreds of Kubernetes clusters around the world, causing cascading failures of critical systems.