As the 2025 Atlantic hurricane season draws to a close, early assessments show that Google DeepMind's AI weather model has outperformed traditional physics-based models in both forecast accuracy and speed, a shift that could reshape the field of meteorology. DeepMind's Weather Lab has been releasing tropical cyclone forecasts since June, and they have been considerably more accurate than those of the Global Forecast System (GFS) used by the National Weather Service.

University of Miami climate scientist Brian McNoldy analyzed forecast data for 13 storms this season. He found that the DeepMind AI model had a lower average position error than the GFS model at lead times of up to five days. At the 120-hour lead time, the DeepMind model's track error was 165 nautical miles, while the GFS's was 360 nautical miles, more than twice as large.
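
To make the comparison concrete, the following is a minimal sketch of how an average track (position) error is commonly computed: the great-circle distance between each forecast position and the observed position, averaged over all cases. The function names and the sample coordinates are hypothetical and are not taken from McNoldy's analysis; only the 165 and 360 nautical mile figures come from the article.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (assumed spherical Earth)
KM_PER_NMI = 1.852         # one nautical mile in kilometres (exact by definition)
EARTH_RADIUS_NMI = EARTH_RADIUS_KM / KM_PER_NMI


def track_error_nmi(forecast, observed):
    """Haversine great-circle distance in nautical miles.

    Both arguments are (latitude, longitude) tuples in degrees.
    """
    lat1, lon1 = map(math.radians, forecast)
    lat2, lon2 = map(math.radians, observed)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_NMI * math.asin(math.sqrt(a))


def mean_track_error_nmi(pairs):
    """Average the track error over (forecast, observed) position pairs."""
    errors = [track_error_nmi(f, o) for f, o in pairs]
    return sum(errors) / len(errors)


# Hypothetical forecast/observed positions, for illustration only.
sample_pairs = [
    ((25.0, -75.0), (26.0, -75.0)),
    ((30.0, -70.0), (30.0, -72.0)),
]
print(f"mean track error: {mean_track_error_nmi(sample_pairs):.1f} nmi")

# Ratio of the 120-hour errors reported in the article (GFS vs. DeepMind).
ratio = 360 / 165
print(f"GFS error is {ratio:.2f}x the DeepMind error")  # 2.18x
```

Operational verification involves further choices, such as which storms and lead times are included and which observed track is used as the reference, so this sketch shows only the core distance calculation.
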
This difference stems mainly from the two systems' different technical approaches. GFS relies on explicit physical equations to run three-dimensional simulations of atmospheric motion, which requires a large amount of computation on NOAA supercomputers. The DeepMind system uses neural networks trained on decades of meteorological data and can generate forecasts in minutes on conventional GPU clusters, without the need for huge computing infrastructure.
Hurricane researcher Michael Lowry pointed out that AI models can continuously learn from past prediction errors and adjust themselves, which physical systems cannot do. He also said that AI models can be retrained quickly on new data and that he expects their rate of improvement to be exponential, whereas traditional models can only be optimized incrementally.
Moreover, the DeepMind model also outperformed the official forecasts produced by human experts, as well as the consensus models that combine the output of multiple models to reduce bias. If confirmed by official verification data, this would be the first time that AI has outperformed both automated consensus guidance and human-produced forecasts in the Atlantic basin.


The current analysis does not cover the European Centre for Medium-Range Weather Forecasts (ECMWF) model, which has historically been considered the global benchmark. Historical data show that ECMWF's tropical cyclone track accuracy is similar to or slightly better than that of official forecasts, but this season's data suggest that even ECMWF would struggle to surpass DeepMind's performance.
The DeepMind system's strong performance has sparked discussion about the future role of traditional numerical weather prediction. Physical models must solve fluid dynamics, radiative transfer and thermodynamic equations at millions of grid points, which requires huge computing power and is susceptible to truncation errors. Data-driven neural networks, by contrast, infer dynamic processes directly from global reanalysis data without solving those equations explicitly.
These AI models are "deep generative models" that can learn high-dimensional patterns. DeepMind reportedly uses an encoder-decoder architecture optimized for temporal and spatial prediction, handling track and intensity forecasts simultaneously in a single network. The system has also been stable this season in predicting maximum wind speeds and pressure changes, areas where physical models remain inconsistent.

Meanwhile, GFS's performance this year has puzzled meteorologists. The model was upgraded to the FV3 dynamical core in 2019, but its performance regressed after the transition. Persistent model biases and track divergence have left frontline forecasters increasingly distrustful of its guidance on tropical systems.
Lowry and others speculate that gaps in observational data at the National Weather Service due to federal budget constraints have exacerbated the problem, but no official assessment has been released.