The value of observations for weather prediction in the age of machine learning

By: Prof Sarah Dance (Professor of Data Assimilation)

Last week, on 25th February 2025, our colleagues at ECMWF (European Centre for Medium-Range Weather Forecasts) took their deep-learning-based global weather forecasting system, known as the AIFS, into operational production, running alongside their physics-based numerical weather prediction system. The AIFS outperforms state-of-the-art physics-based models on several measures of accuracy and is computationally much faster to run. However, both systems currently need to be initialized with a best estimate of the current state of the atmosphere, created using a process called data assimilation.

Tens of millions of atmospheric observations are used in weather prediction every day, but observations alone cannot describe the weather at all points on the globe. To get a complete picture, we use data assimilation to optimally combine the observations with information from a physics-based computer model, taking account of our confidence in each source of data. The resulting analysis is used as a starting point for both physics-based and machine-learning-based weather forecasts. Reanalysis, which applies data assimilation to historic observations, is also a key component of the training data for machine learning forecasting systems.
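To make the idea of weighting each source by our confidence in it concrete, here is a minimal sketch for a single quantity, such as the temperature at one location. It assumes one model value (the "background") and one observation of the same quantity, each with a known error variance and with independent errors. The function name, variable names and numbers are illustrative only; operational systems solve a vastly larger version of this problem.

```python
def assimilate(background, background_var, observation, observation_var):
    """Combine a model background and an observation of the same quantity.

    Each source is weighted by the inverse of its error variance, so the
    more trusted source pulls the analysis closer to itself.
    Assumes the two errors are unbiased and independent of each other.
    """
    # Weight given to the observation (between 0 and 1).
    gain = background_var / (background_var + observation_var)
    # Move the background towards the observation by that weight.
    analysis = background + gain * (observation - background)
    # The analysis is more certain than either source on its own.
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var


# Hypothetical numbers: the model says 12 degrees (error variance 4),
# a thermometer says 10 degrees (error variance 1).
analysis, analysis_var = assimilate(
    background=12.0, background_var=4.0,
    observation=10.0, observation_var=1.0,
)
print(f"analysis = {analysis:.2f}, error variance = {analysis_var:.2f}")
```

Because the observation is assumed here to be more accurate than the model, the analysis lands closer to the observation, and its error variance is smaller than that of either input.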

Figure: Schematic of the global observing system from the World Meteorological Organization https://community.wmo.int/en/observation-components-global-observing-system

In recent decades, steady improvements in global numerical weather prediction accuracy have been driven by enhancements to data assimilation methods and increasing volumes of observations assimilated (e.g., Bauer et al., 2015). However, new observations are expensive, with new satellites costing billions of pounds, so investments in such systems are evidence-driven. To inform these financial decisions, data assimilation scientists have traditionally carried out quantitative experiments to estimate how much new observations improve forecasts from physics-based numerical weather prediction systems (e.g., Hu et al., 2025).
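One simple way to picture such an experiment is to run the system twice, once with and once without a given observation, and compare the resulting errors. The toy sketch below does this for a single quantity using error variances only. It is an illustrative assumption rather than the method of any cited study: real experiments compare full forecasts against verification data, and all names and numbers here are hypothetical.

```python
def analysis_variance(background_var, observation_vars):
    """Error variance after combining a background with several observations.

    Assumes all errors are unbiased and mutually independent, so the
    precisions (inverse variances) of the sources simply add up.
    """
    precision = 1.0 / background_var
    for obs_var in observation_vars:
        precision += 1.0 / obs_var
    return 1.0 / precision


def denial_impact(background_var, observation_vars, index):
    """Increase in analysis error variance when one observation is withheld."""
    with_all = analysis_variance(background_var, observation_vars)
    withheld = [v for i, v in enumerate(observation_vars) if i != index]
    without_one = analysis_variance(background_var, withheld)
    return without_one - with_all


# Hypothetical setup: background variance 4, two observations with
# error variances 1 and 2.
obs_vars = [1.0, 2.0]
for i, obs_var in enumerate(obs_vars):
    impact = denial_impact(4.0, obs_vars, i)
    print(f"withholding observation {i} (variance {obs_var}) "
          f"raises analysis variance by {impact:.3f}")
```

In this toy setting the more accurate observation has the larger impact when withheld, which is the kind of evidence used to compare the value of different observing systems.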

In the age of machine-learning forecasting, the connection between each observation type and forecast accuracy is less transparent. It is not yet clear how to measure the impact of specific observations on the training of machine learning models, or on their initialization. This is a critical question to address to ensure continued improvement in forecast accuracy and better resilience to hazardous weather events.

References 

Bauer, P., Thorpe, A. & Brunet, G. The quiet revolution of numerical weather prediction. Nature 525, 47–55 (2015). https://doi.org/10.1038/nature14956

Hu, G., Dance, S.L., Fowler, A., Simonin, D., Waller, J., Auligne, T., et al. (2025) On methods for assessment of the value of observations in convection-permitting data assimilation and numerical weather forecasting. Quarterly Journal of the Royal Meteorological Society https://doi.org/10.1002/qj.4933 
