By: Prof Sarah Dance (Professor of Data Assimilation)
Last week, on 25th February 2025, our colleagues at ECMWF (European Centre for Medium-Range Weather Forecasts) took their deep-learning-based global weather forecasting system, known as the AIFS, into operational production, running alongside their physics-based numerical weather prediction system. The AIFS outperforms state-of-the-art physics-based models on several measures of accuracy, and is computationally much faster to run. However, both systems currently need to be initialized with a best estimate of the current state of the atmosphere, created using a process called data assimilation.
Tens of millions of atmospheric observations are used in weather prediction every day, but observations alone cannot describe the weather at all points on the globe. To get a complete picture we use data assimilation to optimally combine the observations with information from a physics-based computer model, taking account of our confidence in each source of data. The resulting analysis is used as a starting point for both physics-based and machine-learning-based weather forecasts. Reanalysis, which applies data assimilation to historical observations, is also a key component of the training data for machine learning forecasting systems.
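To make the idea of weighting each source by our confidence in it concrete, here is a minimal sketch for a single quantity, assuming one model value (the "background") and one observation with independent, unbiased errors of known variance. Operational systems solve a vastly larger version of this problem; the function name `combine` and all of the numbers are illustrative assumptions, not taken from any real system.

```python
def combine(background, obs, var_b, var_o):
    """Variance-weighted combination of a model background and one observation.

    Toy scalar example: var_b and var_o are the assumed error variances
    of the background and the observation.
    """
    # Weight given to the observation: close to 1 when the background is
    # much less certain than the observation, close to 0 in the opposite case.
    gain = var_b / (var_b + var_o)
    analysis = background + gain * (obs - background)
    # Error variance of the combined estimate.
    var_a = (1.0 - gain) * var_b
    return analysis, var_a


# Illustrative numbers only: a model temperature of 280.0 K with error
# variance 4.0 K^2, and an observation of 282.0 K with error variance 1.0 K^2.
analysis, var_a = combine(280.0, 282.0, 4.0, 1.0)
print(analysis, var_a)
```

With these made-up numbers the observation is trusted more than the model, so the analysis lands nearer the observation (about 281.6 K), and its error variance (about 0.8 K^2) is smaller than that of either input.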

Figure: Schematic of the global observing system from the World Meteorological Organization https://community.wmo.int/en/observation-components-global-observing-system
In recent decades, steady improvements in global numerical weather prediction accuracy have been driven by enhancements to data assimilation methods and by growing volumes of assimilated observations (e.g., Bauer et al., 2015). However, new observations are expensive, with new satellites costing billions of pounds. Thus, investments in such systems are evidence-driven. To inform these financial decisions, data assimilation scientists have traditionally carried out quantitative experiments to estimate the impact of new observations on improving forecasts with physics-based numerical weather prediction systems (e.g., Hu et al., 2025).
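One common style of experiment withholds an observation type and measures how much accuracy is lost. The toy sketch below illustrates that idea for a single quantity, assuming independent observation errors with known variances. It compares analysis error variances rather than forecast scores, whereas real experiments run a full forecasting system with and without the observations. The observation names, the variances and the helper functions are all hypothetical.

```python
def analysis_variance(var_b, obs_vars):
    """Error variance after combining a background with independent
    observations of the same quantity (precisions add)."""
    precision = 1.0 / var_b + sum(1.0 / v for v in obs_vars)
    return 1.0 / precision


def denial_impact(var_b, obs_vars):
    """Increase in analysis error variance when each observation type
    is withheld in turn. Larger values mean a more valuable observation."""
    full = analysis_variance(var_b, list(obs_vars.values()))
    impact = {}
    for name in obs_vars:
        kept = [v for other, v in obs_vars.items() if other != name]
        impact[name] = analysis_variance(var_b, kept) - full
    return impact


# Hypothetical error variances, chosen only for illustration.
obs_vars = {"satellite": 1.0, "radiosonde": 0.25}
impact = denial_impact(4.0, obs_vars)
for name, value in impact.items():
    print(name, round(value, 3))
```

In this made-up setup, withholding the more accurate observation degrades the analysis far more than withholding the less accurate one, which is the kind of ranking such experiments are designed to expose.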
In the age of machine-learning forecasting, the connection between each observation type and forecast accuracy is less transparent. It is not yet clear how to measure the impact of specific observations on the training of machine learning models, or on their initialization. This is a critical question to address if we are to ensure continued improvement in forecast accuracy and better resilience to hazardous weather events.
References
Bauer, P., Thorpe, A. & Brunet, G. The quiet revolution of numerical weather prediction. Nature 525, 47–55 (2015). https://doi.org/10.1038/nature14956
Hu, G., Dance, S.L., Fowler, A., Simonin, D., Waller, J., Auligne, T., et al. On methods for assessment of the value of observations in convection-permitting data assimilation and numerical weather forecasting. Quarterly Journal of the Royal Meteorological Society (2025). https://doi.org/10.1002/qj.4933