# Data assimilation – Blending models and observations to predict intense rainfall

On sunny days like those we are experiencing right now, it is easy to forget the risk of flooding. However, we all know that sooner or later there will be intense rainfall somewhere and the risk of flooding will increase. As scientists, it is nice to have a break like this to keep thinking about and developing our science so that it becomes useful in the near or far future, once the rain is back. Remember that an increase in summer storms is expected in a warmer, moister environment (for more details see this blog entry).

The prediction of intense rainfall relies heavily on numerical models. These are complex computer codes that solve the equations that describe the motion of air, water vapour, and liquid and ice cloud particles. Numerical models are already very good at forecasting large-scale motion. For example, they can forecast with reasonable accuracy the path and development of a low-pressure system many days in advance. However, numerical models are not as good at forecasting small-scale motion, because doing so requires equations that accurately describe small-scale processes, which at the large scale are just minor details. This is similar to looking at a distant tree with its top full of leaves. At first sight the edge of the tree (large scale) appears static, but when you look closer you realise that the edge exhibits tiny motions as individual leaves (small scale) oscillate in response to the wind pushing them.

One of the most important aspects of the forecasting process is starting the model. One would think that this is just a matter of pressing a computer button, but it is actually much more complicated than that. The first thing we need is to provide the model with an accurate estimate of the state of the atmosphere at the start of the forecast. The model needs to know where to start. To do this we need to take the observations available and blend them with a first guess of the state of the atmosphere. The first guess is usually provided by a previous forecast. The process of blending observations and previous forecasts is called data assimilation.

Part of project FRANC is to develop new data assimilation techniques for the prediction of intense rainfall at small scales, more specifically at convective scales (around 1 km). The blending of observations and model information requires some knowledge of how trustworthy the observations and the model are. In other words, we need to know how big the errors in observations and previous forecasts are. Once again, there has been great improvement in recent years in the estimation of the level of trust we can place in observations and models when applied to the large-scale description of the atmosphere.
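To make the role of the errors concrete, here is a minimal sketch of blending for a single quantity, assuming one first guess, one direct observation of the same quantity, and unbiased, uncorrelated errors with known standard deviations. The function name `blend` and all the numbers are hypothetical, chosen only for illustration; operational systems handle millions of variables at once and are far more elaborate.

```python
def blend(first_guess, observation, sigma_b, sigma_o):
    """Blend a scalar first guess with a scalar observation.

    sigma_b and sigma_o are the (assumed known) error standard deviations
    of the first guess and of the observation.
    """
    var_b = sigma_b ** 2
    var_o = sigma_o ** 2
    # Weight given to the observation: close to 1 when the first guess is
    # much less trustworthy than the observation, close to 0 otherwise.
    gain = var_b / (var_b + var_o)
    analysis = first_guess + gain * (observation - first_guess)
    # Error variance of the blended estimate (smaller than either input).
    var_a = (1.0 - gain) * var_b
    return analysis, var_a


# Hypothetical numbers: a temperature first guess of 20.0 with error 2.0,
# and an observation of 22.0 with error 1.0.
analysis, var_a = blend(first_guess=20.0, observation=22.0,
                        sigma_b=2.0, sigma_o=1.0)
print(analysis, var_a)
```

Because the observation is assumed here to be more trustworthy than the first guess, the blended value lands closer to the observation. Getting the two error sizes wrong would pull the result towards the wrong source of information, which is why estimating them matters so much.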

The improvements in large-scale data assimilation are partly due to the fact that it is relatively easy to find relationships between variables at large scales. For example, vertical motions are very small compared to horizontal motions, so the large-scale motion can be thought of as almost two-dimensional and in balance with pressure. This is not so true for small-scale motion, where the updrafts inside clouds become important. They are not just details to add. Instead, they are an essential part of the phenomena we are interested in. Coming back to the example of the tree, if we were asked to draw the tree we could sketch the main branches and then cover them with leaves at random. The situation changes completely if we are interested in the motion of the leaves themselves, because then we would need to consider more details. For instance, we would need to know how the air flows around the leaves even on a calm day.

The representation of moist processes in data assimilation methods is another big challenge to overcome if we want to improve our ability to forecast intense rainfall. Apart from the lack of simple relationships between moisture variables (water vapour, and liquid and ice cloud particles) and other atmospheric variables, another important problem is that the theory assumes a very specific shape for the probability distribution function of the errors in the first guess. This shape is the very well-known, bell-shaped Gaussian (or normal) distribution. Errors in moisture variables simply refuse to be described by such a simple shape and have come up with all sorts of probability distribution functions.
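One illustrative way in which moisture errors can depart from the bell shape is that moisture variables are bounded. The sketch below is a toy example, not FRANC data: it assumes a relative humidity whose true value is 0.95, draws synthetic first guesses with Gaussian scatter, and then forces them to stay between 0 and 1. The variable names and numbers are made up for illustration.

```python
import numpy as np


def skewness(sample):
    """Sample skewness: zero for a symmetric (e.g. Gaussian) distribution."""
    centred = sample - sample.mean()
    return np.mean(centred ** 3) / np.mean(centred ** 2) ** 1.5


rng = np.random.default_rng(0)
true_rh = 0.95  # hypothetical true relative humidity, close to saturation

# Synthetic first guesses: Gaussian scatter around the truth ...
raw_guess = rng.normal(loc=true_rh, scale=0.1, size=100_000)
# ... but relative humidity is kept within its physical bounds here.
bounded_guess = np.clip(raw_guess, 0.0, 1.0)

gaussian_errors = raw_guess - true_rh
bounded_errors = bounded_guess - true_rh

print("skewness without bounds:", skewness(gaussian_errors))
print("skewness with bounds:   ", skewness(bounded_errors))
```

In this toy setting the unbounded errors are symmetric, while the bounded ones have a long tail on the dry side and a sharp cut-off on the moist side, so a single bell curve no longer describes them well.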

The problems just outlined define the challenges that we will be tackling in project FRANC. These can be summarised in two questions:

• How should we take into account the relationships between errors in the first guess of moisture variables and errors in other atmospheric variables?
• How should we account for the non-Gaussian properties of errors in the first guess of moisture variables?

The answers to these questions should point to improvements that can be incorporated into operational data assimilation systems such as the Met Office system. In this way project FRANC’s efforts will translate into benefits for the public in terms of better forecasting capabilities and, consequently, improved means to prevent and mitigate the catastrophic effects of flooding.

We will address these questions by designing new so-called moisture control variables. Moisture control variables are variables containing the same information as the original variables (in our case water vapour, cloud particles, temperature, pressure and vertical velocity) but with enhanced properties. They can be more Gaussian, for example. The approach consists of using analytic and statistical regressions to relate moisture variables to other variables. The new moisture control variables will then be separated into balanced parts, which are explained by the regressions, and unbalanced parts, whose statistical properties will then be analysed. Once we have a few options for moisture control variables, we will investigate the advantages and disadvantages of implementing and using each option in operational data assimilation systems.
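The split into balanced and unbalanced parts can be sketched with a simple linear regression. The example below uses synthetic errors and only two variables, a temperature error `dT` and a humidity error `dq`, both hypothetical names with made-up statistics; it is a simplified illustration of the idea and not the regression that will actually be used in the project.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Synthetic first-guess errors: humidity errors partly follow temperature
# errors (the 0.6 factor is an arbitrary choice for this illustration).
dT = rng.normal(0.0, 1.0, n)
dq = 0.6 * dT + rng.normal(0.0, 0.5, n)

# Remove the means so the regression needs no intercept.
dT_c = dT - dT.mean()
dq_c = dq - dq.mean()

# Least-squares regression of humidity errors on temperature errors.
slope = np.dot(dT_c, dq_c) / np.dot(dT_c, dT_c)

balanced = slope * dT_c        # part of dq explained by dT
unbalanced = dq_c - balanced   # remainder, to be analysed separately

corr_before = np.corrcoef(dT_c, dq_c)[0, 1]
corr_after = np.corrcoef(dT_c, unbalanced)[0, 1]
print("correlation of dq with dT:        ", corr_before)
print("correlation of unbalanced with dT:", corr_after)
```

By construction, the unbalanced part is uncorrelated with the temperature error, so its statistics can be studied on their own, while the balanced part carries the relationship between the two variables.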

So please keep visiting this blog where we will keep posting summaries of our results from data assimilation and other intense-rainfall-related matters as they come along.

We hope to add some figures to this blog early next week.