Building a predictive framework for studying causality in complex systems

By: Nachiketa Chakraborty

I’m Nachiketa Chakraborty, a postdoctoral researcher working on the ERC project CUNDA (Causality under Non-linear Data Assimilation) led by Peter Jan van Leeuwen. My central goal is to develop a Bayesian framework for studying causal relations within complex systems that are high-dimensional and have non-linear, coupled processes. My background is in astrophysics, with a special interest in using mathematical techniques (like time-series analyses) to study the origin of variability in emissions from astrophysical sources with black holes at their center [1,2,3]. It is fascinating to note that these methods are in fact highly relevant to building this causal framework for processes in the Earth sciences, such as the inter-ocean exchange problem between the Indian, the South Atlantic and the Southern Oceans – a critical ingredient of the climate system [4].

The study of cause and effect is central to every science. It is what allows us to identify the processes that give rise to the phenomena we observe. For instance, it is the force of gravity that causes objects in the air to fall back towards the Earth. In the geosciences, this relationship between causes and effects plays out in more complex ways. For one, there are multiple competing causes for every effect. Furthermore, the relationship is highly nonlinear; in other words, strengthening the causal process does not simply change the observed effect in proportion. Finally, the different processes are interdependent. Therefore, it is not possible to decouple different effects and study them independently, building up our understanding from simpler models.

All these aspects come into play in the longstanding inter-ocean exchange problem in oceanography. We wish to know whether exchange between the Indian, the South Atlantic and the Southern Oceans is caused predominantly by local effects (namely, the highly turbulent eddying south of the African continent), or is in fact due to global-scale dynamics (such as large-scale modes of the Southern Ocean). What is currently known is that processes ranging from the very local to the very global are plausible drivers. Disentangling these requires developing more sophisticated methods that build on existing ones [5].

For someone like me, who comes from a physics background and uses mathematical techniques to analyse data, causality is a perfect area in which to blend the two. In both mathematics and physics, causality is a key concept. Within physics, the concept of causality rests on time ordering and on the existence of a plausible mechanism that produces the effects we observe. This is quantified through equations which can be used to predict observations.

Mathematically, causality as a concept has many components. First, we need a set of conditions defining the presence of a causal connection, that is, of cause and effect. The most popular way of using data to do this is by establishing statistical dependencies between the observed variables. This brings us to the next key mathematical aspect of causality: the measure that quantifies causation. Shannon’s information theory measures are the most popular and best studied of these. This is the strategy used to establish causality in areas where we do not have known plausible mechanisms (such as econometrics) or where the system is too complex for one to be written down (as in biological systems). Finally, in order to properly quantify causality within the processes of a system, we need to provide the uncertainty on the estimates. To do that, we wish to adopt a Bayesian approach to causality. By combining the physical and the mathematical, data-driven approaches, we hope to build towards solutions for complex systems.
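
To make the idea of an information-theoretic measure of causation a little more concrete, here is a minimal sketch in Python. It uses transfer entropy, one common Shannon-based measure, purely as an illustration; this choice, the toy data, and all names in the code (`transfer_entropy`, `x`, `y`) are assumptions made for this example and not necessarily what is used in CUNDA. The sketch estimates the measure from simple counts on two binary time series in which `x` drives `y` with a one-step delay, and it gives only a point estimate, without the uncertainty that a Bayesian treatment would add.

```python
from collections import Counter
from math import log2
import random


def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of transfer entropy source -> target.

    Uses a history length of one time step for both series:
    TE = sum p(y_next, y_prev, x_prev)
             * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    """
    # Each triple is (target at t+1, target at t, source at t).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)

    c_full = Counter(triples)
    c_prev = Counter((yp, xp) for _, yp, xp in triples)
    c_self = Counter((yn, yp) for yn, yp, _ in triples)
    c_yprev = Counter(yp for _, yp, _ in triples)

    te = 0.0
    for (yn, yp, xp), count in c_full.items():
        p_given_both = count / c_prev[(yp, xp)]
        p_given_self = c_self[(yn, yp)] / c_yprev[yp]
        te += (count / n) * log2(p_given_both / p_given_self)
    return te


# Toy data (hypothetical): x is a fair coin; y copies the previous
# value of x, but is flipped 10% of the time.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + [xi if rng.random() < 0.9 else 1 - xi for xi in x[:-1]]

te_xy = transfer_entropy(x, y)  # information flowing from x to y
te_yx = transfer_entropy(y, x)  # information flowing from y to x

print(f"TE x -> y: {te_xy:.3f} bits")
print(f"TE y -> x: {te_yx:.3f} bits")
```

In this toy setup the estimate from `x` to `y` should come out clearly larger than the one from `y` to `x`, which should be close to zero, reflecting the direction of the coupling that was built into the data. Real geophysical series are continuous, noisy and high-dimensional, so simple counting like this would not carry over directly.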


[1] Ait Benkhali, F., W. Hofmann, F. M. Rieger, and N. Chakraborty, 2019: Evaluating Quasi-Periodic Variations in the γ-ray Lightcurves of Fermi-LAT Blazars, Astron. Astrophys.

[2] Morris, P., N. Chakraborty, and G. Cotter, 2017: Deviations from normal distributions in artificial and real time series: A false positive prescription, Mon. Not. R. Astron. Soc., 2, 2117–2129.

[3] Chakraborty, N., 2020: Investigating Multiwavelength Lognormality with Simulations: Case of Mrk 421, Galaxies, 8(1), 7.

[4] de Ruijter, W. P. M., A. Biastoch, S. S. Drijfhout, J. R. E. Lutjeharms, R. P. Matano, T. Pichevin, P. J. van Leeuwen, and W. Weijer, 1999: Indian‐Atlantic interocean exchange: Dynamics, estimation and impact, J. Geophys. Res., 104, C9, 20885–20910, doi:10.1029/1998JC900099.

[5] Amblard, P.-O., and O. Michel, 2012: The Relation between Granger Causality and Directed Information Theory: A Review, Entropy, 15, 113–143, doi:10.3390/e15010113.
