Discussion on Causality

Machine Learning aims, under an assumed probability distribution over certain variables, to filter meaningful detail out of data and uncover features. These features can then be used to classify data or make predictions.

The “Learning” is based on statistical inference: data is collected, probability distributions are developed and refined, and the structures that link the variables together are worked out.

Judea Pearl builds on what he calls Causal analysis. There is a mechanism by which nature assigns us data. We assume a probability distribution and posit relationships between the different variables; this is called a Causal Model. The model allows us to predict two things:

  1. The relationship between the variables given the underlying distributions
  2. How the probability distributions of the different variables affect each other

In this schema, there are two levels of inference: on causal mechanisms and on probability distributions. Given this schema, we can inquire in many ways: we can ask questions about observations, interventions, and counterfactuals. Pearl claims that the Causal schema’s real strength lies in how it deals with counterfactual questions.

Pearl contrasts his position against the position of those who rely on pure statistical inference. He likens that position to the subjects of Plato’s Allegory of the Cave. In the Allegory, an entire community is chained in a cave. They are restricted in movement. They see their shadows on the wall opposite the entrance. Their shadows move as they do, so they eventually identify themselves with their shadows. As a result, they lose all understanding of depth, light, and colour.

In the absence of causal models, Machine Learning only works with what can be observed, and loses what cannot be observed – just as the cave-dwellers lose depth, light, and colour.

For a Causal Model, you can develop a graph or decision network in which many causal relationships are mapped out, so that the holistic consequences of an intervention can be traced. The network is Bayesian, with variables that can stand for possible interventions. The model has four types of components:

Endogenous variables
System variables (with a probability distribution)
Background variables (interesting because they can be postulated without being observed)
Functions (how an observed variable relates to the other observed and unobserved variables)

You can then map out the relationships between the variables, and use the underlying probability distribution to create an induced probability distribution on the endogenous variables. If one variable depends on another, you write y = beta*x + alpha + Noise, where the noise term captures error or uncertainty.
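The structural equation above can be sketched in a few lines of code. This is a minimal illustration, not Pearl’s own notation: the coefficient values, the Gaussian noise, and the sample size are all assumptions chosen for the example.

```python
import random

# Illustrative structural equation model: y = beta*x + alpha + noise.
# ALPHA and BETA are assumed values, not from the discussion.
ALPHA, BETA = 1.0, 2.0

def sample_model():
    x = random.gauss(0.0, 1.0)          # background/exogenous variable
    noise = random.gauss(0.0, 0.5)      # error or uncertainty term
    y = BETA * x + ALPHA + noise        # structural equation for y
    return x, y

# The induced distribution on the endogenous variable y can be
# examined by drawing many samples.
random.seed(0)
samples = [sample_model() for _ in range(10_000)]
mean_y = sum(y for _, y in samples) / len(samples)  # ≈ ALPHA, since E[x] = 0
```

Drawing samples like this is how the “induced distribution on endogenous variables” mentioned above becomes concrete: the distribution of y is fully determined by the background distributions and the structural equation.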

How do you work with Counterfactuals?

Counterfactual situations allow us to consider situations that have not occurred. Using the schema, we can see how one variable would affect another in different situations. You can take the model, with all the causal relationships in the network, replace certain elements, and create what Pearl calls a “mutilated” model. Counterfactuals can be studied through model restrictions or expansions; Pearl’s term “mutilated” reflects that he has primarily studied restrictions, but the theory has developed from there. He states two fundamental principles of counterfactuals:

  1. Law of Structural Counterfactual
  2. Law of Structural Independence
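The “mutilated” model can be illustrated concretely: an intervention do(x = x0) replaces the structural equation for x with the constant x0 while leaving every other mechanism intact. A minimal sketch, in which the two-variable model and its coefficients are hypothetical:

```python
import random

# Hypothetical two-variable model: x -> y.
def run_model(do_x=None):
    """Sample (x, y); if do_x is given, the equation for x is
    replaced by the constant do_x (a 'mutilated' model)."""
    u_x = random.gauss(0.0, 1.0)
    x = u_x if do_x is None else do_x   # intervention cuts x's own mechanism
    u_y = random.gauss(0.0, 0.5)
    y = 2.0 * x + u_y                   # y's mechanism is left intact
    return x, y

random.seed(1)
# Observational: x varies with its background cause.
obs = [run_model() for _ in range(5_000)]
mean_y_obs = sum(y for _, y in obs) / len(obs)      # ≈ 0.0, x averages out
# Interventional: do(x = 1) fixes x regardless of its background cause.
intv = [run_model(do_x=1.0) for _ in range(5_000)]
mean_y_do1 = sum(y for _, y in intv) / len(intv)    # ≈ 2.0
```

The intervention answers “what would y be if we set x to 1?”, a question the observational samples alone cannot settle once confounding is possible.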

The tight influence between variables characterizes a causal relationship. The absence of a direct arrow in the causal graph implies independence: not statistical independence, but independence conditional on a separating factor. Observed and background variables can be used to summarize millions of background processes, and by developing these processes the model can learn.

By arranging the model this way, one can see the consequences of actions in the model. The model can develop variables and probability distributions, and an action can be simulated. All of this requires causal analysis. Using assumptions, one can look at implications, specifically testable implications. The Causal framework allows one to relate various actions in a complicated network of variables. This approach places a statistical overlay on a model-based approach (with goodness of fit, testable implications, etc.).

He develops a causal calculus (the Do-calculus) based on three fundamental rules: Ignoring Observations, Action/Observation Exchange, and Ignoring Interventions.
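One identity the Do-calculus licenses is the backdoor adjustment: if z blocks every backdoor path from x to y, then P(y | do(x)) = Σ_z P(y | x, z) P(z). A toy sketch, in which the graph z → x, z → y, x → y and all probability values are hypothetical:

```python
# Backdoor adjustment on a toy discrete model. z is a confounder:
# z -> x, z -> y, x -> y. All numbers are illustrative assumptions.

P_Z = {0: 0.5, 1: 0.5}                          # P(z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.7}                 # P(x=1 | z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,      # P(y=1 | x, z)
                 (0, 1): 0.5, (1, 1): 0.8}

def p_y1_do_x(x):
    """Interventional P(y=1 | do(X=x)) via backdoor adjustment over z."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)

def p_y1_given_x(x):
    """Observational P(y=1 | X=x), which mixes in the confounding."""
    p_x_z = {z: (P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]) * P_Z[z]
             for z in P_Z}
    total = sum(p_x_z.values())
    return sum(P_Y1_GIVEN_XZ[(x, z)] * p_x_z[z] / total for z in P_Z)

# p_y1_do_x(1)   = 0.3*0.5 + 0.8*0.5 = 0.55
# p_y1_given_x(1) = (0.3*0.1 + 0.8*0.35) / 0.45 ≈ 0.689
```

Note how the interventional quantity differs from the plain conditional P(y | x): conditioning weights z by P(z | x) and so lets the confounding leak in, whereas the intervention weights z by P(z).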

External validity in causal models:
How much can you extrapolate or generalize?
Pearl says that the causal assumptions determine how much you can take from the model. The strength of causal models is that they can support counterfactuals.

E.g.: what is the effect of educating a person on their income? The causal model you develop – a world view – will determine what the effect is.

Incorrect causal assumptions will lead one astray, but the assumptions can be tuned to improve the model.

All you have is data and inputs, but altering the causal story changes the probability distributions, leading to a different causal graph. You can find the factors that create differences, which can complicate the model and make it richer.

There are theorems that relate the Do-calculus to the transport of results to the general population.

Missing data:
How does this approach deal with missing data? Pearl argues that causality is fundamental to inferring missing data. To justify this, Pearl makes use of a theorem stating that there is no universal algorithm to recover missing data. Without investigating the model, one cannot infer lost data, so causal inference is essential in recovering it.
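The point can be illustrated by contrasting missingness mechanisms: when data go missing completely at random, the observed mean recovers the true mean, but when missingness depends on the value itself, no value-blind procedure can undo the bias – you have to model the mechanism. A sketch with assumed numbers:

```python
import random

# Illustrative values: the distribution and missingness probabilities
# are assumptions chosen for the example.
random.seed(2)
values = [random.gauss(10.0, 2.0) for _ in range(20_000)]

# MCAR: each value is observed with a fixed probability 0.5,
# independent of the value itself.
mcar = [v for v in values if random.random() < 0.5]

# MNAR: large values are much more likely to be observed (the
# mechanism depends on the value), so the observed sample is skewed.
mnar = [v for v in values if random.random() < (0.9 if v > 10.0 else 0.1)]

true_mean = sum(values) / len(values)   # ≈ 10
mcar_mean = sum(mcar) / len(mcar)       # ≈ 10: naive averaging works
mnar_mean = sum(mnar) / len(mnar)       # well above 10: naive averaging fails
```

The data alone cannot tell the two cases apart; only an assumption about the causal mechanism of missingness says whether the naive estimate is recoverable.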


I have not yet properly completed Pearl’s Book of Why and its explanation of causality.
However, there is a thing we need to be clear about when it comes to classes of non-rule-based algorithms, i.e., machine learning or statistical-inference and relationship-probability based algorithms. These algorithms, when inspected and audited for transparency at the code level, will possibly not make sense to anyone.

This is where the fear of code transparency comes into the picture, and where many Free Computing (data, information, software, hardware) movements come to a stall and find themselves strangled.

However, the methods, models, and rules of how the network is laid out, how it is fed with data, what data, the evaluation methods, testing against reality, etc. are all based on rules that are understandable by mathematicians, statisticians, programmers, and designers in general. So even if one cannot go and see how the machine works while it works (even if one did, it would not make sense and would appear psychedelic), or cannot see the world through its eyes and senses, the construction of these machines can be put to question.

In a way, these statistics-driven learning algorithms are governed by programming models that make sense. Example: a GAN network.

In a social sense, I have recently learned that the bias in these algorithms (which fall prey to the inherent bias of society that is reflected in data and eventually manifested in algorithmic form) can be questioned on:

  • Should the algorithms be just reflective?
  • Should the algorithms be nudged to be inclusive (boosting representation of the under-represented): representational equality, losing the reflective character?
  • Should the algorithms be nudged to answer what we aspire to for the future (a more egalitarian future)? (Again, whose imagination of the future will be a debate.)

I will reflect more on Pearl’s causality once I finish it thoroughly, following @myprostheticsoul comments.

Let’s keep this going… interesting, and learning more through discussion.

Pearl gives a good description of the philosophical history of causality and the application to AI systems. From the book, I am getting a better understanding about how causality becomes mathematized and hence automated. The debates are sophisticated, but they can be tracked as technologies evolve.

Three dimensions we might need to look into to get a supplementary perspective:

  1. How does causality enter into economics, and through economics, policy and planning?
  2. How do causality and statistical approaches cement behavioral psychology, and through behavioral psychology, infiltrate data capitalism?
  3. What role do counterfactuals play in measurements of human beings, especially in medicine, management of labour, and assessment of credit?

Dear All,
Looking for suggestions for Readings in the History of Causality and debates around it in grounded cases of national planning histories, and in the institutionalization of sciences, universities etc. Suggestions are welcome.

This is a summary of the first two parts of the second lecture on causality, presented on 22-1-21. Part three will be posted subsequently.

This lecture focused more on a Marxist understanding of Causality, as developed by Althusser in Reading Capital, which can be accessed with this link.

Reading Capital by Louis Althusser 1968 (marxists.org)

Part II, Chapter 9 discusses Marx’s theoretical revolution. Marx develops his causality in response to two prior understandings:

  1. Mechanistic Causality: Here, an effect is associated with a cause. There are causal sequences, where explanations can be developed by analyzing the sequences of causes and effects. This is limited as it does not reflect the totality of events.

  2. Expressive Causality: Originally from Leibniz, expressive causality looks at how the parts express aspects of the whole.

Marx moves beyond the classical economists by looking at the totality in his causal explanations. Individual elements express aspects of the totality. For example, he argues that it is possible to study the full aspect of bourgeois society by studying the commodity form. For a logician, this can be studied as compositionality. Althusser points out that the Marxist view expresses a structural theory of causality. The elements of the whole cannot be made exterior to the whole, as in mechanistic causality. Nor are the elements mere expressions of the whole, as in “expressive” causality. The whole has a reciprocal relationship with the elements, which are oriented towards the whole.

The whole reflects a specific combination of the elements. The whole is an absent cause in the elements of the whole; it is present in a reciprocal way. Marx uses the term Darstellung, or representation, to explain this relationship. The whole and the parts are inseparable. This model of causality forces us to think of complex relations: the relations do not exist in isolation, and the relations are themselves related to the things they relate. This creates a problem where we have to look at extremely complex relations. Althusser calls this problem “Conjuncture.” Among the problems of conjuncture are the problem of the prior conjuncture and the problem of transition, among other things.

All social phenomena exist relationally, but they do not exist equally. All social relations depend on dominance. There is a hierarchy in effectivity, with certain maximal elements which dominate over others. There are patterns and structures to study.

Marx declares that economic relations are decisive in the last instance. The economy is not the sole determinant, but the driving force. In the social structure, the determinants of the structure can be different and related to each other; hence the relations are themselves related to each other. They exist at distinct and autonomous levels.

Relating these structures to the causal mechanisms described by Pearl:

In this tradition, there are two broad streams:

  1. Causal: where explanations come from the outside, and are often used in scientific traditions

  2. Hermeneutic: where explanations come from inside

The causal explanation would rely on gathering data and trying to look at the relations between variables. In this discussion, we will focus on causal mechanisms.

Using statistical methods, you can find associations between variables. A causal structure looks at how changes at points of independence (inputs) result in changes at points of dependence (outputs). What Pearl describes is a variation of Mechanistic Causality, except that in place of a chain of cause-and-effect relations there is a graph or network of variables that influence each other. Causality is imposed on the relationships between the variables. Marxist theories would say the causality is not imposed on the structure but is identical to it. In policy and economics, understandings of causes often become drivers of causes themselves: decisions to participate in certain kinds of behaviours can signal to others to participate in those behaviours.

In causal analysis, Marxist theories work to contextualize the causal mechanism: the whole and the parts should define each other. This leads to a question without a clear answer: how do you decide which factors to focus on?

This is a summary of the 29 January session on Causality. Kishor presented on the institutional and academic practice of Causality in the fields of Market Research, Economics and Medicine, mostly focusing on the American part of the story.

(Note: Scattered source material)

Locate the discussion of causality in 3 specific disciplines with political dimensions – Market Research, Economics and Medicine

Other disciplines where similar discussions could also be located – Law, Psychology, Cognitive Science, Social Sciences

Each topic was discussed in two different ways: (I) Things that were happening historically within each of the disciplines, within the institutional perspectives and (II) Things that were happening on the outside of the disciplines, with an impact on things within the disciplines.

Overview of the philosophy of Causality

Aristotle, David Hume, John Stuart Mill were 3 key philosophers who laid down the basic philosophical framework of causality.

(a) Aristotle was asking the question: “What causes things to be?”

Four types of causes – Material, Formal, Efficient and Final (Teleological). Most of modern science is based on the Efficient cause and the Teleological cause. Efficient cause is about what moves something, or what makes something happen. Teleological cause is about the end towards which something is going. Post the Darwinian revolution, modern sciences have been largely concerned with Teleological questions – such as how we got to where we are today, through a model of history.

(b) Hume has a different model.

3 basic principles that have been very influential in modern science: “A causes B” if (i) there is contiguity between A and B, (ii) inseparability – if A happens, B must necessarily happen, and (iii) temporal ordering – A has to happen before B. In the ‘Book of Why’, Judea Pearl talks about how Hume’s model sets the stage for his work on causality.

(c) Mill’s intuitive methods of typifying causality by going through various situations, as laid down in his book ‘A System of Logic’.

Overall, figures like Hume and Mill have been highly influential in how methods from the natural sciences got adapted to what are called the ‘human sciences’. But we need to be careful against a philosophical over-determination on these grounds, since a lot of the actual disciplinary work around causality in the human sciences is also influenced by common sense and ordinary language. In this sense, the disciplinary debates are best understood as guidelines that practitioners have been following, rather than strict philosophical principles.

Market Research

Begins as an offshoot of psychology starting from the 1920s. Focused mainly on survey work, quantitative correlations, often quite simplistic statistical methods – aspiring to be ‘objective’ and ‘scientific’. Post WWII, there seems to be a shift towards more qualitative methods – possibly because of new media technologies like the television. By the 1980s, advertising was seen as a discipline relying as much on quantitative methods as on qualitative ones. From the 2000s, powered by computers, there is more and more focus on complicated number crunching, network mappings, and real-time data analytics. This has taken us away from the regime of causality and, as a reaction, brought back the question of causality – in comparison with mere association mappings: ‘Why are things associated with one another?’ Of course, all this is going on in conversation with social science research. The cruder quantitative methods of the early 20th Century coincide with structuralist paradigms in sociology; the move to the qualitative coincides with the micro-turns in American social sciences, focused on questions of ‘who is doing the research’, ‘who is being studied or observed’, ‘who is the consumer’, etc. This was also affected by cross-hirings between market research firms and social science researchers.


  1. Not sure of the shift from quantitative to qualitative. There was a shift from the macro to the micro, but the micro also seems heavily quantitative, reliant on constant measurement. The neoliberal micro research also seems to have its own causal regime – something like ‘individual greed causes prosperity’.

  2. What is the parallel story in India, particularly with regard to the cross-breeding between Indian social sciences and Indian market research?


Economics

Post WWII, 3 schools of economics came up in Western academia – MIT, Yale and Chicago. A key moment for the Yale school was the Cowles Commission – a movement to mathematize economic theory. While Keynes developed a theory that was amenable to mathematics, Cowles went much further. One of their distinctions from the Chicago school was their emphasis on planning. Cowles was seen largely as the school that would triumph over the Keynesians, but eventually the Chicago school supplanted them both.

The Cowles Commission published monographs that dealt with the theory of causality. This was in connection with the Cold War-era rush towards science and technology research on both sides of the world – leading in turn to mathematizing economics as a ‘science’. For example, ‘Process Analytic approach’, ‘Structural Models approach’. The monographs published by the Cowles commission – such as those dealing with dependent and independent variables, noise, uncertainty, proofs, theorems, natural experiments, etc – are connected with some of what we discussed in the earlier sessions. Kenneth Arrow was a part of this project. This school laid the theoretical groundwork for economics as the discipline that it is today.


  1. The other key part of the story is the Developmental school of people like Joan Robinson and others, who posited themselves against the Chicago school. The equilibrium style of economics – as espoused by the Chicago school, and other Israeli economists – is one of multiple ways of doing economics. But there have been people who opposed it. Developmental economics – at least parts of it – was for instance in opposition to the equilibrium style, and in fact also, to an extent, to the mathematization of economics. Kenneth Arrow, the Yale school, and some parts of the MIT school were somewhere in between: they were opposed to equilibrium theory and instead held up the Keynesian principles based on planning – but in doing so, they relied heavily on quantification and the mathematization of the discipline. We should thus refrain from equating the mathematical approach with equilibrium theory.

  2. Maybe one way to categorize these different schools is to classify them in terms of a unidirectional, linear kind of causality, as opposed to a dialectical causality – the kind that Marx talked about – rather than focusing on things like the extent of mathematization, or the take on equilibrium theory, and so on. For instance, one core aspect that would then have to be looked at is the idea of ‘contradictions’, and how these different schools dealt with the concept of contradictions as a major driving force of economic logic. One way to look at mathematization attempts could then be whether the mathematization is being done in order to properly describe those fundamental contradictions, or to somehow mathematically resolve them.


Medicine

When we talked to practitioners in the field, the question of causality seems to come up in 2 different ways: (a) evidence-based research, and (b) big-data driven approaches, specifically when it comes to personalized medicine for instance.

Evidence-based approach: When a doctor practices medicine, they have to rely as much on scientific data (coming from scans, test reports, and so on) as on their own intuition and experience rooted in the local context. But post-1970s, partly because of the rise of health management systems such as insurance companies, private nursing homes, etc., legal and regulatory pressures manifested in a push towards medical research that was ‘evidence based’, as opposed to the more intuitionist approaches. This is, for example, where the Randomized Controlled Trials (RCT) system becomes strong. RCTs become an easy, ‘theory-neutral’ way to present one’s case in a scientific sense, without having to go through the extremely complicated and less understood biochemical causal pathways. Though such trial methods date back as far as the 17th Century, post-1970 we see a spike in their importance.
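The sense in which an RCT is ‘theory-neutral’ can be sketched in simulation: randomization severs the link between hidden patient factors and treatment assignment, so a simple difference in means recovers the effect without modelling the biochemical pathway. All numbers below are illustrative assumptions:

```python
import random

# Hypothetical setup: a hidden health factor h drives both
# treatment-seeking and recovery (confounding).
random.seed(3)
TRUE_EFFECT = 1.0

def outcome(treated, h):
    return 2.0 * h + TRUE_EFFECT * treated + random.gauss(0.0, 0.5)

def diff_in_means(data):
    t = [y for tr, y in data if tr]
    c = [y for tr, y in data if not tr]
    return sum(t) / len(t) - sum(c) / len(c)

# Observational study: healthier patients (high h) seek treatment more.
obs = []
for _ in range(20_000):
    h = random.gauss(0.0, 1.0)
    treated = random.random() < (0.8 if h > 0 else 0.2)
    obs.append((treated, outcome(treated, h)))
naive = diff_in_means(obs)         # well above TRUE_EFFECT (confounded)

# RCT: a coin flip assigns treatment, independent of h.
rct = []
for _ in range(20_000):
    h = random.gauss(0.0, 1.0)
    treated = random.random() < 0.5
    rct.append((treated, outcome(treated, h)))
rct_estimate = diff_in_means(rct)  # close to TRUE_EFFECT
```

The naive observational comparison conflates the treatment effect with the hidden factor; the randomized comparison does not, which is exactly why RCTs can bypass the causal-pathway debate.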

Big data driven approach: This is a much more recent trend, such as in what is called as ‘metabolomics’ - “a systematic study of the unique chemical fingerprints that specific cellular processes leave behind”. Particularly since 2016-17, it has been argued that these big-data driven statistical measurements of the biochemical transactions in the body, when superposed with biochemical causal pathway diagrams – leads to a far better understanding about how the human body works. The market side of this story is the market-push towards personalized medicine, the technologies for which are currently very expensive.


  1. The point about legal pressures pushing disciplines like medicine or economics into certain kinds of ‘evidence-based sciences’ is interesting, and perhaps deserves more detailed inquiry – in particular, looking at the legal theater as a space where such social norms around causality, evidence, scientific rationality, rational ethics, etc. get engineered.

  2. Two disciplines that kept coming up in the course of preparing this presentation were psychology (how people’s minds make causal associations) and law (where it’s very anthropological – there is no legal definition of causality, and how causality gets established in a court varies from judge to judge). There is no clear resolution of the question of causality in legal terms – which is why often the key question in criminal proceedings is not whether someone’s actions ‘caused’ some damage, but whether the action was ‘irresponsible’. In other words, legality often has to convert questions of causality into questions of ethical norms. When it comes to causality in the legal domain, there are distinct principles that are often in conflict with each other – for example, whether it was the action that caused something or the mental decision behind that action, or between proximal causes and ultimate causes, and so on.

  3. But in general, fields such as technology, law, etc. seem to have a lot to do with what’s going on in these other fields such as medicine, market research, economics, etc. One example is perhaps the two different approaches to causality – inferential and a priori. Could this be a conflict between two classes – albeit small classes, such as the scientist versus the person who hires the scientist? This is in a similar spirit to Marx’s Paris manuscripts, where he documents the conflict between the physiocrats and the monetarists on the question of land – the former looked at land as a distinct form of wealth, while the latter looked at it as just another form of capital – a conflict that was basically the class conflict between the feudal classes and the emerging capitalist classes. Can we say something similar about the conflict between the inferential and the a priori? For instance, scientists tend to operate with some kind of ideological predisposition on the question of what knowledge is – ‘How does the world work? Why are things happening the way they are?’ This looks like a class position. But the people who hire them often don’t care about such questions – which is yet another class position.

  4. We need to bring back the ‘political’ in this discussion of causality, which is where our earlier presentations on ‘structural causality’ become important. The way Althusser talks about ‘structure’, and how that ‘structure’ warps and informs the very questions we ask regarding causal relations – in other words, the way the structure guides the underlying causal arguments – is worth keeping in mind. For example, when in economics they say ‘hierarchies are inevitable’, that opens up a whole paradigm of causal structures. And from this perspective, it is not clear what can be fundamentally new about these big-data-based feedback mechanisms, because what these inferential arguments do is ignore structures – that is part of the political agenda itself. But if structures dictate causal paradigms, then more and more data is not going to effect any fundamental shift in such paradigms. This means we also have to be careful with ethical arguments: we perhaps need to move from ethical arguments towards the discussion of the ‘political material structure’, and not just the so-called ‘ethical structure’.

  5. Doesn’t causality in the positivist sense reinforce structures of inequality? It does, but it depends on what we mean by ‘positivist’. For example, this inference-based formal epistemology depoliticizes the debate, and in effect perpetuates inequality – see, for example, the complete depoliticization of the Netherlands’ ethics course for judges.

References for the above discussion:

On Market Research:
Bailey, L.F. “The origin and success of qualitative research.” International Journal of Market Research 56.2 (2014): 167-184.
Boddy, Clive Roland. “Causality in qualitative market and social research.” Qualitative Market Research: An International Journal (2019).
Sarstedt, Marko, and Erik Mooi. “The Market Research Process.” A Concise Guide to Market Research. Springer, Berlin, Heidelberg, 2019. 11-24.
Oman, Susan, and Mark Taylor. “Subjective well-being in cultural advocacy: A politics of research between the market and the academy.” Journal of Cultural Economy 11.3 (2018): 225-243.

On Economics:
Hoover, Kevin D. “Causality in economics and econometrics.” The new Palgrave dictionary of economics 2 (2008).
Erickson, Paul. “Mathematical models, rational choice, and the search for Cold War culture.” Isis 101.2 (2010): 386-392.
Van Horn, Robert, and Matthias Klaes. “Chicago neoliberalism versus Cowles planning: Perspectives on patents and public goods in Cold War Economic Thought.” Journal of the History of the Behavioral Sciences 47.3 (2011): 302-321.

On Medicine:
Gillies, Donald. Causality, Probability, and Medicine. Routledge, 2018.
Rodwin, Marc A. “Commentary: The politics of evidence-based medicine.” Journal of Health Politics, Policy and Law 26.2 (2001): 439-446.
Rosoff, Arnold J. “Evidence-based medicine and the law: the courts confront clinical practice guidelines.” Journal of Health Politics, Policy and Law 26.2 (2001): 327-368.

From my learning in complexity theory, the whole is always more than the sum of its components/elements. There is a similar reflection in the above summary. Relationships matter, in both the variables and the context. The “context”, sometimes called the “environment”, is that within which the variables/elements interact through some kind of relationship. This environment also becomes a variable that influences the elements, either as a boundary phenomenon or, if it is active, as one of the variables. There is a similar explanation of complexity in terms of the Macro–Micro view too.