The answer isn’t obvious.
To many lay people and scientists alike, it seems clear that one of the primary goals of scientific inquiry is to discover causal relationships in nature. Scientists want to learn the causes of events like epidemics, market crashes, the global increase in average temperature, online radicalization, and the formation of the solar system. Experiments, simulations, data collection, and data analysis are all done with the goal of learning the causes of things. This search for causes, we typically believe, is in keeping with the healthy functioning of science.
Bertrand Russell, one of the founding figures of contemporary analytic philosophy, disagreed. To him, the idea that nature works according to relations of cause and effect was a superstition at odds with scientific progress. In a 1917 paper titled “On the Notion of Cause”, he wrote:
All philosophers, of every school, imagine that causation is one of the fundamental axioms or postulates of science, yet, oddly enough, in advanced sciences such as gravitational astronomy, the word ‘cause’ never occurs…. The law of causality, I believe, like much that passes muster among philosophers, is a relic of a bygone age, surviving like the monarchy, only because it is erroneously supposed to do no harm.
Russell, B. (1917). “On the Notion of Cause.” Ch. IX in Mysticism and Logic and Other Essays. London: Unwin. p. 132.
Besides doing his best to channel Meghan Markle, Russell is also expressing a powerful thought. He was writing at a time of great advances in mathematical physics. Chief among them was Einstein’s theory of general relativity, which is what he is referencing here when he mentions “gravitational astronomy”. What was striking to Russell was that these theories are expressed in perfectly symmetric mathematical equations, where the occurrence of a past event can be derived from the occurrence of a future event, and vice versa. These equations do not tell us anything about any privileged explanatory direction between past and future, or cause and effect. Causation, Russell concluded, simply isn’t a part of mature physics.
But what about the idea that causal notions are “erroneously supposed to do no harm”? Here, I take Russell to be saying that other sciences, e.g. biology, chemistry, economics, or psychology, are harmed by insisting that the theories they generate ought to describe relations of cause and effect. In Russell’s time, these sciences had not reached the level of predictive success enjoyed by physics (and this is arguably still true today). Were they to free themselves of the shackles of causal reasoning, Russell seems to be implying, they might achieve a level of predictive success on par with physics.
A similar idea gets a more modern expression in an excellent 2003 paper by John Norton called “Causation as Folk Science”. Norton, like Russell, argues that causation is unscientific. His argument proceeds as follows:
EITHER conforming a science to cause and effect places a restriction on the factual content of a science; OR it does not. In either case, we face problems that defeat the notion of cause as fundamental to science. In the first horn, we must find some restriction on factual content that can be properly applied to all sciences; but no appropriate restriction is forthcoming. In the second horn, since the imposition of the causal framework makes no difference to the factual content of the sciences, it is revealed as an empty honorific.
Norton, J. (2003). “Causation as Folk Science.” Philosophers’ Imprint. Vol. 3. No. 4. pp. 3-4.
Norton’s argument is simple. If causation is scientific, then the logic of causation should allow us to rule out some theories of how nature works. Mathematics, on this view, is a part of science. This is because any would-be scientific theory that entails a mathematical contradiction ought to be tossed out as a candidate theory of how some system of interest works. (Whether this claim is actually true would stir up a lot of debate among philosophers of both science and mathematics, but we’ll grant for now that mathematics constrains what science can say.) Similarly, our empirical observations also constrain the content of our scientific theories. For example, we repeatedly observe that ice is less dense than liquid water, and so we should reject scientific theories that contradict this fact.
Causation, Norton argues, does not play this role. Nothing about the insistence that nature is governed by causal relations allows us to discard any otherwise-acceptable scientific theory. Thus, causation is a part of “folk science”. Causal claims, he concludes, are acceptable for everyday conversation (e.g. I can say, without being misunderstood, that the drop in temperature overnight caused my windshield to ice over) but they are not a part of the scientific image of the world.
Norton draws an analogy between the idea that nature is governed by causal relations and the idea that heat is a substance, which eighteenth-century scientists believed to be true. We now know that when a room gets hotter, the particles of air in that room begin moving faster, creating the sensation of heat. Heat is not a separate substance over and above whatever already exists in the room. Nevertheless, it is perfectly fine in ordinary conversation to say “I lit the stove and the room filled with heat”, even if such a claim is, strictly speaking, out of keeping with our scientific understanding of heat. Causal claims, Norton argues, have a similar status: fine for ordinary use, but strictly in tension with our best science.
However, Norton’s paper does not do justice to the interdisciplinary attempt to model, in a mathematically rigorous way, the causal structure of systems in nature. This work has its roots in the 1920s writings of the biologist and statistician Sewall Wright, but really kicks off in the 1980s with the work of the statisticians Harri Kiiveri and Terry Speed, and the computer scientists Joseph Halpern and Judea Pearl. In the 1990s, machine learning experts like David Heckerman, Daniel Geiger, and David Chickering began to make important contributions to causal modeling, as did a number of technically-minded philosophers of science, including the Carnegie Mellon team of Peter Spirtes, Clark Glymour and Richard Scheines, and others such as Daniel Hausman, James Woodward, and Christopher Hitchcock. This research program came into maturity with two books published in 2000: Pearl’s Causality, and the second edition of Spirtes, Glymour and Scheines’ Causation, Prediction, and Search.
Although the technical details of a causal model can get mathematically complicated, the basic idea is simple. Causal relations hold between variables, each of which describes a complete set of ways that a system could be (e.g., the variable ‘Gas Connected’ in the causal model above could have two values, 0 and 1, where 0 means that the gas is connected, and 1 means that the gas is not connected). Variables are then related via arrows, which indicate the presence of a causal relationship between two variables. Specifically, if there is a chain of arrows from a variable X to a variable Y, then X is a cause of Y. If there is an arrow directly from X to Y, then X is a direct cause of Y. So, in the graph above, ‘Gas Connected’ is a cause, but not a direct cause, of ‘Meat Cooked’, and ‘Flame’ is both a cause and a direct cause of ‘Meat Cooked’.
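For readers who think in code, the difference between a cause and a direct cause can be sketched in a few lines of Python. The arrows below are reconstructed from the prose rather than copied from the graph, and the two arrows into ‘Flame’ in particular are an assumption. The names PARENTS, is_direct_cause, and is_cause are illustrative only.

```python
# Direct causes ("parents") of each variable in the grill example.
# Assumption: "Flame" is directly caused by "Gas Level" and "Igniter".
PARENTS = {
    "Gas Connected": [],
    "Gas Knob": [],
    "Igniter": [],
    "Meat On": [],
    "Gas Level": ["Gas Connected", "Gas Knob"],
    "Flame": ["Gas Level", "Igniter"],
    "Meat Cooked": ["Flame", "Meat On"],
}


def is_direct_cause(x, y):
    """True if there is an arrow directly from x to y."""
    return x in PARENTS[y]


def is_cause(x, y):
    """True if a chain of one or more arrows leads from x to y."""
    stack = list(PARENTS[y])
    seen = set()
    while stack:
        node = stack.pop()
        if node == x:
            return True
        if node not in seen:
            seen.add(node)
            # Walk backwards along the arrows, from effects to causes.
            stack.extend(PARENTS[node])
    return False


print(is_cause("Gas Connected", "Meat Cooked"))         # True
print(is_direct_cause("Gas Connected", "Meat Cooked"))  # False
print(is_direct_cause("Flame", "Meat Cooked"))          # True
```

Storing each variable's direct causes, rather than its effects, is a design choice that keeps the sketch close to the way the values of variables are determined in a causal model.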
The value of each variable in the model is determined by a function of its direct causes, plus an error term that is independent of the error in any other variables in the graph (note that this is consistent with some variables in the model being entirely determined by their direct causes, with no error). So, in the graph above, whether or not the meat cooks is determined by whether or not the flame is present and the meat is on the grill, plus some amount of random error that is not correlated with anything else in the model. Judea Pearl shows, via fairly simple mathematics, that such a set-up will result in any assignment of probabilities to all possible settings of the model having an important property, provided that the model contains all common causes of two or more variables. That property, known as the Causal Markov Condition, is stated as follows: all variables in a causal model are independent of their non-effects, given their direct causes.
To illustrate, if the graph above satisfies the Causal Markov Condition, then the gas level is independent of whether the igniter is on, once we account for whether the gas is connected and the position of the gas knob. You can identify more independence facts that are entailed by the fact that this graph satisfies the Causal Markov Condition, if you enjoy spending your time that way as much as I do.
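If you would rather let a computer do the checking, here is a minimal sketch under stated assumptions. Every variable is binary and coded so that 1 means “yes”, each error term simply flips the value that the direct causes would otherwise produce, and all of the probabilities are made up for illustration. The arrows into the flame variable are likewise an assumption. Because the sketch enumerates every setting of the model with exact fractions, the independence checks involve no sampling error.

```python
from fractions import Fraction
from itertools import product

# Assumed probability that each root variable or error term equals 1.
# These numbers are illustrative only.
P_ONE = {
    "gas_connected": Fraction(9, 10),
    "gas_knob": Fraction(1, 2),
    "igniter": Fraction(1, 2),
    "meat_on": Fraction(1, 2),
    "err_level": Fraction(1, 10),
    "err_flame": Fraction(1, 10),
    "err_cooked": Fraction(1, 10),
}
NAMES = ("gas_connected", "gas_knob", "igniter", "meat_on",
         "gas_level", "flame", "meat_cooked")


def joint_distribution():
    """Exact joint distribution over the seven observed variables."""
    joint = {}
    keys = list(P_ONE)
    for bits in product((0, 1), repeat=len(keys)):
        u = dict(zip(keys, bits))
        weight = Fraction(1)
        for k in keys:
            weight *= P_ONE[k] if u[k] else 1 - P_ONE[k]
        # Each variable is a function of its direct causes and an error term.
        gas_level = (u["gas_connected"] & u["gas_knob"]) ^ u["err_level"]
        flame = (gas_level & u["igniter"]) ^ u["err_flame"]
        meat_cooked = (flame & u["meat_on"]) ^ u["err_cooked"]
        row = (u["gas_connected"], u["gas_knob"], u["igniter"], u["meat_on"],
               gas_level, flame, meat_cooked)
        joint[row] = joint.get(row, Fraction(0)) + weight
    return joint


def marginal(joint, names):
    """Sum the joint distribution down to the named variables."""
    idx = [NAMES.index(n) for n in names]
    out = {}
    for row, weight in joint.items():
        key = tuple(row[i] for i in idx)
        out[key] = out.get(key, Fraction(0)) + weight
    return out


def independent_given(joint, a, b, given):
    """True if a and b are independent conditional on the list `given`."""
    p_abg = marginal(joint, [a, b] + given)
    p_ag = marginal(joint, [a] + given)
    p_bg = marginal(joint, [b] + given)
    p_g = marginal(joint, given)
    zero = Fraction(0)
    for va, vb in product((0, 1), repeat=2):
        for vg in product((0, 1), repeat=len(given)):
            # Independence means P(a, b, g) * P(g) == P(a, g) * P(b, g).
            lhs = p_abg.get((va, vb) + vg, zero) * p_g.get(vg, zero)
            rhs = p_ag.get((va,) + vg, zero) * p_bg.get((vb,) + vg, zero)
            if lhs != rhs:
                return False
    return True


joint = joint_distribution()
parents_of_level = ["gas_connected", "gas_knob"]
print(independent_given(joint, "gas_level", "igniter", parents_of_level))  # True
print(independent_given(joint, "gas_level", "flame", parents_of_level))    # False
```

The second check comes out false because the flame is an effect of the gas level, and the Causal Markov Condition only promises independence from non-effects.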
For our purposes here, one does not need a deep understanding of the Causal Markov Condition. More important is that one understands a key implication of the Causal Markov Condition. Crucially, the Causal Markov Condition entails what the physicist and philosopher Hans Reichenbach called “the Principle of the Common Cause”. This principle says that all correlated variables must either be causally related, or share a common cause. To illustrate, a variable describing whether or not pedestrians on a given street have their umbrellas open will typically be correlated with a variable describing whether or not drivers on the same street are using their windshield wipers. Here, there is an obvious common cause in the form of the weather. In other cases, correlated variables are causally related, e.g. a variable describing whether a person smokes and a variable describing whether or not they develop lung cancer.
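The umbrella example can be made concrete with a small sketch. The probabilities below are invented for illustration, and the assumption that each error term flips the outcome that the weather would otherwise produce is a modeling choice made here, not something the Principle of the Common Cause requires.

```python
from fractions import Fraction
from itertools import product

P_RAIN = Fraction(3, 10)  # assumed chance of rain
P_ERR = Fraction(1, 10)   # assumed chance that an error term flips an outcome
NAMES = ("rain", "umbrellas", "wipers")


def joint_distribution():
    """Exact distribution over (rain, umbrellas open, wipers on)."""
    joint = {}
    for rain, err_u, err_w in product((0, 1), repeat=3):
        weight = P_RAIN if rain else 1 - P_RAIN
        weight *= P_ERR if err_u else 1 - P_ERR
        weight *= P_ERR if err_w else 1 - P_ERR
        # Rain is the only cause of each behaviour, apart from the errors.
        row = (rain, rain ^ err_u, rain ^ err_w)
        joint[row] = joint.get(row, Fraction(0)) + weight
    return joint


def prob(joint, **fixed):
    """Probability that the named variables take the given values."""
    return sum(weight for row, weight in joint.items()
               if all(row[NAMES.index(k)] == v for k, v in fixed.items()))


joint = joint_distribution()

# Unconditionally, umbrellas and wipers are correlated.
print(prob(joint, umbrellas=1, wipers=1))                # 1/4
print(prob(joint, umbrellas=1) * prob(joint, wipers=1))  # 289/2500

# Once we hold the weather fixed, the correlation disappears.
lhs = prob(joint, rain=1, umbrellas=1, wipers=1) * prob(joint, rain=1)
rhs = prob(joint, rain=1, umbrellas=1) * prob(joint, rain=1, wipers=1)
print(lhs == rhs)                                        # True
```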
In addition, when a model satisfies the Causal Markov Condition, we are able to derive claims about what will happen in the graph under different hypothetical interventions changing the values of the variables in the model. This allows us to draw a close connection between causal models and the experimental methods through which some causal claims are established and tested. This feature of causal models has also played a starring role in recent discussions of whether recommendation algorithms treat people fairly.
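A rough sketch of how an intervention differs from an observation, using the umbrella and windshield-wiper example from earlier. An intervention is modeled here by replacing the equation for one variable with a constant and leaving the rest of the model alone. The probabilities, and the function names joint_distribution and p_wipers_on, are assumptions made for the sake of illustration.

```python
from fractions import Fraction
from itertools import product

P_RAIN = Fraction(3, 10)  # assumed chance of rain
P_ERR = Fraction(1, 10)   # assumed chance that an error term flips an outcome


def joint_distribution(do_umbrellas=None):
    """Exact distribution over (rain, umbrellas, wipers).

    Passing do_umbrellas=0 or 1 models an intervention: the equation for
    umbrellas is replaced by a constant, and nothing else is changed.
    """
    joint = {}
    for rain, err_u, err_w in product((0, 1), repeat=3):
        weight = P_RAIN if rain else 1 - P_RAIN
        weight *= P_ERR if err_u else 1 - P_ERR
        weight *= P_ERR if err_w else 1 - P_ERR
        umbrellas = (rain ^ err_u) if do_umbrellas is None else do_umbrellas
        wipers = rain ^ err_w
        row = (rain, umbrellas, wipers)
        joint[row] = joint.get(row, Fraction(0)) + weight
    return joint


def p_wipers_on(joint, umbrellas=None):
    """P(wipers = 1), optionally conditional on an observed umbrella value."""
    rows = [(row, w) for row, w in joint.items()
            if umbrellas is None or row[1] == umbrellas]
    total = sum(w for _, w in rows)
    return sum(w for row, w in rows if row[2] == 1) / total


observed = joint_distribution()
forced = joint_distribution(do_umbrellas=1)

print(p_wipers_on(observed))               # 17/50, the baseline
print(p_wipers_on(observed, umbrellas=1))  # 25/34, seeing umbrellas is evidence of rain
print(p_wipers_on(forced, umbrellas=1))    # 17/50, forcing umbrellas open changes nothing
```

Seeing open umbrellas raises the probability that wipers are on, because both are effects of rain. Forcing the umbrellas open leaves that probability at its baseline, which is what we should expect if umbrellas are not a cause of wiper use.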
Does causal modeling rebut Norton’s argument that causation is not scientific? Surprisingly, few have taken up the question directly. In an interview in 3AM magazine, Glymour makes plain his distaste for Norton-style skepticism about the scientific status of causation:
Anyone who seriously thought causation is a fiction, a social creation of some kind unlike the everyday facts of the world…such a person would be paralyzed, without reason for planning any one action rather than another. To get out of my office, shall I open the doorknob or wait for the doorknob to open? If I move my legs will I find myself at the door? If I move to an apartment with thin walls, will I hear my neighbors, and they me? … An ad hominem: people who say causality is a fiction are not doing much thinking.
Glymour, C. Interview with Richard Marshall, 3AM Magazine.
While I agree in part with the spirit of what Glymour is saying here, I don’t think that he really addresses Norton’s challenge. Glymour is correctly pointing out that causal models are useful for predicting the outcomes of hypothetical interventions in the world. But recall that Norton is interested in the factual content of a science, i.e. what it says about what actually happens in the world. That causal models are useful in counterfactual reasoning about what would happen if some agent intervened on the world in some way will do nothing to convince Norton that they constrain the factual content of science.
The philosopher Jenann Ismael takes on Russell more directly:
Russell observed that causal relations don’t appear in a fundamental theory. He suggested that the notion of cause is a folk notion that has been superseded by global laws of temporal evolution and has no place in exact science. [But] causal models are generalizations of the structural equations used in engineering, biology, economics and social science. In a causal model, a complex system is represented as a modular collection of stable and autonomous components called “mechanisms”. The behavior of each of these is represented as a function, and changes due to interventions are treated as local modifications of these functions. The dynamical law for the whole is recovered by assembling these in a configuration that imposes constraints on their relative variation.
Ismael, J. (forthcoming). “Against Globalism about Laws.” In The Experimental Side of Modelling, eds. Bas van Fraassen and Isabelle Peschard. p. 8.
The idea here seems to be something like this: Russell was impressed by the lack of causation in the laws of physics, but these laws are really just summaries of locally instantiated causal models of the kind described in the statistical literature on causal modeling. In other words, Ismael does not believe that nature is governed by the general equations of physics, which make no mention of causality. Rather, she argues, nature is a patchwork of causal models which are summarized, in a non-causal way, by physical laws. Judea Pearl seems to support a similar view in the preface to his book, where he writes of a conversion from the Russell-Norton view to a view that is more like Ismael’s:
[I used to think that] causality simply provides useful ways of abbreviating and organizing intricate patterns of probabilistic relationships. Today, my view is quite different. I now take causal relationships to be the fundamental building blocks both of physical reality and of human understanding of that reality, and I regard probabilistic relationships as but the surface phenomena of the causal machinery that underlies and propels our understanding of the world.
Pearl, J. (2000). Causality. Cambridge: Cambridge University Press. pp. xiii-xiv.
Ismael and Pearl may be right, but for my part, I find this response to Russell and Norton too metaphysical. I do not know how to adjudicate between the view that the world is made out of causal models, which are then summarized by the equations of physics, and the view that the equations of physics tell the whole story of nature, such that causal models are just a convenient way of understanding them in a local context. Both views seem equally compatible with our existing evidence, and with any evidence that we could possibly collect. Further, nothing in the quotes above tells us how causal modeling restricts the factual content of a science, so Norton’s challenge for any theory of causality remains unanswered.
Nevertheless, I do believe that proponents and practitioners of causal modeling have a response to Norton’s challenge, and a fairly straightforward one at that. Very simply, the mathematical fact that a causal model of the kind described above must satisfy the Causal Markov Condition, and by implication the Principle of the Common Cause, means that causal modeling does entail a factual constraint on scientific theorizing.
To see how this works, suppose that we set up a model of some system that satisfies the following three tenets of the causal modeling framework: i) all variables are functions of their direct causes and an error term; ii) all error terms are uncorrelated; and iii) all common causes of two or more variables are included in the model. Pearl shows that any probability distribution over the variables in such a model must satisfy the Causal Markov Condition, and by implication the Principle of the Common Cause. When we observe the system over time, we can test whether or not our observations fit a probability distribution that satisfies these conditions. If they do not, then we can either claim that our observations are flawed, or revise our model. However, on pain of obstinate skepticism, we can only blame our observations for so long. At some point, if our observations systematically violate conditions that our model says they ought to satisfy, then we ought to revise our model.
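As a rough illustration of what such a test might look like, here is one of many possible sketches. It uses a Pearson chi-square statistic, which is a choice made for this sketch and not something the causal modeling framework dictates. The counts are invented, the function names are hypothetical, and the code assumes that no row or column of any table is empty.

```python
# Chi-square critical values at the 5% level, keyed by degrees of freedom.
CRITICAL_95 = {1: 3.841, 2: 5.991}


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Count we would expect if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def violates_conditional_independence(strata):
    """Compare counts with the claim that two binary variables are
    independent given a third.

    `strata` holds one 2x2 table of counts per value of the conditioning
    variable. Each table contributes one degree of freedom.
    """
    stat = sum(chi_square_2x2(table) for table in strata)
    return stat > CRITICAL_95[len(strata)]


# Hypothetical counts of (umbrellas open?, wipers on?), split by weather.
fits_model = [[[810, 90], [90, 10]],      # rainy days
              [[10, 90], [90, 810]]]      # dry days
breaks_model = [[[450, 50], [50, 450]],   # rainy days
                [[450, 50], [50, 450]]]   # dry days

print(violates_conditional_independence(fits_model))    # False
print(violates_conditional_independence(breaks_model))  # True
```

In the second data set, umbrellas and wipers remain strongly associated even after the weather is held fixed. A model in which weather is their only common cause, with no arrow between them, says this should not happen, so persistent data of this kind would be a reason to revise the model.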
In this way, causal modeling gives us a constraint on physical theorizing. In at least some cases, our physical theories will be expressed as a set of equations relating variables, possibly with error terms. This in turn allows us to represent the system under study as a causal model. That is, we can represent the system using a formalism that explicitly uses causal language. We can then make observations of the system, and decide whether those observations fit a probability distribution with properties that the causal model says they ought to have. If our observations don’t fit such a distribution, then the model we proposed at the beginning of this procedure cannot be the right one. In other words, as a result of stating our theories in causal language, we get a procedure for accepting or rejecting some models as representations of the physical world. This is just the sort of procedure that Norton claims no causal theory can provide.
There is a lot being skipped over here. First, I haven’t said anything about what it means for our observations to “fit a probability distribution”. There are many ways of understanding the relationship between empirical observation and probabilistic modeling, and I won’t get into the details of various approaches here. However, I believe that however we flesh out this relationship, my argument that causal models provide a constraint on physical theorizing still goes through.
Second, I have not said anything about an ingenious thought experiment called “the Dome,” which Norton uses to show that not all indeterministic systems admit of a probabilistic description. While the relationship between causal models and the Dome would be better treated in a proper philosophy article than a blog post, suffice it to say that as long as some systems are best represented as causal models, my argument above still goes through; causation need not be universal to be scientific.
Finally, I have not said anything here about quantum mechanics. This is for the best. Quantum mechanics is a well-defined scientific theory that makes specific claims about the best mathematical representation of the behavior of very small objects. Some philosophers are well-versed in quantum theory, but others have gotten themselves into trouble when they have attempted to make pronouncements about quantum phenomena without sufficient knowledge of the theory’s technical details. It is a significant understatement to say that I do not fully understand quantum mechanics.
However, several authors have argued that on some interpretations of the famous Einstein-Podolsky-Rosen experiments in quantum mechanics, the Principle of the Common Cause (and by implication, the Causal Markov Condition) is violated. This moves the debate over the scientific status of causation into a much narrower terrain. Rather than being a debate about the role of causality in all of science, it is now a debate about what constitutes the best formalization of certain aspects of quantum mechanics. Here, active research by the physicists Fabio Costa and Sally Shrapnel, who aim to adapt the causal modeling formalism to a quantum context, may offer a promising response to Norton’s argument in this domain. At the very least, proponents of causal modeling who argue for the scientific respectability of causal language can no longer be accused, as a group, of being entirely ignorant of quantum mechanics, though it is doubtful that they ever could be fairly accused of this.
One way in which philosophy of science can be of great benefit to all intellectually curious people is by trying to spell out, clearly and correctly, how and why the image of the world presented by science is different from the image of the world presented by ordinary perception. Russell and Norton are taking on exactly this sort of worthwhile project when they argue that nothing about the laws of physics implies that nature is governed by forces of cause and effect. I hope to have shown here that despite these good intentions, innovations in causal modeling over the last few decades call into question their skeptical attitude towards the scientific status of causation.
At the same time, it is worth acknowledging that the causal modeling project is also, in some senses, a project that highlights a gap between the scientific and common-sense images of the world. Anecdotally, many people think of causation as a kind of metaphysical glue, an irreducible oomph in the world that grounds our explanatory practices, to say nothing of criminal law or the practical realities of our daily lives. However, there is no obvious place for this oomph in the causal modeling framework. Instead, what we get is a set of testable constraints on what we ought to observe if the world is causally structured in a certain way. While this account of causation might be disappointing to some, I believe that causation understood in this way is causation enough.