Causal Inference

Study Design

Part A: Review Causal Theory

After your meeting with Mr. Broomberg, you start outlining the steps you'll need to take to provide an informed opinion concerning whether the study provides sufficient evidence of a causal link between watching television and early initiation of smoking. While reading the article and jotting down some extremely informative notes, you begin asking yourself a deceptively simple question: exactly what is a 'cause'?

While ruminating on this perplexing question, you decide to do a little office cleaning. Your "office" is actually a cubicle in an old storage room deep in the bowels of the Epiville Department of Health, and it is cluttered with old papers and newspaper articles. You are about to move some boxes of files when a headline from an old Epiville Press newspaper catches your attention. You pause in your cleaning and read.

1. Which of the following statements best explains the Mayor's false belief that increasing the stork population would increase the birth rate?

  a. The Mayor implied there was an association between the stork population and the birth rate that was not there.
  b. The Mayor mistakenly thought that 2 factors which are statistically associated (the stork population and the birth rate) must be causally related.
  c. The Mayor neglected to look at the statistical significance of the association. Only if the results were statistically significant could he conclude that a causal relationship exists.
Answer (a) — incorrect: Looking at the figure provided in the article, there is clearly an association between the stork population and the birth rate. His mistake was not in ascribing an association between these two, but rather in speculating on the reason for such an association.
Answer (b) — correct: One of the central tenets of epidemiologic study is that association does not necessarily imply causation; there may be alternative explanations for the association. Here, a third factor (the Industrial Revolution) may be causing both a decrease in storks and a decrease in the birth rate.
Answer (c) — incorrect: 'Statistical significance' only tells you the probability of observing the data that you observed (or something more extreme) given the hypothesis that there is no association between the exposure and outcome. Statistical significance does not indicate that an association is causal. It is merely concerned with whether an association could be due to chance given sampling variability in your data.
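The stork example can be illustrated with a small simulation. The sketch below is not based on any real data: the `industrialization` index, the coefficients, and the noise levels are all invented for illustration. It assumes that industrialization lowers both stork counts and birth rates, and that storks have no effect on births at all.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical records: one per town-year. All numbers are made up.
records = []
for _ in range(20000):
    industrialization = rng.random()  # 0 = rural, 1 = heavily industrial
    storks = 100 - 80 * industrialization + rng.gauss(0, 5)
    births = 40 - 25 * industrialization + rng.gauss(0, 2)  # no stork term
    records.append((industrialization, storks, births))

# Crude association between storks and births across all records.
corr_all = pearson([s for _, s, _ in records], [b for _, _, b in records])

# Association among records with nearly identical industrialization.
stratum = [(s, b) for z, s, b in records if 0.49 <= z < 0.51]
corr_stratum = pearson([s for s, _ in stratum], [b for _, b in stratum])

print(f"all records: r = {corr_all:.2f}")
print(f"similar industrialization only: r = {corr_stratum:.2f}")
```

Under these assumptions the crude correlation should be strongly positive, while the correlation within a narrow band of industrialization should be close to zero. The association is real, but it comes entirely from the shared cause.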

Since this news report has got you thinking about causal theory again, you decide that your work environment is "clean enough." You're ready to get back to work when you notice today's Epiville Press, so you decide to check the baseball scores from last night.

2. You are struck by the manager's claim that I-Pod did not cause the team to lose the game. Which of the following (A only, B only, or both A and B) could also have caused the Riskfactors to lose the game?

  a. The Riskfactors failed to score 2 runs in the 7th, 8th, or 9th inning.
  b. Riskfactor pitcher, Vinnie Virchow, gave up a home run and the Riskfactors did not score any runs.
  c. Both A and B
Answer (a) — incorrect: While this is a potential cause of the loss, there is a more complete answer available.
Answer (b) — incorrect: While this is a potential cause of the loss, there is a more complete answer available.
Answer (c) — correct: Note that there are multiple causes of the event operating here. In epidemiology, we speak of a "web of causality" and look to explain whether certain "risk factors" are "A cause" (not THE cause) of a particular outcome.

Part B: Review Rothman's heuristic and the Sufficient-Component Cause model

The two newspaper articles highlight some important concepts about causality. In epidemiology, some of these concepts have been combined into a theory of disease causation, based on the premise that most diseases have multiple causes. This theory was made "famous" (for epidemiologists, at least) by Kenneth Rothman and his heuristic showing causes of disease as distinct pies (Aschengrau & Seage, pp. 399-401).

A cause of an outcome is defined as "something that makes a difference" (Susser, 1973), or, more formally, as an "event, condition, or characteristic that preceded the outcome of interest such that, had it not occurred, the outcome would not have occurred when and how it did" (Rothman and Greenland, pg. 8).

Looking at the newspaper article on last night's baseball game and the answers to Question 2 above, you hypothesize that the following factors were "component causes" in last night's loss to the Biostatown Frequentists:

  1. I-Pod striking out with the bases loaded.
  2. Vinnie Virchow giving up a solo home run.
  3. The Epiville Riskfactors not scoring any runs in the 7th, 8th, or 9th inning.
  4. I-Pod getting caught stealing second base to end the game.

However, there are likely other "component causes" to last night's loss. Ken Rothman proposed a "sufficient-component cause" model of causation to explain the idea that there are often many contributing causes to a single event. Consider the following heuristic which represents the potential "causes" of last night's loss:


A = The Epiville RiskFactors scoring fewer runs than the Biostatown Frequentists
B = Vinnie Virchow giving up a home run
C = I-Pod striking out with the bases loaded in the 6th inning
D = RiskFactors failing to score in the 7th, 8th, or 9th inning
E = RiskFactors failing to score through the 8th inning
F = I-Pod hitting a single with 2 outs in the 9th inning
G = I-Pod getting caught stealing in the 9th inning

Each "pie" represents one sufficient cause of the outcome of interest, in this case, last night's loss. Thus, the "cause" of the loss could be viewed as operating through cause A alone, or through the combination of causes B, C, and D, or through the combination of causes B, E, F, and G.

In this heuristic, if the three "sufficient causes" represent all of the potential causes of last night's loss, then cause A by itself is a sufficient cause of the loss (if A occurred, then the RiskFactors would inevitably have lost).

Causes B-G are "component" causes: they are neither necessary for the outcome of interest (since there is another sufficient-cause pie that does not include them) nor sufficient (since, for example, component B would "cause" the loss only if components C and D also occurred).
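The three pies described above can be written down as sets, which makes the "necessary" and "sufficient" labels easy to check. This is only a minimal sketch of the heuristic: the function names are made up for illustration, and it assumes, as the text does, that these three pies are the only sufficient causes.

```python
# The three sufficient causes ("pies") from the baseball example.
PIES = [frozenset("A"), frozenset("BCD"), frozenset("BEFG")]

def outcome_occurs(present, pies=PIES):
    """The outcome occurs if every component of at least one pie is present."""
    present = set(present)
    return any(pie <= present for pie in pies)

def is_necessary(component, pies=PIES):
    """A necessary cause appears in every pie."""
    return all(component in pie for pie in pies)

def is_sufficient(component, pies=PIES):
    """A component is sufficient on its own if it forms a whole pie by itself."""
    return any(pie == {component} for pie in pies)

for c in "ABCDEFG":
    print(c, "necessary:", is_necessary(c), "sufficient:", is_sufficient(c))
```

For instance, `outcome_occurs("BC")` is false because neither the B-C-D pie nor any other pie is complete, while `outcome_occurs("BCD")` is true.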

Rothman's schematic has proven remarkably useful in explaining the multicausal nature of diseases. It helps explain why, for example, we consider smoking to be a cause of lung cancer even though (a) not everyone who smokes gets lung cancer and (b) not everyone who has lung cancer smoked. It is also useful for demonstrating which "causes" of a particular outcome, if removed, would lead to the greatest reduction in the incidence of that outcome.
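The point about removing causes can be made concrete with a toy calculation. Everything in the sketch below is hypothetical: the components X, Y, and Z, the three pies, and the case counts are invented, and each case is assumed to arise through exactly one pie. Removing a component is taken to prevent every case whose pie contains it.

```python
# Hypothetical pies and the number of cases arising through each one.
cases_by_pie = {
    frozenset({"X", "Y"}): 50,
    frozenset({"X", "Z"}): 30,
    frozenset({"Y", "Z"}): 20,
}

def cases_prevented(component, cases_by_pie):
    """Cases that could no longer occur if this component were removed."""
    return sum(n for pie, n in cases_by_pie.items() if component in pie)

total = sum(cases_by_pie.values())
prevented = {c: cases_prevented(c, cases_by_pie) for c in "XYZ"}

for component, n in prevented.items():
    print(f"remove {component}: {n} of {total} cases prevented")
```

With these made-up counts, removing X prevents the most cases. The prevented counts for the three components add up to more than the total, because each case has more than one component cause and could have been prevented by removing any of them.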

However, there is a limitation to this model: the heuristic says nothing about the mechanisms through which the causes operate. In fact, "cause A" can be thought of as containing causes B, C, and D (or B, E, F, and G). Under this model, all "causes," no matter how or when they operate, are given essentially the same importance. The sufficient-component cause model's inattention to causal pathways has led to the criticism that utilizing it leads to "black box" epidemiology, identifying "risk factors" without gaining any insight into the mechanisms through which they operate.

Now that you've seen the difficulty inherent in making a causal claim about something that has been observed, you begin to look at the Attorney General's case. You start by reading the article and answering the following questions:

3. Attorney General Mike Broomberg asks you if you think that watching television is "a cause" of early smoking initiation. You think back to your Principles of Epidemiology class and recall that epidemiologists have a fairly distinct definition of "cause." Which of the following statements best describes the hypothesized causal link between television viewing and smoking initiation?

  a. All individuals with long hours of television viewing during childhood will go on to start smoking early in adolescence.
  b. All individuals who initiate smoking early on in childhood would have been exposed to substantial television viewing during childhood.
  c. Individuals who watch substantial amounts of television early on in childhood are more likely to initiate smoking early, compared with those individuals who watch television less frequently during childhood.
Answer (a) — incorrect: While this is a description of a sufficient cause of disease, the article cited by the Attorney General does not claim that everyone who watches television will initiate smoking.
Answer (b) — incorrect: This is a description of a necessary cause of disease. The article cited by the Attorney General does not claim that television viewing is required for individuals to initiate smoking.
Answer (c) — correct: This statement allows for a "multicausal" theory of causation. While television watching is neither necessary nor sufficient, it is still hypothesized to be a cause.

4. You decide to draw some "Rothman Causal Pies" to help you think about how watching television might act as a cause of smoking initiation. Based on the causal pies given below and assuming that they are true, which of the following is a true statement?

T = Television Watching
P = Peer Pressure
L = Low Self-Esteem

  a. Television watching is a necessary cause of smoking initiation.
  b. Television watching is a sufficient cause of smoking initiation.
  c. Television watching is a cause of smoking initiation that is neither necessary nor sufficient.
Answer (a) — incorrect: In the above model, it is possible to initiate smoking without watching television, due to a combination of peer pressure and low self-esteem. Since television watching is not present in every causal pie it is not a necessary cause.
Answer (b) — incorrect: In the above model, watching television alone cannot cause smoking initiation, but instead must be accompanied by peer pressure or low self-esteem. Since television watching does not have its "own pie", it is not a sufficient cause.
Answer (c) — correct: In the above model, watching television in the context of peer pressure or low self-esteem is a cause of initiating smoking.
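The reasoning in these answers can be checked against the "makes a difference" definition of a cause. The sketch below assumes the three pies are T with P, T with L, and P with L. That layout is reconstructed from the answer explanations rather than taken from the figure itself, and the function names are invented for illustration.

```python
from itertools import product

# Assumed pies: T = television watching, P = peer pressure, L = low self-esteem.
PIES = [frozenset("TP"), frozenset("TL"), frozenset("PL")]

def smokes(present):
    """Smoking initiation occurs if at least one pie is complete."""
    return any(pie <= present for pie in PIES)

def tv_makes_a_difference(peer_pressure, low_self_esteem):
    """Compare the outcome with and without T, holding P and L fixed."""
    background = set()
    if peer_pressure:
        background.add("P")
    if low_self_esteem:
        background.add("L")
    return smokes(background | {"T"}) != smokes(background)

results = {
    (p, l): tv_makes_a_difference(p, l)
    for p, l in product([False, True], repeat=2)
}

for (p, l), differs in results.items():
    print(f"peer pressure={p}, low self-esteem={l}: TV changes outcome = {differs}")
```

Under these assumed pies, television changes the outcome only when exactly one of the other two factors is present. With neither, no pie can be completed; with both, the P-and-L pie is already complete without television.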