# Data Analysis

4. Age was a potential confounder in this study. Choose an appropriate diagram representing the relationship of this potential confounder with exposure and outcome.

Answer (a) — correct: In this diagram age meets the requirements to be a confounder because, as depicted in the diagram, age is a risk factor for breast cancer and is associated with pesticide use, but it is not a result of pesticide use.
Answer (b) — incorrect: This diagram illustrates that age is an intermediate in the pathway between pesticide use and breast cancer. If a factor is in the pathway between exposure and outcome, it is called a mediator.
Answer (c) — incorrect: In this diagram, age does not meet the necessary conditions for a confounder because it is not associated with the exposure.

5. Explain how you would assess whether a potential confounder alters an effect estimate after adjusting for it in a multivariate model.

Answer (a) — incorrect: You must compare the crude and adjusted ORs to evaluate confounding. Remember, the crude estimate simply reflects the association between the exposure and outcome; it does not take into account the effect of potential confounders.
Answer (b) — incorrect: You must compare the crude and adjusted ORs to evaluate confounding. Remember, the adjusted estimate simply reflects the association between the exposure and outcome after controlling for a potential confounder. Without the crude estimate to compare it with, we would not know what happened to the OR after taking other risk factors into account.
Answer (c) — correct: It is important to compare the adjusted OR with the crude OR to see the change in the effect estimate. To evaluate the magnitude of confounding, the rule of thumb is to look at the percent change between the crude and adjusted estimates. If the adjusted estimate differs from the crude by 10% or more, then it is customary to consider that variable a confounder. The adjusted odds ratio is reported to describe the exposure-disease association controlling for the confounder.
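
As a minimal sketch of the 10% rule of thumb described above, the following Python snippet computes the percent change between a crude and an adjusted OR. The function names, the choice of the crude OR as the denominator, and the example values are all illustrative assumptions; none of them come from Teitelbaum et al. (2007), and conventions differ on which estimate to divide by.

```python
def percent_change(crude_or: float, adjusted_or: float) -> float:
    """Absolute percent change of the adjusted OR relative to the crude OR.

    Assumption: the crude OR is used as the denominator.
    """
    if crude_or <= 0 or adjusted_or <= 0:
        raise ValueError("odds ratios must be positive")
    return abs(adjusted_or - crude_or) / crude_or * 100.0


def meets_rule_of_thumb(crude_or: float, adjusted_or: float,
                        threshold: float = 10.0) -> bool:
    """True if the estimates differ by at least `threshold` percent."""
    return percent_change(crude_or, adjusted_or) >= threshold


# Hypothetical estimates, chosen only to illustrate the arithmetic.
crude = 1.50
adjusted = 1.30
change = percent_change(crude, adjusted)
print(f"Percent change: {change:.1f}%")
print("Treat as confounder:", meets_rule_of_thumb(crude, adjusted))
```

The 10% threshold is a convention rather than a statistical test, so it is exposed as a parameter instead of being hard-coded.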

6. Lawn/garden pesticide use was significantly associated with breast cancer after adjusting for age, level of education, and other combined pest group (OR=1.34, 95% CI 1.11-1.63). Given that investigators also determined that a host of other factors (e.g., age at menarche, oral contraceptive use, and family history of breast cancer) did not meet the criteria for confounders, can the authors conclude that they have removed all sources of confounding in the examination of this association?

Answer (a) — incorrect: We can never be absolutely sure that our estimates are unbiased. First, it is unlikely that all measured confounders were measured without any error. For example, in the Teitelbaum et al. (2007) study, individuals self-reported their history of oral contraceptive use. It is likely that women did not recall the duration and exact dose of oral contraceptive use with 100% accuracy, and thus the measurement of this variable is less than perfect. Further, there may be unmeasured confounding affecting the association. In an observational study such as this case-control study, we can never be sure that we have measured all confounders of the association. In contrast, experimental studies such as randomized controlled trials (RCTs), when sufficiently large, can create, on average, comparability between exposed and unexposed groups on all measured as well as unmeasured confounders through randomization of exposure.
Answer (b) — incorrect: The investigators reported that adjusting for family history of breast cancer did not appreciably affect the results of the study. That is, among those with a family history of breast cancer, the association between lawn/garden pesticide use and breast cancer did not differ from (a) the association among those without a family history of breast cancer, or (b) the crude estimate unadjusted for family history of breast cancer. Thus, family history of breast cancer (as measured) did not contribute to confounding of the association between lawn/garden pesticide use and breast cancer.
Answer (c) — incorrect: Cases and controls are never comparable on all risk factors for the outcome. Matching of cases to controls on age was used to remove confounding by age, but there may be many more measured and unmeasured risk factors which were not matched that need to be controlled.
Answer (d) — correct: We can never be absolutely sure that our estimates are unbiased. Residual confounding is confounding that remains even after many confounding variables have been controlled. It can occur if there is systematic error in the measurement of the confounders, if there are unmeasured confounders that have not been controlled, or if confounders were classified into categories that are too broad (Aschengrau & Seage, pp. 300-301).

7. What if, during data analysis, investigators found that use of vitamin supplementation was associated with pesticide use and was an independent risk factor for breast cancer? Should they attempt to control for this potential confounder?

Answer (a) — incorrect: Many variables may act as confounders in one study. While it is important to hypothesize which factors may confound an association, it is also important to evaluate other potential confounders during the analyses as well. In doing so, you must report the process of how you selected potential confounders (i.e., a priori confounders in the Methods section and a posteriori confounders in the Results section), and discuss your findings in the Discussion section.
Answer (b) — correct: It is important to report the selection process of confounding variables in your work. A priori confounders should be reported in the Methods section, a posteriori confounders in the Results section.
Answer (c) — incorrect: It is not always possible to know all potential confounders at the beginning of a study. This may happen when investigating an exposure-disease association which has not been studied well, or if cost and feasibility make it impossible to address all potential confounders at the design phase of a study. Therefore, it is necessary to consider confounding at the analysis phase of a study as well.

8. Suppose investigators wanted to control for education as a potential confounder at the design stage of the study. Which of the following methods would be appropriate?

Answer (a) — incorrect: Stratified analysis is used in the analysis stage of the research process. Stratification means the effect of an exposure is evaluated within strata (levels) of a confounder (e.g., looking at the exposure-disease association among those with low education only and then among those with higher education only). Once you calculate the ORs for each stratum (and if they are similar to one another), you then compare them with the crude OR. If there is a large difference (a commonly used rule of thumb is >10%) between the stratified and crude ORs, you can conclude that the variable may be confounding the exposure-disease association.
Answer (b) — incorrect: This method of confounding control is called restriction. While it is a method to control confounding at the design stage, there is another answer choice that is also a method to control for confounding at the design stage.
Answer (c) — incorrect: This method of confounding control is called matching. While it is a method to control confounding at the design stage, there is another answer choice that is also a method to control for confounding at the design stage.
Answer (d) — incorrect: Stratified analysis is not a method to control confounding at the design stage.
Answer (e) — correct: Restriction and matching are two methods to control for confounding at the design stage. With restriction, entrance into the study is determined by whether the subject falls into a pre-determined category of the potential confounder. With matching, study subjects are selected so that the potential confounder is distributed identically across the comparison groups (Aschengrau & Seage, pp. 294-297).
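
For contrast with the design-stage methods, the analysis-stage stratified comparison described in the feedback for answer (a) can be sketched in Python as follows. The counts, stratum labels, and helper names are hypothetical and chosen only so the arithmetic is easy to follow; they are not data from any study cited here.

```python
from typing import Dict, Tuple

# One 2x2 table per stratum, in the order:
# (exposed cases, unexposed cases, exposed controls, unexposed controls)
Table = Tuple[int, int, int, int]


def odds_ratio(table: Table) -> float:
    """Cross-product odds ratio for a single 2x2 table."""
    a, b, c, d = table
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


def collapse(strata: Dict[str, Table]) -> Table:
    """Sum the strata cell by cell to get the crude (unstratified) table."""
    a = sum(t[0] for t in strata.values())
    b = sum(t[1] for t in strata.values())
    c = sum(t[2] for t in strata.values())
    d = sum(t[3] for t in strata.values())
    return (a, b, c, d)


# Hypothetical counts, stratified by education level.
strata: Dict[str, Table] = {
    "low education": (40, 60, 20, 60),
    "high education": (30, 10, 60, 40),
}

crude_or = odds_ratio(collapse(strata))
stratum_ors = {name: odds_ratio(table) for name, table in strata.items()}

print(f"Crude OR: {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"OR within {name}: {value:.2f}")
```

With these made-up counts the crude OR is 1.25, while the OR within each education stratum is 2.0. Because the stratum-specific ORs agree with each other and differ from the crude OR by well over 10%, education would be treated as a confounder in this hypothetical example.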

9. Teitelbaum et al. (2007) matched cases to controls on age. A more recent study, Itoh et al. (2008), also examined the association between pesticides and breast cancer and matched cases to controls on both age and geographic location. Which study was better at controlling for confounding in the design phase of the study?

Answer (a) — incorrect: Although it is not good to match controls to cases on too many factors, matching on fewer factors in itself does not guarantee that confounding is accurately controlled. It is best to match on the minimally necessary number of factors. However, matching on as few factors as possible may miss important sources of confounding that could be controlled in the design stage.
Answer (b) — incorrect: Matching on as many factors as possible is not a good strategy. It may be difficult and expensive to find controls for each case. It is not possible to determine which study was better at controlling for confounding by looking at the number of matched factors.
Answer (c) — correct: What matters most when controlling confounding is that the variables (1) are confounders based on your theory of the exposure-disease relation, (2) meet the necessary conditions to be a confounder, (3) were measured properly, and (4) had their effects removed at the analysis stage. Itoh et al. (2008) matched cases to controls on geographic location because the source population for their study was very large and there were reasons to believe that background rates of breast cancer differed by geographic area. Teitelbaum et al.'s study area was limited to Nassau and Suffolk counties only.