# Data Analysis

Now that you have collected the data, you quickly glance over the information and realize that there are a number of ways to analyze it. The most appropriate analysis of the data collected in this study uses person-time to account for the fact that subjects may have been followed for varying amounts of time (Please see Aschengrau & Seage pp. 220-221).

In our retrospective cohort study, all individuals enter the study at the same moment in time (September 1, two years ago). However, not all will exit at the same time. How can they exit the study? In any number of ways, including:

1. The development of Susser Syndrome (once they have the disease, they are no longer at risk of developing it);
2. Death from other competing causes;
3. Loss to follow-up (Please see Aschengrau & Seage pp. 219-220).
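The way each subject contributes person-time can be sketched in a few lines of Python. This is only an illustration: the calendar dates, the four subjects, and the helper `person_years` are hypothetical and do not come from the study data. Each subject is at risk from study entry until the earliest of disease onset, death, loss to follow-up, or the end of the study.

```python
from datetime import date

def person_years(entry: date, exit_date: date) -> float:
    """Years at risk between entry and exit, using 365.25-day years."""
    return (exit_date - entry).days / 365.25

# Assumed calendar dates standing in for "September 1, two years ago" and today.
ENTRY = date(2020, 9, 1)
STUDY_END = date(2022, 9, 1)

# Hypothetical subjects: (id, exit date, reason for exit).
subjects = [
    ("A", date(2021, 3, 1), "developed disease"),
    ("B", date(2022, 9, 1), "end of study"),
    ("C", date(2021, 9, 1), "lost to follow-up"),
    ("D", date(2022, 1, 15), "died of other cause"),
]

# Follow-up is capped at the end of the study period.
contributions = {
    sid: person_years(ENTRY, min(exit_date, STUDY_END))
    for sid, exit_date, _reason in subjects
}
total_pyo = sum(contributions.values())

for sid, pyo in contributions.items():
    print(f"Subject {sid}: {pyo:.2f} person-years")
print(f"Total: {total_pyo:.2f} person-years")
```

Summing the individual contributions gives the person-time denominator used for a rate; a subject who exits early simply contributes less time.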

Loss to follow-up presents a unique challenge in epidemiological studies. Without regular contact with study participants, it may not be possible to determine whether, and when, a person developed the disease of interest. In these situations, your calculations may be severely compromised.

Epidemiologists use two different estimates of effect to assess exposure-disease relationships in cohort studies: the risk ratio and the rate ratio (Please see Aschengrau & Seage pp. 67-69). Since this is your first real work as a budding epidemiologist, you decide to analyze the data using both measures of effect and compare them later.

6. Calculation of the risk ratio from person-time information. [Aschengrau & Seage, Chapter 3]

The data collected by your team yield the following information:

• Number of cases among exposed - 74
• Number of cases among unexposed - 120
• Total number of exposed individuals - 1,900, divided into:
    • Low exposure group - 1,000
    • Medium exposure group - 650
    • High exposure group - 250
• Total number of unexposed individuals - 7,400

| | Disease + | Disease - | Total |
|---|---|---|---|
| Exposed | 74 | 1,900 - 74 | 1,900 |
| Unexposed | 120 | 7,400 - 120 | 7,400 |

| | Disease + | Disease - | Total |
|---|---|---|---|
| Exposed | 74 | 1,826 | 1,900 |
| Unexposed | 120 | 7,280 | 7,400 |

The formula for calculating risk is: (Number of exposed cases per 2-yr time period) / (Total number of exposed persons per 2-yr time period)

= 74/1,900

= 0.0389 (or 39 cases per 1,000 exposed per 2-yr time period)

The risk of developing Susser Syndrome among those exposed to SUPERCLEAN (for at least 6 months) is 39 cases per 1,000 exposed per 2 years.

The risk among the unexposed is calculated in the same way:

(Number of unexposed cases during 2-yr time period) / (Total number of unexposed persons during 2-yr time period)

= 120/7,400

= 0.0162 (or 16 cases per 1,000 unexposed per 2-yr time period)

The risk of Susser Syndrome among those unexposed to SUPERCLEAN (for at least 6 months) is 16 cases per 1,000 unexposed per 2 years.

The risk ratio is then:

(Risk of disease among the exposed) / (Risk of disease among the unexposed)

= 0.0389/0.0162

= 2.40

Those who were exposed to chemicals involved in the SUPERCLEAN production for at least 6 months have 2.40 times the risk of developing Susser Syndrome compared with those who were not exposed to SUPERCLEAN production.
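The risk and risk ratio calculations above can be checked with a short Python sketch. The counts are the ones given in the 2×2 table; the function name `risk` and the variable names are illustrative choices, not part of the case study.

```python
def risk(cases: int, persons: int) -> float:
    """Cumulative incidence: cases divided by persons at risk at the start."""
    return cases / persons

# Counts from the 2x2 table (2-year follow-up period).
exposed_risk = risk(74, 1_900)
unexposed_risk = risk(120, 7_400)

# Risk ratio: risk in the exposed divided by risk in the unexposed.
risk_ratio = exposed_risk / unexposed_risk

print(f"Risk, exposed:   {exposed_risk:.4f}")
print(f"Risk, unexposed: {unexposed_risk:.4f}")
print(f"Risk ratio:      {risk_ratio:.2f}")
```

Because the code divides the unrounded risks, its result can differ in the last digit from a hand calculation that divides already-rounded values.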

## Intellectually curious?

In the preceding example, you estimated the magnitude of risk due to exposure to SUPERCLEAN by comparing those with exposure to those without exposure. However, the exposure data could be characterized more accurately by dividing the exposed into three exposure categories, i.e., low, medium and high exposure. If the risk increases as the exposure level increases, then one can conclude that there is a dose-response relationship in the data, i.e., a biological gradient. The presence of a dose-response relationship strengthens our conviction that the relationship is causal.

Please calculate the incidence risk in the three exposure groups using the following data:

| Level of Exposure | Disease + | Disease - | Total |
|---|---|---|---|
| Low | 20 | 980 | 1000 |
| Medium | 30 | 620 | 650 |
| High | 24 | 226 | 250 |
| Unexposed | 120 | 7280 | 7400 |

Risk = (Number of exposed cases per 2-year time period) / (Total number of exposed persons per 2-year time period)

Low exposure group = 20/1000 = 0.0200 (or 20 cases per 1,000 exposed per 2-yr time period)
Medium exposure group = 30/650 = 0.046
High exposure group = 24/250 = 0.096
Unexposed group = 120/7400 = 0.0162

Risk ratio calculations:
Relative risk in the low exposure group = 0.020/0.0162 = 1.23
Relative risk in the medium exposure group = 0.046/0.0162 = 2.84
Relative risk in the high exposure group = 0.096/0.0162 = 5.92

What is your conclusion with regard to dose-response relationship in these data?
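One way to examine the question is to compute the risk ratio for each exposure level against the unexposed group and check whether it rises with exposure. The sketch below uses the counts from the table above; the dictionary layout and variable names are illustrative assumptions. It works from unrounded risks, so the second decimal may differ slightly from the hand calculations.

```python
# (cases, total persons) per exposure level, from the table above.
groups = {
    "Low": (20, 1_000),
    "Medium": (30, 650),
    "High": (24, 250),
}
unexposed_cases, unexposed_total = 120, 7_400
unexposed_risk = unexposed_cases / unexposed_total

# Risk ratio of each exposure level relative to the unexposed group.
risk_ratios = {
    level: (cases / total) / unexposed_risk
    for level, (cases, total) in groups.items()
}

for level, rr in risk_ratios.items():
    print(f"{level:<7} risk ratio = {rr:.2f}")

# A monotonic increase from low to high is consistent with a dose-response pattern.
ordered = [risk_ratios[level] for level in ("Low", "Medium", "High")]
increasing = all(a < b for a, b in zip(ordered, ordered[1:]))
print("Risk ratio increases with exposure level:", increasing)
```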

7. Calculation of the rate ratio [Aschengrau & Seage, Chapter 3].

The data collected by your team yield the following information:

• Number of cases among exposed - 74
• Number of cases among unexposed - 120
• Person-years of observation (PYO's) among the exposed - 3,675, divided into:
    • Low exposure group - 2,000 PYO's
    • Medium exposure group - 1,225 PYO's
    • High exposure group - 450 PYO's
• Number of unexposed PYO's- 14,550

| | Disease + | Total PYO's over 2-yr time period |
|---|---|---|
| Exposed | 74 | 3,675 |
| Unexposed | 120 | 14,550 |

The rate among the exposed is:

(Number of exposed cases during 2-yr time period) / (PYO's among exposed persons during 2-yr time period)

= 74/3,675

= 0.0201 (or 20 cases per 1,000 PYO's)

The rate of Susser Syndrome among those exposed to SUPERCLEAN (for at least 6 months) is 20 cases per 1,000 PYO's.

The rate among the unexposed is:

(Number of unexposed cases during 2-yr time period) / (PYO's among unexposed persons during 2-yr time period)

= 120/14,550

= 0.0082 (or approximately 8 cases per 1,000 PYO's)

The rate of Susser Syndrome among those unexposed to SUPERCLEAN is 8 cases per 1,000 PYO's.

The rate ratio is:

(Rate of disease among the exposed) / (Rate of disease among the unexposed)

= 0.02014/0.00825

= 2.44

Those who were exposed to chemicals involved in the SUPERCLEAN production for at least 6 months have 2.44 times the rate of developing Susser Syndrome compared with those who were not exposed to SUPERCLEAN production.
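The rate ratio calculation can be sketched the same way as the risk ratio, with person-years of observation in the denominator instead of persons. The counts come from the table above; the function name `incidence_rate` is an illustrative choice.

```python
def incidence_rate(cases: int, person_years: float) -> float:
    """Incidence rate: cases divided by person-years of observation."""
    return cases / person_years

# Cases and person-years of observation from the table above.
exposed_rate = incidence_rate(74, 3_675)
unexposed_rate = incidence_rate(120, 14_550)

# Rate ratio: rate in the exposed divided by rate in the unexposed.
rate_ratio = exposed_rate / unexposed_rate

print(f"Rate, exposed:   {exposed_rate * 1000:.1f} per 1,000 PYO")
print(f"Rate, unexposed: {unexposed_rate * 1000:.1f} per 1,000 PYO")
print(f"Rate ratio:      {rate_ratio:.2f}")
```

In this data set the rate ratio and the risk ratio are close to each other, which makes them easy to compare side by side.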

8. Calculation of rate ratio in different age strata.

The data collected by your team yield the following information:

| Age Group | Exposed: Number of Cases | Exposed: PYO | Unexposed: Number of Cases | Unexposed: PYO |
|---|---|---|---|---|
| < 30 | 43 | 2,188 | 75 | 9,249 |
| ≥ 30 | 31 | 1,487 | 45 | 5,301 |
| Total | 74 | 3,675 | 120 | 14,550 |

Rate among the exposed:

Age Group 20 - < 30 yrs: 43 / 2,188 = 0.0197 (or 20 cases per 1,000 PYO's)

Age Group 30 - 40 yrs: 31 / 1,487 = 0.0208 (or 21 cases per 1,000 PYO's)

Rate among the unexposed:

Age Group 20 - < 30 yrs: 75 / 9,249 = 0.0081 (or 8 cases per 1,000 PYO's)

Age Group 30 - 40 yrs: 45 / 5,301 = 0.0085 (or 9 cases per 1,000 PYO's)

Rate ratio:

Age Group 20 - < 30 yrs: (0.0197 / 0.0081) = 2.43

Age Group 30 - 40 yrs: (0.0208 / 0.0085) = 2.45

Among persons aged 20 to <30 years, those who are exposed to the chemicals involved in the SUPERCLEAN production for at least 6 months have 2.43 times the rate of developing Susser Syndrome compared with those who are not exposed to SUPERCLEAN production.

Among persons aged 30-40 years, those who are exposed to the chemicals involved in the SUPERCLEAN production for at least 6 months have 2.45 times the rate of developing Susser Syndrome compared with those who are not exposed to SUPERCLEAN production.
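The stratum-specific rate ratios can be reproduced with a short loop over the age groups. The counts are taken from the table above; the dictionary keys and variable names are illustrative. Because the sketch divides unrounded rates, its results may differ by about 0.01 from the hand calculations, which divide rates already rounded to four decimals.

```python
# Per age stratum: (exposed cases, exposed PYO, unexposed cases, unexposed PYO).
strata = {
    "<30": (43, 2_188, 75, 9_249),
    ">=30": (31, 1_487, 45, 5_301),
}

rate_ratios = {}
for age_group, (exp_cases, exp_pyo, unexp_cases, unexp_pyo) in strata.items():
    exposed_rate = exp_cases / exp_pyo
    unexposed_rate = unexp_cases / unexp_pyo
    rate_ratios[age_group] = exposed_rate / unexposed_rate
    print(
        f"Age {age_group}: exposed {exposed_rate * 1000:.1f}, "
        f"unexposed {unexposed_rate * 1000:.1f} per 1,000 PYO, "
        f"rate ratio {rate_ratios[age_group]:.2f}"
    )

# Crude (unstratified) rate ratio from the totals row, for comparison.
crude_rate_ratio = (74 / 3_675) / (120 / 14_550)
print(f"Crude rate ratio: {crude_rate_ratio:.2f}")
```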