Diseases associated with specific exposures may have little or no observable background rate in the absence of the exposure. Examples include mesothelioma (environmental asbestos), aplastic anemia (benzene), bronchiolitis obliterans (artificial butter flavorings), Reye’s syndrome (aspirin in children), and angiosarcoma of the liver (vinyl chloride). Relative-rate models of exposure-response produce unstable near-zero baseline risk and unbounded coefficients, especially when age confounding requires baseline age dependence. The same problem arises in a proportional-hazards context. Baseline risk volatility also threatens meta-analyses, a procedure that assumes uniformity.
Using Poisson regression, the model was specified as follows:
rate = [exp(α)] × [1 + βcumX] or rate ratio = 1 + βcumX,
where cumX is an exposure metric, α is the intercept defining baseline risk, βcumX is the excess rate ratio, and β is the excess rate ratio coefficient.
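The model above can be written out as a minimal sketch. The function names, the `(cases, person_years, cum_x)` stratum layout, and the numeric values in the usage lines are illustrative assumptions, not the authors' code (the original analysis was programmed in R). The excess rate coefficient is taken as exp(α) × β, the definition used in the summary table.

```python
import math

def rate_ratio(beta, cum_x):
    """Rate ratio under the linear relative-rate model: 1 + beta * cumX."""
    return 1.0 + beta * cum_x

def expected_rate(alpha, beta, cum_x):
    """Rate = exp(alpha) * (1 + beta * cumX); exp(alpha) is the baseline rate."""
    return math.exp(alpha) * rate_ratio(beta, cum_x)

def excess_rate_coefficient(alpha, beta):
    """Excess rate per unit of cumX: exp(alpha) * beta."""
    return math.exp(alpha) * beta

def poisson_loglik(alpha, beta, strata):
    """Poisson log-likelihood over strata of (cases, person_years, cum_x)."""
    total = 0.0
    for cases, person_years, cum_x in strata:
        mu = person_years * expected_rate(alpha, beta, cum_x)
        if mu <= 0.0:
            return float("-inf")  # parameters outside the admissible region
        total += cases * math.log(mu) - mu - math.lgamma(cases + 1)
    return total

# Hypothetical values: baseline rate 1e-5 per person-year, beta = 6.
alpha = math.log(1e-5)
strata = [(1, 1e5, 0.0), (7, 1e5, 1.0), (13, 1e5, 2.0), (25, 1e5, 4.0)]
loglik_at_beta_6 = poisson_loglik(alpha, 6.0, strata)
```

When the baseline rate exp(α) is near zero, β must grow very large to reproduce the same excess rate, which is the instability the text describes.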
Analyses were conducted as follows:
1. Attributable cases only
2. Attributable cases only, analyzed with fixed intercept
3. With added nonattributable cases
4. With added nonattributable cases, analyzed with intercept fixed at known baseline risk (number of baseline cases/person-years of observation)
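As a rough illustration of the fixed-intercept analyses, the sketch below sets α to the log of the known baseline rate (baseline cases/person-years) and then estimates β alone by solving the Poisson score equation with bisection. The function names, the restriction to β ≥ 0, the bisection solver, and the example data are all assumptions made for illustration. They are not the authors' procedure.

```python
import math

def fixed_intercept(baseline_cases, person_years):
    """Intercept fixed at the known baseline risk: log(cases / person-years)."""
    return math.log(baseline_cases / person_years)

def score_beta(beta, alpha, strata):
    """Derivative of the Poisson log-likelihood with respect to beta, alpha held fixed."""
    baseline_rate = math.exp(alpha)
    total = 0.0
    for cases, person_years, cum_x in strata:
        total += cases * cum_x / (1.0 + beta * cum_x) - baseline_rate * person_years * cum_x
    return total

def fit_beta_fixed_intercept(alpha, strata, tol=1e-10):
    """Maximum-likelihood beta on [0, inf) with alpha fixed (assumed constraint)."""
    if score_beta(0.0, alpha, strata) <= 0.0:
        return 0.0  # likelihood is not increasing at zero; estimate sits on the boundary
    lo, hi = 0.0, 1.0
    for _ in range(200):  # expand the bracket until the score changes sign
        if score_beta(hi, alpha, strata) <= 0.0:
            break
        lo, hi = hi, hi * 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if score_beta(mid, alpha, strata) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: (cases, person_years, cum_x), built so that beta = 6 fits exactly.
strata = [(1, 1e5, 0.0), (7, 1e5, 1.0), (13, 1e5, 2.0), (25, 1e5, 4.0)]
alpha = fixed_intercept(baseline_cases=1, person_years=1e5)
beta_hat = fit_beta_fixed_intercept(alpha, strata)  # close to 6 for this constructed data
```

With α held fixed the log-likelihood is concave in β, so the score has at most one root on the search interval and bisection is sufficient for a sketch like this.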
With the standard model, the excess rate ratio coefficient, β, varied widely across 1000 populations: mean = 13.4 (SD = 94.5) and range = 0.1-2834. With a constrained intercept, the mean was 5.9 (SD = 0.76) and the range was 3.7-8.8. The mean of log(excess rate ratio coefficient) was 1.54 (SD = 1.3) with the standard model versus 1.77 (SD = 0.13) with fixed intercept.
In 100 simulated populations, each with 100 iterations of added baseline cases, estimates of the excess rate ratio coefficient were much less variable than with standard models, especially with the intercept fixed at the known baseline risk. The mean excess rate coefficient was now close to nominal with or without the fixed intercept (0.00005964 and 0.00006009, respectively). When the average squared deviation of the estimated excess rate coefficient was calculated within each set of 100 baseline iterations, the mean of those averages across the 100 simulated populations with intercepts fixed (0.45 × 10⁻¹⁰) was comparable to that without baseline enhancements but with fixed intercepts (0.59 × 10⁻¹⁰).
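The two-level summary described above (average the squared deviation within each population's set of baseline iterations, then take the mean of those averages across populations) can be sketched as follows. The function names and the toy estimates are hypothetical. Values are expressed on the ×10⁵ scale, so the nominal excess rate coefficient is 6.0.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def within_set_squared_deviation(estimates, nominal):
    """Average squared deviation from the nominal value within one population's set."""
    return mean([(estimate - nominal) ** 2 for estimate in estimates])

def summarize_squared_deviation(estimates_by_population, nominal):
    """Return the per-population averages and their mean across populations."""
    per_population = [
        within_set_squared_deviation(estimates, nominal)
        for estimates in estimates_by_population
    ]
    return per_population, mean(per_population)

# Toy example: two populations, two baseline iterations each (hypothetical numbers).
toy_estimates = [[5.0, 7.0], [6.0, 6.0]]
per_population, overall = summarize_squared_deviation(toy_estimates, nominal=6.0)
```

Because the deviations are computed on the ×10⁵ scale here, the resulting squared deviations are on the ×10¹⁰ scale used in the summary table.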
Simulations with small populations (n = 50) demonstrated greater bias.
Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article.
Matthew Wheeler assisted with the R programming and A. John Bailer provided statistical advice. This work benefited from comments from David Umbach, Sally Thurston, Ellen Eisen, and Randall J. Smith.
Summary Comparisons of Estimation Performance With and Without Fixed Intercept or Random Baseline for Large and Small Population Simulations
| Model | Large: Mean | Large: (SD) | Small: Mean | Small: (SD) |
|---|---|---|---|---|
| Log(excess rate ratio coefficient), log(β) | | | | |
| Estimated intercept/no baseline (n = 1,000) | 1.54 | (1.34) | 0.63 | (1.86) |
| Fixed intercept/no baseline (n = 1,000) | 1.77 | (0.13) | 1.69 | (0.44) |
| Excess rate coefficient, [exp(α)] × β (×10⁵), nominal value = 6.000 | | | | |
| Estimated intercept/no baseline (n = 1,000) | 5.095 | (0.90) | 4.082 | (2.19) |
| Fixed intercept/no baseline (n = 1,000) | 5.981 | (0.77) | 5.989 | (2.40) |
| Fixed intercept/random baseline, avg (n = 100)ᵃ | 5.964 | (0.67) | 6.278 | (2.71) |
| Squared deviation: (excess rate coefficient − 6.0 × 10⁻⁵)² (×10¹⁰) | | | | |
| Estimated intercept/no baseline (n = 1,000) | 1.62 | (1.77) | 8.48 | (9.63) |
| Fixed intercept/no baseline (n = 1,000) | 0.59 | (0.87) | 5.76 | (9.33) |
| Fixed intercept/random baseline, avg (n = 100)ᵃ | 0.45 | (0.46) | 7.34 | (14.5) |
SD indicates standard deviation.
ᵃ Based on 100 iterations of study population, each analyzed with 100 random baselines; average for each study population across the set of its 100 random baselines.