The University of Wisconsin Population Health Institute has published
The Institute uses federal and other national data sources that provide county estimates of health-related measures. Measures from federal data sources are commonly censored if the number of events or the sample size is small, because of privacy concerns; the reported estimates are often imprecise (
To address these questions, we used hierarchical Bayesian models for the health outcomes data in the 2010
This work builds on previous efforts using multilevel models to generate small-area estimates (
The Institute analyzes data for more than 25 health-related measures in
Rather than report an individual rank for each measure,
| Measure | Weight |
|---|---|
| Premature mortality | 50% |
| % Reporting fair or poor health | 10% |
| Mean no. of poor physical health days per month | 10% |
| Mean no. of poor mental health days per month | 10% |
| % Live births with low birth weight (<2,500 g) | 20% |
The weights represent the relative contribution of each measure to the composite health outcomes score.
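As a rough illustration of how these weights could combine the 5 measures into one composite score, the sketch below standardizes each measure across counties and takes the weighted sum. The z-score standardization, the dictionary keys, and the function name `composite_scores` are assumptions made for illustration only. Higher values indicate worse health for all 5 measures, so no sign changes are needed.

```python
import numpy as np

# Weights from the table above, expressed as fractions of the composite score.
WEIGHTS = {
    "premature_mortality": 0.50,
    "fair_poor_health": 0.10,
    "poor_physical_days": 0.10,
    "poor_mental_days": 0.10,
    "low_birth_weight": 0.20,
}

def composite_scores(measures):
    """Return one composite score per county.

    measures: dict mapping each key in WEIGHTS to a 1-D sequence of county
    values (same county order for every measure).
    Assumption: each measure is standardized to a z-score before weighting.
    """
    total = None
    for name, weight in WEIGHTS.items():
        values = np.asarray(measures[name], dtype=float)
        z = (values - values.mean()) / values.std()
        total = weight * z if total is None else total + weight * z
    return total

# Hypothetical values for 3 counties (not real data).
example = {name: [1.0, 2.0, 3.0] for name in WEIGHTS}
scores = composite_scores(example)
```

Because the weights sum to 1, a county that sits exactly 1 standard deviation above the mean on every measure receives a composite score of 1.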
In this article, we restricted our analysis to the 5 health outcome measures. Because measure-based ranks do not align with
The 2010
| Measure | Source | Years |
|---|---|---|
| Premature mortality (years of potential life lost before age 75 per 100,000 population) | National Vital Statistics System, National Center for Health Statistics | 2004–2006 |
| Raw mortality and population counts by age group | Underlying cause of mortality query, CDC Wide-ranging Online Data for Epidemiologic Research ( | |
| % Reporting fair or poor health | Behavioral Risk Factor Surveillance System | 2002–2008 |
| Mean no. of poor physical health days per month | Behavioral Risk Factor Surveillance System | 2002–2008 |
| Mean no. of poor mental health days per month | Behavioral Risk Factor Surveillance System | 2002–2008 |
| % Live births of babies with low birth weight (<2,500 g) | National Vital Statistics System, National Center for Health Statistics | 2000–2006 |
National Vital Statistics System (
Behavioral Risk Factor Surveillance System (
County data on race/ethnicity, sex, and age for 2008 were accessed through the US Census Population Estimates program (
The percentage of counties with missing health outcome data ranged from 3.1% to 13.7%. Rural counties accounted for the majority of counties with missing data (72%–92%). Values for vital statistics measures were suppressed if based on 5 or fewer events. BRFSS censored values for counties with fewer than 50 respondents or a 95% confidence interval width greater than 20%.
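The suppression rules described above can be written as two simple checks. This is a minimal sketch; the function names are hypothetical, and the confidence interval width is assumed to be expressed in percentage points.

```python
def vital_stat_suppressed(events):
    """Vital statistics values are suppressed if based on 5 or fewer events."""
    return events <= 5

def brfss_censored(respondents, ci_lower_pct, ci_upper_pct):
    """BRFSS values are censored for counties with fewer than 50 respondents
    or a 95% confidence interval wider than 20 (assumed percentage points)."""
    return respondents < 50 or (ci_upper_pct - ci_lower_pct) > 20.0
```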
For each health outcome measure, we fit 2 generalized linear mixed-effects models with state- and county-level random effects and either an intercept only (model 1) or an intercept plus county-level demographic covariates as fixed effects (model 2). We used the demographic variables to inform estimates, not to adjust for county differences. The Poisson model specifications are below:
For
The mortality rates underlying premature mortality followed a Poisson distribution; data for each age group included number of events and population denominator.
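One way to write a Poisson mixed-effects model of this kind is sketched here. The notation is assumed for illustration and is not taken from the original specification: $i$ indexes states, $j$ counties, and $k$ age groups; $y_{ijk}$ is the number of deaths and $n_{ijk}$ the population denominator. The age-specific intercept $\alpha_k$ is also an assumption.

$$
\begin{aligned}
y_{ijk} &\sim \mathrm{Poisson}\left(n_{ijk}\,\lambda_{ijk}\right),\\
\log \lambda_{ijk} &= \alpha_k + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + v_{ij},\\
u_i &\sim N\left(0, \sigma_u^2\right), \qquad v_{ij} \sim N\left(0, \sigma_v^2\right).
\end{aligned}
$$

Under this sketch, model 1 omits the covariate term $\mathbf{x}_{ij}^{\top}\boldsymbol{\beta}$, and model 2 includes it with the county-level demographic categories as fixed effects; $u_i$ and $v_{ij}$ are the state- and county-level random effects.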
We used a binomial distribution to model low birth-weight births and self-reported health. Low birth-weight data included a census of all live births and the number of births of babies with low birth weight. Self-reported health data included a point estimate and 95% confidence limits. The implied standard errors given by the confidence limits and the reported prevalence were used to obtain the effective numerator and denominator for each county.
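The conversion from a reported prevalence and confidence limits to an effective numerator and denominator can be sketched as follows. The sketch assumes a symmetric Wald-type 95% interval, so that the standard error is the interval width divided by 2 × 1.96; the function name `effective_counts` is hypothetical.

```python
def effective_counts(prevalence, ci_lower, ci_upper, z=1.96):
    """Return (effective numerator, effective denominator) for a binomial model.

    prevalence, ci_lower, ci_upper: proportions between 0 and 1.
    Assumption: the limits are a symmetric Wald interval, so
    SE = (upper - lower) / (2 * z) and SE^2 = p * (1 - p) / n_eff.
    """
    se = (ci_upper - ci_lower) / (2.0 * z)
    n_eff = prevalence * (1.0 - prevalence) / se ** 2
    x_eff = prevalence * n_eff
    return x_eff, n_eff

# Hypothetical county: 20% prevalence with 95% limits of 16.08% and 23.92%.
x_eff, n_eff = effective_counts(0.20, 0.1608, 0.2392)
```

In this hypothetical example the implied standard error is 0.02, which corresponds to an effective sample of about 400 respondents and about 80 effective events.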
The measures of poor physical and poor mental health days approximated a Gaussian distribution but were log-transformed to control for over-dispersion. Reported data for these measures included mean value by county, the number of respondents, and 95% confidence limits. We used the inverse of county-specific variances, obtained from the confidence limits, as weights to account for sampling error.
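A minimal sketch of the inverse-variance weighting is shown below. It assumes symmetric 95% limits and, when a county mean is supplied, uses the delta method to approximate the variance on the log scale; whether the original analysis computed variances before or after the log transformation is not stated, so that step is an assumption. The function name is hypothetical.

```python
def inverse_variance_weight(ci_lower, ci_upper, mean=None, z=1.96):
    """Return 1 / variance implied by 95% confidence limits.

    If `mean` is given, the standard error is converted to the log scale
    with the delta method, SE(log m) ~= SE(m) / m (an assumption).
    """
    se = (ci_upper - ci_lower) / (2.0 * z)
    if mean is not None:
        se = se / mean
    return 1.0 / se ** 2

# Hypothetical county: mean of 3.0 poor health days, limits 2.608 to 3.392.
w_raw = inverse_variance_weight(2.608, 3.392)
w_log = inverse_variance_weight(2.608, 3.392, mean=3.0)
```

Counties with narrow confidence limits receive large weights, so their reported means are shrunk less toward the state and national levels.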
Demographic covariates included categorical variables for percentage African American, Asian, American Indian, and Latino; percentage female; percentage under age 18 years and over age 64 years; and urbanization. We chose not to include as covariates any measures we considered modifiable factors, particularly those that make up the health factors rank, which include variables such as poverty and education. Urbanization was categorized as large metropolitan, small metropolitan, and rural. Race/ethnicity measures were categorized as low (<6%), medium (6%–39%), and high (≥40%) proportion of the population. Covariates for sex and age were divided into 4 categories (
| Measure | Category 1 (Low) | Category 2 | Category 3 | Category 4 (High) |
|---|---|---|---|---|
| % Female | <45% | 45%–50% | 50%–55% | >55% |
| % Younger than 18 | <20% | 20%–23.5% | 23.5%–27% | >27% |
| % Older than 64 | <10% | 10%–15% | 15%–20% | >20% |
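The categorization of the demographic covariates can be sketched as two small functions. The cut points come from the text and table above; the treatment of values that fall exactly on a shared boundary (for example, 50% female) is not specified in the source, so the assignment used here is an assumption, as are the function and constant names.

```python
def race_ethnicity_category(pct):
    """Low (<6%), medium (6% to <40%), high (>=40%)."""
    if pct < 6:
        return "low"
    if pct < 40:
        return "medium"
    return "high"

# Cut points (low, middle, high) taken from the table above.
FEMALE_CUTS = (45.0, 50.0, 55.0)
UNDER_18_CUTS = (20.0, 23.5, 27.0)
OVER_64_CUTS = (10.0, 15.0, 20.0)

def four_level_category(pct, cuts):
    """Return category 1 (low) through 4 (high).

    Assumption: the middle cut point is assigned to category 3; the lowest
    and highest cut points follow the table (<low is 1, >high is 4).
    """
    low, middle, high = cuts
    if pct < low:
        return 1
    if pct < middle:
        return 2
    if pct <= high:
        return 3
    return 4
```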
Model parameters were estimated by maximum likelihood. Empirical Bayes estimates of the random effects were obtained by conditioning on the estimated variance parameters. Samples of the regression coefficients and state- and county-level random effects were drawn from a multivariate normal approximation to their joint-posterior distributions. Posterior samples of the county-specific estimates were obtained from the sampled vectors. To generate composite scores for ranking, posterior samples for the individual measures were transformed to (national)
We used random-number generation procedures for binomial, Poisson, and normal distributions to produce posterior-predictive data sets based on the posterior samples and the reported population data for each measure. Similarities among parameters calculated from the posterior-predictive data sets and the original data were quantified by posterior predictive
Posterior samples of county ranks were obtained by ranking the county-specific
The models for self-reported health, low birth-weight births, and poor mental health days were best able to replicate the IQRs of the original data. The models for poor physical health days and premature mortality fit less well: the observed IQRs fell within the distribution of IQRs calculated from posterior-predictive samples less than 5% of the time. These problems with model fit make our posterior rank estimates less reliable indicators of county performance for these 2 measures.
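A posterior-predictive check of this kind can be sketched as follows. The sketch compares the IQR of the observed county values with the IQRs of replicated data sets and reports the share of replicates whose IQR is at least as large as the observed one. That definition is a common convention and is assumed here; it may differ from the exact statistic used in the analysis, and the function names are hypothetical.

```python
import numpy as np

def iqr(values):
    """Interquartile range of a 1-D array."""
    q75, q25 = np.percentile(values, [75, 25])
    return q75 - q25

def posterior_predictive_p(observed, replicates, statistic=iqr):
    """Share of replicated data sets whose statistic is >= the observed one.

    observed: 1-D array of county values.
    replicates: 2-D array, one posterior-predictive data set per row.
    """
    obs_stat = statistic(np.asarray(observed, dtype=float))
    rep_stats = np.array([statistic(row) for row in np.asarray(replicates, dtype=float)])
    return float(np.mean(rep_stats >= obs_stat))

# Hypothetical data: 5 counties and 3 replicated data sets.
observed = [1, 2, 3, 4, 5]
replicates = [[1, 2, 3, 4, 5], [0, 2, 4, 6, 8], [2, 2, 2, 2, 2]]
p_value = posterior_predictive_p(observed, replicates)
```

Values close to 0 or 1 indicate that the model rarely reproduces the spread seen in the observed data, which is the pattern described above for poor physical health days and premature mortality.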
Using data from all US counties allowed us to explore national rank performance. We observed significant clustering by state; this clustering, along with wide 90% credible intervals, made in-state credible intervals impractical and inconsistent with our models. When we used posterior samples from the empty (intercept-only) models, the mean width of the 90% credible intervals was 565 ranks (18 percentile ranks) for health outcomes. Adding demographic covariates to the hierarchical models increased rank precision only marginally: the mean width of the credible intervals decreased by 2.7 percentile ranks.
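The rank intervals discussed above can be computed from posterior samples of the county scores along the following lines. This is a sketch under stated assumptions: `score_samples` is a hypothetical array with one row per posterior draw and one column per county, and rank 1 is assigned to the lowest (healthiest) score.

```python
import numpy as np

def rank_samples(score_samples):
    """Rank counties within each posterior draw (rank 1 = lowest score)."""
    score_samples = np.asarray(score_samples, dtype=float)
    order = np.argsort(score_samples, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(score_samples.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, score_samples.shape[1] + 1)
    return ranks

def rank_interval(ranks, level=0.90):
    """Equal-tailed credible interval of each county's rank."""
    tail = (1.0 - level) / 2.0 * 100.0
    lower, upper = np.percentile(ranks, [tail, 100.0 - tail], axis=0)
    return lower, upper

# Hypothetical posterior draws for 3 counties (not real data).
draws = np.array([[0.1, 0.5, 0.9],
                  [0.6, 0.4, 0.8],
                  [0.2, 0.3, 0.7]])
ranks = rank_samples(draws)
lower, upper = rank_interval(ranks)
mean_width = float(np.mean(upper - lower))
```

Dividing `mean_width` by the number of counties and multiplying by 100 expresses the interval width in percentile ranks, the scale used in the comparison of the empty and demographic models.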
The probability of counties ranking in their assigned national quartiles for health outcomes is based on the empty (
Choropleth map of US county rank certainty in composite health outcomes (empty model).
Choropleth map of US county rank certainty in composite health outcomes (demographic model).
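The rank certainty shown in these maps, that is, the probability that a county ranks in its assigned national quartile, could be estimated from posterior rank samples as sketched below. The array `rank_draws` (one row per posterior draw, one column per county) and the function names are hypothetical, and how each county's assigned rank is derived is left to the caller.

```python
import numpy as np

def quartile_of(ranks, n_counties):
    """Map ranks 1..n_counties to quartiles 1..4 using integer arithmetic."""
    ranks = np.asarray(ranks, dtype=int)
    return (4 * ranks - 1) // n_counties + 1

def quartile_certainty(rank_draws, assigned_ranks):
    """Share of posterior draws in which each county falls in its assigned quartile."""
    rank_draws = np.asarray(rank_draws, dtype=int)
    n_counties = rank_draws.shape[1]
    assigned_q = quartile_of(assigned_ranks, n_counties)
    draw_q = quartile_of(rank_draws, n_counties)
    return (draw_q == assigned_q).mean(axis=0)

# Hypothetical example with 4 counties and 2 posterior draws.
rank_draws = [[1, 2, 3, 4],
              [2, 1, 3, 4]]
certainty = quartile_certainty(rank_draws, assigned_ranks=[1, 2, 3, 4])
```

A county whose sampled ranks cross a quartile boundary in many draws receives a low certainty value, which would appear as a low-certainty area on a map of this kind.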
The estimates for health outcome measures used in
Our results are consistent with Hall and Miller’s examination of rank performance, in which a small group of “highly performing” entities tends to remain fixed in rank (
Our primary goal was to use the posterior samples of the county-specific parameters to estimate precision, but these samples can provide other advantages, such as the flexibility to report national performance as well as performance within state. The Institute does not report national ranks for numerous salient reasons, including the importance of focusing media attention on county health in all 50 states. However, reporting national percentile ranks alongside or in place of in-state ranks can provide richer information on how counties perform.
The hierarchical models also allow for the calculation of race-, sex-, and age-adjusted estimates for ranking. In their current configuration,
In this study, we examined cross-sectional data with a limited set of demographic covariates. Furthermore, these demographic covariates were assumed to be national-level fixed effects because, for many states, there are insufficient data to estimate state-level fixed effects. Future work will explore the use of demographic covariates as state-level random effects to better account for the variable association between race/ethnicity and health outcomes by state.
There are several potential extensions of these hierarchical models to improve point and interval estimation of ranks, including the use of more extensive demographic covariates in the cross-sectional model, the use of longitudinal data on a single health outcome, and the use of multiple related outcomes in a single hierarchical model. These expanded models may be particularly useful in addressing the problems with model fit for the poor physical health days and premature mortality measures.
The models provide automatic spatial smoothing of county event rates within states through the inclusion of a state-level random effects component. Although this eliminates the need to specify or estimate a spatial lag parameter, as done elsewhere for states (
Now in its fourth year,
The work represented in this paper was supported by the Robert Wood Johnson Foundation under grant no. 65017, Mobilizing Action Toward Community Health.
The opinions expressed by authors contributing to this journal do not necessarily reflect the opinions of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions.