Background: Heat wave and health warning systems are activated based on forecasts of health-threatening hot weather.
Objective: We estimated heat–mortality associations based on forecast and observed weather data in Detroit, Michigan, and compared the accuracy of forecast products for predicting heat waves.
Methods: We derived and compared apparent temperature (AT) and heat wave days (with heat waves defined as ≥ 2 days of daily mean AT ≥ 95th percentile of warm-season average) from weather observations and six different forecast products. We used Poisson regression with and without adjustment for ozone and/or PM10 (particulate matter with aerodynamic diameter ≤ 10 μm) to estimate and compare associations of daily all-cause mortality with observed and predicted AT and heat wave days.
Results: The 1-day-ahead forecast of a local operational product, Revised Digital Forecast, had about half the number of false positives compared with all other forecasts. On average, controlling for heat waves, days with observed AT = 25.3°C were associated with 3.5% higher mortality (95% CI: –1.6, 8.8%) than days with AT = 8.5°C. Observed heat wave days were associated with 6.2% higher mortality (95% CI: –0.4, 13.2%) than non–heat wave days. The accuracy of predictions varied, but associations between mortality and forecast heat generally tended to overestimate heat effects, whereas associations with forecast heat waves tended to underestimate heat wave effects, relative to associations based on observed weather metrics.
Conclusions: Our findings suggest that incorporating knowledge of local conditions may improve the accuracy of predictions used to activate heat wave and health warning systems.
Citation: Zhang K, Chen YH, Schwartz JD, Rood RB, O’Neill MS. 2014. Using forecast and observed weather data to assess performance of forecast products in identifying heat waves and estimating heat wave effects on mortality. Environ Health Perspect 122:912–918;
Heat waves have been linked to increased risk of mortality, hospital admissions, heat stroke, heat exhaustion, and cardiovascular and respiratory diseases (
We know that forecast data vary in quality for different weather parameters. For example, temperature usually has a more accurate forecast than dew point temperature (DPT) (
This is an important question for the design of HHWS because the use of the forecast parameters with performance most comparable to the observed weather in association with mortality to trigger public health interventions would be preferred.
The objective of this study was to investigate how well forecast models reproduce heat waves seen in the observations from Detroit, Michigan, using one definition of heat wave and the impacts of weather forecast quality on heat–mortality associations. Previous studies on heat–mortality associations in Detroit have reported that hot weather is significantly associated with excess mortality in this city, and heat has a disproportionate burden on people who have diabetes, are less educated, or are black—a disparity that could be explained in part by unequal access to home air conditioning (
To inform our analysis, we followed the suggestion of
Observed weather data. Hourly weather observations at the Detroit Metropolitan Airport monitoring station (station name: Detroit/Metropolitan) were obtained from the National Climate Data Center (
We chose to use AT in this study because AT includes temperature and humidity information in a way similar to the heat index, and it is more easily applied than the heat index [which is limited to certain temperature thresholds (i.e., > 26.7°C) and relative humidity thresholds (i.e., > 40%)].
Forecast weather data. Weather forecast products are generated by postprocessing forecast output from several numerical weather prediction models using statistical approaches or local meteorologists’ judgments.
Model Output Statistics (MOS) products are generated by the National Weather Service’s (NWS) Meteorological Development Laboratory (
The Revised Digital Forecast (RDF) data are an operational forecast product produced by local meteorologists according to the outputs of numerical weather models and MOS models, their judgments, in addition to other information such as weather soundings (R. Pollman, personal communication). Local meteorologists usually make a decision regarding issuing a heat alert by considering many factors, including predictions from numerical forecast models, MOS products, and other forecast products as well as their local knowledge and others (Pollman R, personal communication).
Six different weather forecast products were obtained for the 2002–2006 study period. We first obtained five MOS products to represent these three model types and short-range/long-range forecasts, namely: the GFS-based short-range MOS forecast product (MAV: 6–72 hr in advance for most weather parameters), GFS-based extended-range MOS forecast product (MEX: extended range, 24 and 192 hr), and GFS-based ensemble MOS forecast product (ENS), NGM-based MOS forecast product (FWC), and NAM-based MOS forecast product (MET). In addition, we had access to the RDF product and extracted one archived local forecast data set (station name: KDTX, Detroit/White Lake, MI) from RDF retained by the
Air pollution data. Increases in daily air pollution concentrations have been associated with increases in mortality, and air pollution levels often covary with weather conditions. Therefore, we wanted to include air pollution concentrations in the models. Daily concentrations of ozone (O3) and particulate matter with aerodynamic diameter ≤ 10 μm (PM10) were obtained from the U.S. Environmental Protection Agency’s Aerometric Information Retrieval System (AIRS) monitoring network (
Preparation of the combined data set for analysis. We merged mortality data, weather observations, forecasts, and air pollution data by date. For both observed and forecast data, we defined a heat wave indicator as periods of ≥ 2 consecutive days with daily mean AT ≥ 95th percentile of the observed or predicted summertime distribution (1 May–30 September) determined separately for each year.
We used SAS (version 9.2; SAS Institute Inc., Cary, NC) to extract daily data from forecasts and observations and to calculate biases (predictions minus observed values) on a daily basis and summarized biases using average root mean squared errors.
We employed generalized additive models (GAMs) to model mortality counts as a function of the continuous AT metrics and indicator variables representing heat waves. This was done across weather observations and forecasts in the warm-season (1 May–30 September) study period. GAMs are commonly used in air quality and air pollution/heat epidemiology studies where seasonal patterns in outcome variables (e.g., mortality) and nonlinear associations between health outcomes and, say, temperature, require additional modeling flexibility (
Separate GAMs were fit to estimate associations between daily mortality counts (the outcome variable) and daily average AT metrics derived from weather observations and from the different forecast products with three lag periods. We simultaneously included daily AT (as a continuous variable), to estimate the effect of heat, and an indicator variable for heat wave days (HW
Log [
where
We modeled AT–mortality associations and summarized heat effects by estimating the percentage difference in mortality associated with a given AT change. To facilitate comparisons across observations and forecasts, we used the GAMs described above to estimate the percentage difference in mortality on days with observed or predicted daily mean AT averaged over the same day and the previous day (ATlag01) of 25.3°C compared with 8.5°C for all data sets (observed and forecast). These temperature references represent the 90th and 50th percentiles of the observed daily mean AT distributions during 2002–2006, and they are consistent with the percentiles used by
We conducted sensitivity analyses to examine whether estimated heat and heat wave effects differed when adjusted for O3 (on the same day), PM10 (on the previous day), or O3 and PM10 concentrations. We modeled both air pollutants using splines with 4 df, and selected the lag periods previously reported to have the strongest associations with mortality (
Daily mean biases (average RMSEs) for TMP, DPT, and AT relative to observed values.
| Forecast product | 1-Day | 2-Day | 3-Day | ||||||
|---|---|---|---|---|---|---|---|---|---|
| TMP | DPT | AT | TMP | DPT | AT | TMP | DPT | AT | |
| Abbreviations: AT, apparent temperature; 1-day, forecast 1 day in advance; 2-day, forecast 2 days in advance; 3-day, forecast 3 days in advance; DPT, dew point temperature; TMP, forecast temperature. | |||||||||
| FWC | 1.35 | 1.49 | 2.22 | 1.61 | 1.76 | 2.37 | 2.53 | 2.26 | 2.89 |
| MAV | 1.18 | 1.44 | 2.18 | 1.33 | 1.68 | 2.21 | 1.83 | 1.92 | 2.45 |
| MET | 1.20 | 1.65 | 2.07 | 1.41 | 1.81 | 2.15 | 2.79 | 2.12 | 3.14 |
| MEX | 2.33 | 2.26 | 2.80 | 1.66 | 1.78 | 2.45 | 1.91 | 2.04 | 2.66 |
| ENS | 2.21 | 2.18 | 2.69 | 1.66 | 1.76 | 2.44 | 2.02 | 2.08 | 2.72 |
| RDF | 1.49 | 1.58 | 2.11 | 1.58 | 1.75 | 2.30 | 2.61 | 2.33 | 3.31 |
Heat wave days predicted by forecast products 1-, 2-, or 3-days in advance compared with heat wave days defined based on observed data, Detroit summers (1 May–30 September) 2002–2006.
| Observation and forecast products | 1-Day | 2-Day | 3-Day | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total | False positive | False negative | Correct | Total | False positive | False negative | Correct | Total | False positive | False negative | Correct | |
| Abbreviations: 1-day, forecast 1 day in advance; 2-day, forecast 2 days in advance; 3-day, forecast 3 days in advance. | ||||||||||||
| Observed | 14 | 14 | 14 | |||||||||
| FWC | 22 | 12 | 4 | 10 | 22 | 12 | 4 | 10 | 19 | 10 | 5 | 9 |
| MAV | 17 | 9 | 6 | 8 | 19 | 7 | 2 | 12 | 16 | 5 | 3 | 11 |
| MET | 21 | 11 | 4 | 10 | 20 | 10 | 4 | 10 | 20 | 13 | 7 | 7 |
| RDF | 13 | 5 | 6 | 8 | 22 | 11 | 3 | 11 | 11 | 8 | 11 | 3 |
Estimated percentage difference in mortality associated with observed and forecast ATlag01 of 25.3°C compared with 8.5°C during the summertime (May–September) in Detroit, 2002–2006, with and without adjustment for air pollution. All models were adjusted for heat wave days, day of the week, day of the year, and calendar year. (
In general, patterns of estimated changes in mortality associated with observed or predicted ATlag01 = 25.3°C versus 8.5°C were similar after adjustment for ambient air pollution (
Estimated percentage difference in mortality associated with observed and forecast heat wave days compared with non–heat wave days during the summertime (May to September) in Detroit, 2002–2006, with and without adjustment for air pollution. All models were adjusted for ATlag01, day of the week, day of the year, and calendar year. (
The present study addresses an epidemiologic question with potentially significant implications for the design of HHWS and the projection of future mortality risks attributable to heat and heat waves by conducting a case study in Detroit. The question was how does the performance of weather forecasts affect heat/heat wave–mortality associations and the likelihood of triggering an alert. Previous work has not examined this question systematically, which is an important omission given the increased trend of temperature and the growing frequency and severity of heat waves in a warm climate.
We explored six weather forecast products for Detroit derived from different algorithms that postprocess outputs from several numerical forecast models. Two of these, MEX and ENS, had error characteristics that suggested that they should not be used. We compared estimated associations of mortality with heat and heat wave days predicted using the four remaining forecast products to associations between mortality and observed heat and heat wave days. Our results suggest that, although the local operational forecast (i.e., RDF) was not always the most accurate in terms of biases compared with weather observations, it generally produced the estimates of heat and heat wave effects closer to associations with observed data than other forecast products. In addition, it produced far fewer false-positive calls, and similar numbers of false-negative calls, for heat waves. The estimated heat and heat waves effects varied with the forecast product and issuing time frames. The choice of forecast product could play a critical role in operating an HHWS more effectively.
Among the calculated weather metrics, AT was predicted with the largest bias, regardless of the forecast product used, followed by DPT and temperature. Not surprisingly, the accuracy in forecasting DPT was lower than that for temperature. Temperature is the more robustly forecast and spatially representative observation, and DPT is commonly thought to be more difficult to predict compared with temperature because it is largely affected by local land features such as lakes and rivers. AT is calculated from two directly forecasted variables (temperature and DPT), and thus it results in larger errors than either temperature or DPT. MEX and ENS had larger bias in temperature for a 1-day-ahead forecast than 2- and 3-day-ahead forecasts, possibly because large initialization errors in the forecasting systems were not much reduced at the 1-day time span.
The 1-day RDF forecast predicted the fewest false-positive heat wave days and an intermediate number of false-negative heat wave days compared with the other products. Although 1-day FWC and MET forecasts correctly predicted 2 more heat wave days than RDF, these products also predicted 6 or 7 more false-positive heat wave days. This finding suggests that the 1-day RDF forecast is an overall better product than others to use in order to issue heat alerts because the number of “wrong” alerts would be largely reduced when observed weather does not meet the heat alert criteria. This has important implications in risk communication because people do not trust heat warning systems if a system issues too many alerts. In addition, the 1-day RDF forecast had 8 correctly identified heat wave days compared with the observations that were similar to 8 to 10 days correctly predicted by using other forecast products. This suggests that using the RDF forecast product reduced the number of false-positive days with the cost of only a relatively lower number of correctly identified days. Although 2- and 3-day MAV forecasts predicted fewer false-negative heat wave days and correctly predicted more true-positive heat wave days than all other forecast products, local meteorologists pay more attention to the 1-day-ahead forecast and use all MOS products, as well as other forecast products, in their decision-making process for issuing a heat alert. Finally, this comparison across forecasts and weather observations highlighted the challenge in predicting weather extremes because all forecast models and postprocessing adjustments are designed for estimating the averages of temperature and other weather conditions (
Minor-to-significant differences of the estimated heat effects between forecasts and observations were observed across forecast products and time frames. Most of the forecast products overestimated associations between heat and mortality when compared with associations based on observed heat. The forecasts with the smallest biases may not necessarily result in the closest estimates of heat effects to those derived from the observations, and 1-day-ahead forecasts did not always result in estimated heat effects closer to those from observations than 2- or 3-day-ahead forecasts, suggesting that more information (e.g., bias propagation in forecast systems) is needed to better understand these relationships.
Compared with heat effects, associations between predicted heat wave days and mortality were smaller than associations with observed heat wave days and mortality. Forecast–mortality models captured heat wave effects less well than heat effects because forecast models perform worse in predicting extreme weather conditions. This suggests that we might expect more uncertainties in heat wave alerts issued by HHWSs compared with heat advisories. Heat wave alerts are more severe than heat advisories. In addition, although the forecasts made 1 day in advance are expected to be closer to the observations than other forecasts, their estimated associations with heat waves were not closer to the associations with the DTW observations than those estimates from other forecast periods, possibly because of the error propagation reason mentioned earlier. Overall, the comparison of heat/heat wave effects between observations and forecast demonstrated the importance of choice of weather forecast product in designing a HHWS.
Heat effects derived from both observations and forecasts were attenuated when models included O3, or PM10, or both, whereas estimated heat waves effects from forecasts and observations were similar. A recent national heat effects analysis for 107 U.S. communities reported that associations between heat and mortality slightly decreased with pollution adjustment in temperature-mortality models (daily mean temperature) (
The comparative analysis of weather forecasts presented here points out the challenge in issuing heat alerts based purely on numerical or statistical forecast models given that forecast quality varies with forecast products and issuing time frames. Heat alerts can activate public health interventions, and decisions on issuing them can depend not only on forecasts but also on NWS officials’ awareness of place-specific conditions (e.g., holidays, parades, fairs, major conferences in the area, health status of resident populations, and perhaps air quality forecasts in the future). Thus, local knowledge in both weather and population health, and cooperation among the meteorological, health, and social service sectors, not just forecasted conditions, are critical input in issuing heat warnings and alerts.
This study has a few limitations. First, our findings may be not generalizable to other cities, and further evaluation with data from more cities is needed. Second, some parameter specifications in our data analysis are subjective primarily because no consensus exists on the definition of heat waves and quantification of heat effects, for example, percentile-based heat wave definitions in summer months are different from the heat index used by NWS and may not capture all heat wave days. Duration was not considered in the analysis because of the small number of heat waves in the study period. Third, further work is needed to translate our findings into forecasters’ practice to support their decision making—and potentially improved triggers of heat alerts—because this is beyond the scope of this paper. Local forecasters make forecasts and heat alerts based on multiple forecast products as well as their judgment based on their local knowledge of historical weather and other factors mentioned above. Fourth, our findings on the impact of forecasts on heat–mortality associations are based on long time series of forecasted weather conditions, on average. In practice, forecasters look at forecasts a few days in advance and weather observations in just the previous days, so additional analyses, beyond the scope of this paper, would be required to evaluate the application of forecast data as heat alert triggers. One obstacle to this type of analysis is the difficulty in acquiring clean and reliable historical data sets showing which days were declared heat alerts or heat advisories by the NWS. We attempted to extract such information from archived NWS warnings, watches, and advisories, but we were stymied by the lack of uniformity in these documents. We believe that the development of improved heat alert triggers should account for forecast quality as well as many factors as discussed above. Choosing potentially improved triggers is a challenge because of the lack of consensus on the evaluation criteria as well as on the definitions of heat wave days. Potential criteria include which health outcomes to use as the “sentinel events” (e.g., mortality, hospital admissions, emergency department visits), robustness, false-negative and false-positive rates, effective communication about heat alerts to the public, and, perhaps, economic benefit–cost analysis. A final limitation of our analysis was that we used a mortality data set from which external causes had been previously excluded, thus impairing our ability to examine causes of death such as overdoses and intentional self-harm that may plausibly be linked to extremely hot weather. Future analyses could address questions regarding cause of death.
Examining the impacts of weather forecasts on heat/heat wave–mortality associations and the performance of various forecast products in predicting heat waves is important for designing HHWSs and improving projection of heat-related health risks. Our analysis demonstrates the challenge in predicting health effects of weather extremes based on numerical and statistical forecast models. Forecasts showed higher associations between heat and mortality, and lower associations between heat waves and mortality, than observed weather. The impacts of weather forecast quality on mortality risk depended on forecast product and forecast time frames. Heat and heat wave effects derived from a local operational forecast product were generally closer to those calculated based on observations than those based on other forecast products. Our findings provide insights into issuing heat alerts and suggest that local knowledge on weather and population health are critical factors in HHWSs.
We thank Z. Zhang for assistance in data preparation.
This research was funded through support of the Graham Environmental Sustainability Institute and the Risk Sciences Center at the University of Michigan, the U.S. Environmental Protection Agency Science to Achieve Results (grant R832752010), the Centers for Disease Control and Prevention (grant R18 EH 000348), and the National Institute of Environmental Health Sciences (NIEHS; grants ES-016932 and ES-020695). J.D.S. was supported by NIEHS (grant R21-ES020695).
This paper does not necessarily reflect the views of the supporting organizations.
The authors declare they have no actual or potential competing financial interests.