We describe a method for comparing the ability of different alert threshold algorithms to detect malaria epidemics and use it with a dataset consisting of weekly malaria cases collected from health facilities in 10 districts of Ethiopia from 1990 to 2000. Four types of alert threshold algorithms are compared: weekly percentile, weekly mean with standard deviation (simple, moving average, and log-transformed case numbers), slide positivity proportion, and slope of weekly cases on log scale. To compare dissimilar alert types on a single scale, a curve was plotted for each type of alert, which showed potentially prevented cases versus number of alerts triggered over 10 years. Simple weekly percentile cutoffs appear to be as good as more complex algorithms for detecting malaria epidemics in Ethiopia. The comparative method developed here may be useful for testing other proposed alert thresholds and for application in other populations.

Accurate, well-validated systems to predict unusual increases in malaria cases are needed to enable timely action by public health officials to control such epidemics and mitigate their impact on human health. Such systems are particularly needed in epidemic-prone regions, such as the East African highlands. In such places, transmission is typically highly seasonal, with considerable variation from year to year, and immunity in the population is often incomplete. Consequently, epidemics, when they occur, often cause high illness and death rates, even in adults (

A number of such systems have been proposed or implemented, but the comparative utility of these systems for applied public health purposes has not been rigorously established. For example, the World Health Organization has advocated the use of alerts when weekly cases exceed the 75th percentile of cases from the same week in previous years (

Another approach, known as early warning, attempts to predict epidemics before unusual transmission activity begins, usually by the use of local weather or global climatic variables that are predictors of vector abundance and efficiency, and therefore of transmission potential (

We describe a method for evaluating the public health value of a system to detect malaria epidemics. We use this method to evaluate several simple early detection systems for their ability to provide timely, sensitive, and specific alerts in a data series of weekly case counts from 10 locations in Ethiopia for approximately 10 years. The fundamental question we address is whether detecting excess cases for 2 weeks in a row, under a variety of working definitions of "excess," can be the basis for a system that anticipates ongoing excess malaria cases in time for action to be taken.

We collected datasets consisting of weekly parasitologically confirmed malaria cases over an average of 10 years from health facilities in 10 districts of Ethiopia (

District | Follow-up (y) | Daily microscopically confirmed cases, mean | SD | Minimum | Maximum |
---|---|---|---|---|---|
Alaba | 11.3 | 39.0 | 27.3 | 0 | 163.0 |
Awasa | 7.7 | 11.3 | 11.0 | 0 | 77.4 |
Bahirdar | 7.3 | 22.1 | 15.2 | 0 | 83.3 |
Debrezeit | 11.2 | 25.3 | 25.8 | 0.9 | 146.7 |
Diredawa | 9.8 | 25.3 | 29.5 | 0.4 | 329.9 |
Hosana | 11.3 | 19.4 | 17.4 | 0.1 | 95.7 |
Jimma | 10.3 | 13.2 | 14.0 | 0.3 | 85.3 |
Nazareth | 9.3 | 17.7 | 16.0 | 0 | 109.3 |
Wolayita | 9.3 | 13.9 | 12.1 | 0 | 113.1 |
Zeway | 8.3 | 22.0 | 17.5 | 1.1 | 102.0 |

We investigated four classes of algorithms for triggering alerts. In each case, an alert was triggered if the defined threshold was exceeded for 2 consecutive weeks. (This choice is intended to improve the specificity of the alert system for any given threshold.) If another alert was triggered within 6 months, it was ignored, on the assumption that intervening after the first alert would prevent another epidemic within the next 6 months. For the historically based thresholds (the weekly percentile and the weekly mean with SD, described below), the thresholds for each year were calculated on the basis of all other years in the dataset for a given health facility, excluding the year under consideration.
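The triggering rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' Stata code: the function name `trigger_alerts` and its parameters are hypothetical, and the 24-week default for the "6 months" refractory period is an assumption.

```python
def trigger_alerts(cases, thresholds, run_length=2, refractory_weeks=24):
    """Return the week indices at which alerts fire.

    An alert fires in the week that completes `run_length` consecutive
    weeks above threshold. Alerts that would fire fewer than
    `refractory_weeks` after the previous alert are ignored.
    The 24-week default is an assumed reading of "6 months".
    """
    alerts = []
    run = 0              # current number of consecutive weeks above threshold
    last_alert = None    # week index of the most recent accepted alert
    for t, (observed, threshold) in enumerate(zip(cases, thresholds)):
        run = run + 1 if observed > threshold else 0
        if run >= run_length:
            if last_alert is None or t - last_alert >= refractory_weeks:
                alerts.append(t)
                last_alert = t
    return alerts


# Toy series with a constant threshold of 3 cases.
cases = [1, 5, 5, 5, 1, 1, 5, 5]
thresholds = [3] * len(cases)
short_gap = trigger_alerts(cases, thresholds, refractory_weeks=4)
long_gap = trigger_alerts(cases, thresholds, refractory_weeks=24)
```

With the short refractory period the toy series produces two alerts; with the 24-week period the second run of high weeks is ignored.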

The threshold was defined as a given percentile of the case numbers obtained in the same week of all years other than the one under consideration. The use of percentile as alert threshold is straightforward, and the method is relatively insensitive to extreme observations.
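A leave-one-year-out percentile threshold of this kind might be computed as follows. This is a sketch under stated assumptions: the data layout (`by_year[j][i]` holds the count for week `i` of year `j`), the function names, and the linear-interpolation percentile rule are all illustrative choices, since the exact percentile definition used in the study is not given here.

```python
def percentile(values, p):
    """p-th percentile (0-100) with linear interpolation between order statistics.

    The interpolation rule is an assumption; statistical packages differ.
    """
    xs = sorted(values)
    if len(xs) == 1:
        return float(xs[0])
    pos = (len(xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def loo_percentile_thresholds(by_year, p):
    """thresholds[j][i] = p-th percentile of week i over all years except j."""
    n_years = len(by_year)
    n_weeks = len(by_year[0])
    thresholds = []
    for j in range(n_years):
        row = []
        for i in range(n_weeks):
            others = [by_year[k][i] for k in range(n_years) if k != j]
            row.append(percentile(others, p))
        thresholds.append(row)
    return thresholds


# Three toy "years" of two weeks each.
by_year = [[10, 1], [20, 2], [30, 3]]
median_thresholds = loo_percentile_thresholds(by_year, 50)
```

Because the year under consideration is excluded, an epidemic year cannot raise its own threshold.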

We defined the threshold as the weekly mean plus a defined number of SDs. Mean and SD were calculated from case counts, smoothed case counts, or log-transformed case counts.
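The mean-plus-SD threshold and its two data transformations could be sketched as below. The function names, the 3-week centered moving average, and the `log(y + 1)` offset (used because some weekly counts are zero) are assumptions for illustration; the source does not state the smoothing width or how zeros were handled.

```python
import math
from statistics import mean, stdev


def moving_average(series, width=3):
    """Centered moving average; the window is truncated at the series ends."""
    half = width // 2
    smoothed = []
    for t in range(len(series)):
        window = series[max(0, t - half): t + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def log_transform(by_year, offset=1.0):
    """Log-transform counts; the offset (an assumption) keeps zeros finite."""
    return [[math.log(y + offset) for y in year] for year in by_year]


def loo_mean_sd_thresholds(by_year, n_sd):
    """thresholds[j][i] = mean + n_sd * SD of week i over all years except j.

    Uses the sample SD, so at least three years of data are required.
    """
    n_years = len(by_year)
    n_weeks = len(by_year[0])
    thresholds = []
    for j in range(n_years):
        row = []
        for i in range(n_weeks):
            others = [by_year[k][i] for k in range(n_years) if k != j]
            row.append(mean(others) + n_sd * stdev(others))
        thresholds.append(row)
    return thresholds


by_year = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [5.0, 50.0]]
thresholds = loo_mean_sd_thresholds(by_year, 1.0)
smoothed = moving_average([1, 2, 3, 4, 5], width=3)
```

When thresholds are built from smoothed or log-transformed data, the observed value being tested has to be transformed the same way before the comparison.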

Some studies have indicated that the proportions of positive slides were significantly higher than the usual rate during epidemics (

We hypothesized that rapid multiplication of the number of normalized cases from week to week might signal onset of an epidemic. To test this hypothesis and the usefulness of detecting such changes as a predictor of epidemics, we defined a set of alert thresholds on the basis of the slope of the natural logarithm of the number of normalized cases. An advantage of the slide positivity and log slope methods over the others is that they can, in principle, be used to construct alert thresholds in the absence of retrospective data.
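As a rough illustration of the log-slope idea, the week-to-week slope is the difference of consecutive log counts, so a slope of ln 2 corresponds to a doubling. The function names and the offset added before taking logs (to handle weeks with zero cases) are assumptions, not part of the source.

```python
import math


def log_slopes(cases, offset=1.0):
    """Week-to-week difference of ln(cases + offset); first week has no slope."""
    logs = [math.log(y + offset) for y in cases]
    return [None] + [logs[t] - logs[t - 1] for t in range(1, len(logs))]


def slope_exceedances(cases, min_slope, offset=1.0):
    """True for each week whose log slope exceeds `min_slope`."""
    return [s is not None and s > min_slope for s in log_slopes(cases, offset)]


# With offset 1 the shifted series is 2, 4, 8, 8: two doublings, then flat.
flags = slope_exceedances([1, 3, 7, 7], min_slope=0.5)
```

No historical data enter this calculation, which is the practical advantage noted above.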

To circumvent the difficulties inherent in defining a "true" epidemic and to compare the properties of these thresholds on a scale that reflects the potential, operational uses of alert thresholds, we evaluated each alert threshold algorithm for the number of alerts triggered and the number of cases that could be anticipated and prevented ("potentially prevented cases") if that alert threshold were in place. Potentially prevented cases (PPC) for each alert were defined as a function of the number of cases in a defined window starting 2 weeks after each alert (to allow for time to implement control measures). The window of effectiveness was assumed to last either 8 or 24 weeks (to account for control measures whose effects are of different durations). Since no control measure would be expected to abrogate malaria cases completely, we considered two possibilities for the number of cases in each week of the window that could be prevented: 1) cases in excess of the seasonal mean and 2) cases in excess of the seasonal mean minus 1 SD. When the observed number of cases in a week is less than the seasonal mean or the seasonal mean minus the SD, PPC is set to a minimum value of zero for that week.
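The PPC definition above can be made concrete with a short sketch. The names are hypothetical, and two details are assumptions: weeks covered by more than one window are counted once, and a window that runs past the end of the series is truncated.

```python
def weekly_excess(cases, baseline):
    """Cases above the baseline in each week, floored at zero."""
    return [max(0.0, y - b) for y, b in zip(cases, baseline)]


def potentially_prevented_cases(cases, baseline, alert_weeks, window=8, lag=2):
    """Sum of excess cases in the window that starts `lag` weeks after each alert.

    `baseline` is the seasonal mean (first definition) or the seasonal
    mean minus 1 SD (second definition), one value per week.
    """
    excess = weekly_excess(cases, baseline)
    covered = set()  # weeks inside at least one window, counted once (assumption)
    for a in alert_weeks:
        covered.update(range(a + lag, min(a + lag + window, len(excess))))
    return sum(excess[t] for t in covered)


def percent_ppc(cases, baseline, alert_weeks, window=8, lag=2):
    """PPC as a percentage of all excess cases in the series."""
    total_excess = sum(weekly_excess(cases, baseline))
    if total_excess == 0:
        return 0.0
    ppc = potentially_prevented_cases(cases, baseline, alert_weeks, window, lag)
    return 100.0 * ppc / total_excess


cases = [5, 5, 9, 9, 9, 5]
baseline = [5] * len(cases)
ppc = potentially_prevented_cases(cases, baseline, alert_weeks=[0], window=2)
pct = percent_ppc(cases, baseline, alert_weeks=[0], window=2)
```

In the toy series an alert in week 0 with a 2-week window captures two of the three excess weeks.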

Method for calculating potentially preventable cases (PPC) by using weekly mean. PPC is obtained from cases in excess of the weekly mean with an 8-week window.

To compare the performance of dissimilar alert types on a single scale, a curve was plotted for each type of algorithm that showed mean percent of PPC (%PPC) over all districts versus average number of alerts triggered per year, with each point representing a particular threshold value. "Better" threshold types and values are those that potentially prevent higher numbers of malaria cases with smaller numbers of alerts.
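One way to assemble such a curve is to average, for each threshold value, the alerts per year and %PPC across districts. The sketch below assumes a hypothetical input layout (a dict from threshold value to per-district pairs) and uses made-up numbers.

```python
from statistics import mean


def performance_curve(results):
    """Return curve points sorted by mean alerts per year.

    `results` maps a threshold value to a list of
    (alerts_per_year, percent_ppc) pairs, one pair per district.
    Each returned point is (mean_alerts, mean_percent_ppc, threshold_value).
    """
    points = []
    for value, per_district in results.items():
        mean_alerts = mean([a for a, _ in per_district])
        mean_ppc = mean([p for _, p in per_district])
        points.append((mean_alerts, mean_ppc, value))
    return sorted(points)


# Hypothetical results for two threshold values in two districts.
results = {
    75: [(1.0, 30.0), (1.2, 34.0)],
    90: [(0.5, 20.0), (0.7, 30.0)],
}
curve = performance_curve(results)
```

Plotting the second element of each point against the first, one curve per algorithm, puts dissimilar alert types on the same pair of axes.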

To evaluate the improvement in timing of alerts provided by each of these algorithms, we calculated PPC for alerts chosen on random weeks during the sampling period. We also made comparisons to two alert-generating policies that could not have been implemented but are in some sense optimal in hindsight. First, we evaluated a policy of triggering one alert each year on the "optimal" week, i.e., the week with the maximum value of PPC. The value of PPC corresponding to the optimal week simulated an "optimally timed" policy of annual interventions; thus, it represents one alert every year. Second, we retrospectively went through data for each site to identify the optimal timing of alerts if one had perfect predictive ability; namely, we compared PPC for a single alert generated on every week of the dataset and chose the optimal week for one alert; then we went through the remaining weeks and chose the optimal week for a second alert, and so on. This system allowed us to plot an upper bound curve for the best choice of alert times, given a defined alert frequency.
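The two benchmarks can be sketched as follows. This is an illustrative reading, not the authors' procedure: the function names are hypothetical, the greedy search counts each week of excess only once, and the random benchmark is approximated as the number of alerts times the average PPC of a single alert placed uniformly at random (ignoring overlap between windows).

```python
def window_sum(excess, t, window, lag):
    """Excess cases in the window starting `lag` weeks after an alert at week t."""
    return sum(excess[t + lag: t + lag + window])


def greedy_optimal_alerts(excess, n_alerts, window=8, lag=2):
    """Pick alert weeks one at a time, each maximizing the additional PPC."""
    remaining = list(excess)          # excess not yet claimed by a chosen alert
    chosen, total = [], 0.0
    for _ in range(min(n_alerts, len(excess))):
        best_week, best_gain = None, -1.0
        for t in range(len(remaining)):
            if t in chosen:
                continue
            gain = window_sum(remaining, t, window, lag)
            if gain > best_gain:
                best_week, best_gain = t, gain
        chosen.append(best_week)
        total += best_gain
        for u in range(best_week + lag, min(best_week + lag + window, len(remaining))):
            remaining[u] = 0.0        # these weeks cannot be counted again
    return chosen, total


def expected_random_ppc(excess, n_alerts, window=8, lag=2):
    """Approximate expected PPC for `n_alerts` alerts on random weeks."""
    per_alert = sum(window_sum(excess, t, window, lag) for t in range(len(excess)))
    return n_alerts * per_alert / len(excess)


excess = [0, 0, 4, 4, 0, 0, 0, 3, 3, 0]
weeks, best_total = greedy_optimal_alerts(excess, n_alerts=2, window=2)
random_one = expected_random_ppc(excess, n_alerts=1, window=2)
```

Repeating the greedy search for 1, 2, 3, ... alerts gives the points of an upper-bound curve against which real algorithms can be compared.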

The dataset consists of a total of 687,903 microscopically confirmed malaria cases from a health facility in each of 10 districts over an average of 10 years. On average, each of the 10 health facilities treated 11–39 malaria cases daily and >300 cases per day during the peak transmission season (

The number of alerts triggered and %PPC obtained for each level of a threshold by type of algorithm varied in the 10 districts (

Percent of potentially preventable cases (PPC) by number of alerts per year for different algorithms. (A) and (B) were obtained from cases in excess of the weekly mean with window of effectiveness of 8 and 24 weeks, respectively. (C) and (D) were obtained from cases in excess of the weekly mean minus one SD for window of 8 and 24 weeks, respectively. The scale of y-axis is higher for (B) and (D) because they are based on 24 weeks of PPC (based on the random alert, the %PPC for the 24-week window is three times that of the 8-week window of effectiveness).

The alert threshold algorithm based on percentile performed as well as or better than the other algorithms over the range of numbers of alerts that we examined. For a given number of alerts triggered, it prevented a greater %PPC than the other methods. Relative to optimally timed alerts, the percentile algorithm performed well, within 10% to 20% of the best achievable performance. The log-slope algorithm performed slightly better than randomly timed alerts but much worse than the other algorithms.

Threshold algorithms defined as the weekly mean plus SDs based on different forms of the data (normalized case counts, smoothed case counts, or log-transformed case counts) performed similarly, except that the algorithms based on the smoothed cases and log-transformed cases triggered fewer alerts at a given threshold value compared to the algorithm based on normalized cases.

For highly specific threshold values (triggering relatively few alerts), the slide positivity proportion showed a lower %PPC than any other algorithm except the log slope. This pattern was reversed at more sensitive threshold values; slide positivity thresholds of <65% showed a higher %PPC than the other threshold methods for a given number of alerts per year.

The annual alert, which corresponds to intervening every year during a fixed optimal week (generally just before the high transmission season), prevented 28.4% of PPC. However, an equivalent %PPC was prevented by the weekly mean and percentile algorithms with only 0.5 alerts per year.

The preceding numbers refer to the weekly mean with 8-week window assessment (

In all alert threshold algorithms, the %PPC rises with increasing number of alerts and then levels off approximately at 0.4 to 0.6 alerts per year. The interrelationship between levels of percentile used, number of alerts triggered, and %PPC is presented in detail to illustrate the factors that would contribute to choosing a cost-effective threshold value.

Number of alerts (No. alerts) and % PPC at six alert threshold levels based on seasonal percentile, by district.

District | 95th: No. alerts | 95th: % PPC | 90th: No. alerts | 90th: % PPC | 85th: No. alerts | 85th: % PPC | 80th: No. alerts | 80th: % PPC | 75th: No. alerts | 75th: % PPC | 70th: No. alerts | 70th: % PPC |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Alaba | 0.44 | 18.6 | 0.53 | 20.1 | 0.62 | 24.5 | 0.62 | 23.4 | 0.8 | 28.8 | 0.97 | 30.1 |
Awasa | 0.55 | 28.1 | 0.65 | 28.1 | 0.65 | 35.6 | 0.91 | 32.9 | 0.91 | 32.9 | 1.0 | 20.2 |
Bahirdar | 0.55 | 27.9 | 0.55 | 27.9 | 0.55 | 27.9 | 0.82 | 37.6 | 0.82 | 38.4 | 0.82 | 38.4 |
Debrezeit | 0.27 | 19.8 | 0.54 | 28.5 | 0.54 | 37 | 0.54 | 36.2 | 0.54 | 36.2 | 0.8 | 39 |
Diredawa | 0.61 | 25.2 | 0.61 | 26.6 | 0.82 | 26.9 | 0.82 | 26.9 | 1.1 | 31 | 1.1 | 31.1 |
Hosana | 0.35 | 25.6 | 0.62 | 32.6 | 0.71 | 33.5 | 0.8 | 34.3 | 0.97 | 28 | 0.97 | 28 |
Jimma | 0.39 | 24.9 | 0.39 | 24.9 | 0.78 | 31.8 | 0.87 | 33.1 | 0.97 | 32.1 | 0.87 | 31.9 |
Nazareth | 0.54 | 33.9 | 0.54 | 33.9 | 0.86 | 34 | 0.86 | 34 | 1.1 | 18.7 | 1.2 | 19.7 |
Wolayita | 0.54 | 24.8 | 0.54 | 24.8 | 0.86 | 30.5 | 0.86 | 30.5 | 0.97 | 29.9 | 1.1 | 29.9 |
Zeway | 0.36 | 30.2 | 0.36 | 30.2 | 0.84 | 37.2 | 0.84 | 37.2 | 0.84 | 36.4 | 0.84 | 28.5 |
Total | 0.46 | 25.9 | 0.53 | 27.8 | 0.72 | 31.9 | 0.79 | 32.6 | 0.9 | 31.2 | 0.97 | 29.7 |
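As a worked check on the table, the short script below recomputes the bottom row from the district rows and asks which percentile levels are worth considering. The values are copied from the table; the variable names and the dominance rule (a level is dropped if another level has no more alerts and no less %PPC) are illustrative choices. To the printed precision, the row labeled "Total" matches the unweighted mean of the 10 districts.

```python
from statistics import mean

# (alerts, % PPC) per district, in table order, for each percentile level.
table = {
    "95th": [(0.44, 18.6), (0.55, 28.1), (0.55, 27.9), (0.27, 19.8), (0.61, 25.2),
             (0.35, 25.6), (0.39, 24.9), (0.54, 33.9), (0.54, 24.8), (0.36, 30.2)],
    "90th": [(0.53, 20.1), (0.65, 28.1), (0.55, 27.9), (0.54, 28.5), (0.61, 26.6),
             (0.62, 32.6), (0.39, 24.9), (0.54, 33.9), (0.54, 24.8), (0.36, 30.2)],
    "85th": [(0.62, 24.5), (0.65, 35.6), (0.55, 27.9), (0.54, 37.0), (0.82, 26.9),
             (0.71, 33.5), (0.78, 31.8), (0.86, 34.0), (0.86, 30.5), (0.84, 37.2)],
    "80th": [(0.62, 23.4), (0.91, 32.9), (0.82, 37.6), (0.54, 36.2), (0.82, 26.9),
             (0.80, 34.3), (0.87, 33.1), (0.86, 34.0), (0.86, 30.5), (0.84, 37.2)],
    "75th": [(0.80, 28.8), (0.91, 32.9), (0.82, 38.4), (0.54, 36.2), (1.10, 31.0),
             (0.97, 28.0), (0.97, 32.1), (1.10, 18.7), (0.97, 29.9), (0.84, 36.4)],
    "70th": [(0.97, 30.1), (1.00, 20.2), (0.82, 38.4), (0.80, 39.0), (1.10, 31.1),
             (0.97, 28.0), (0.87, 31.9), (1.20, 19.7), (1.10, 29.9), (0.84, 28.5)],
}

# Mean alerts and mean % PPC over districts for each level.
means = {
    level: (mean([a for a, _ in rows]), mean([p for _, p in rows]))
    for level, rows in table.items()
}

# Level with the highest mean % PPC.
best_level = max(means, key=lambda level: means[level][1])

# Levels not dominated by another level (fewer or equal alerts AND higher or equal % PPC).
frontier = [
    level for level, (a, p) in means.items()
    if not any(
        other != level and oa <= a and op >= p and (oa < a or op > p)
        for other, (oa, op) in means.items()
    )
]
```

Under this rule the 75th and 70th percentile levels are dominated: averaged over districts, they trigger more alerts than the 80th percentile level without a higher %PPC.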

Percent of potentially preventable cases (PPC) obtained using weekly and monthly data with an 8-week window.

We have described a novel method for evaluating the performance of malaria early detection systems for their ability to trigger alerts of unusually high malaria case numbers with sufficient notice so that control measures can be implemented in time to have an effect on the epidemic. By defining the performance of an algorithm in terms of the potentially prevented cases falling in a given time window after the alerts are generated, we attempted to capture the public health value of an alert system, which is its ability to predict excess malaria cases. Given the same number of alerts triggered by different potential detection algorithms, the objective is to identify an alert threshold algorithm that triggers alerts at the beginning of unusually high transmission periods, on the assumption that such periods are the ones in which interventions are likely to prevent the most cases.

Given the wide variations in malaria transmission, no standard expectation exists about what proportion of cases can be averted with what intervention. With the assumption that the magnitude of the effect of an intervention would be related to the difference between the observed number of cases and size of the long-term seasonal mean and SD, we calculated PPC. In other words, we assumed that an intervention would lower the number of cases towards the underlying seasonal mean or, if very effective, to 1 SD below the underlying mean. The sensitivity of the relative performance of the different algorithms was tested by using different window periods (8 or 24 weeks) of effectiveness of possible intervention methods. These window periods are based on the duration of effects of common interventions, such as insecticide spraying, which have residual activity of 8 to 24 weeks (

At relatively small numbers of alerts triggered, threshold algorithms based on percentile anticipated the highest percentage of the potentially preventable malaria cases of all approaches. The percentile algorithm's good performance relative to the optimally timed alerts indicates that it triggers alerts at the beginning of epidemics rather than in the middle of ongoing epidemics. Given the attractive characteristics of the percentile algorithm, a further question is what percentile level one should use. Beyond 0.4 to 0.6 alerts/year, the %PPC leveled off because most of the peaks with higher numbers of cases, possibly epidemic periods, were detected with fewer alerts by using 85th to 90th percentiles. The leveling off of %PPC occurs because we assume that an alert triggered at week t, which leads to application of intervention measures, will prevent another alert until week t + 24. In practical terms, an intervention initiated after an alert was triggered by a less-specific alert threshold during relatively lower transmission might provide little benefit for a community in reducing malaria transmission, especially if it consumed scarce resources that would then be unavailable during periods of higher transmission.

In situations in which cost is not an issue and yearly application of preventive measures is possible, slide positivity proportion could be recommended. It performed as well as or better than all types of algorithms when all algorithms were set to trigger an average of one alert per year. During malaria epidemics, the slide positivity proportion becomes very high (

Comparative performance of different alert thresholds was insensitive to the length of the window and the choice of function to define potentially prevented cases. This study indicated that using weekly data rather than monthly data, both in constructing thresholds and in follow-up, prevented more cases, consistent with the World Health Organization's recommendations (

A key limitation of our study was that the use of a long-term measure of disease frequency from a retrospective dataset assumes that the long-term trend did not change significantly and that the method of data collection remained the same. Factors such as change of laboratory technician affect the number of slides that are judged positive for malaria parasites. Such changes should be considered, and revising the threshold values frequently with the most recent data and standardized training of laboratory technicians are advisable. Moreover, existing interventions (which may, in some places, have been based in part on algorithms of the sort we considered) could also interfere with the trend. In this analysis, we did not exclude epidemic years from the data since, on the one hand, we do not have a standard definition of malaria epidemics and, on the other hand, all possible data points should be used to calculate measures of disease frequency and scatter to come up with potential threshold levels unless the data points were considered as outliers.

We deliberately chose to evaluate only simple, early detection algorithms, rather than more complex ones that might require climate or weather data or complicated statistical models. In the dataset we considered, the best of these simple algorithms performed quite well relative to the best possible algorithm, which suggests that they may be adequate for many purposes. In principle, the method we propose could easily be applied to evaluate more complex, early warning algorithms and to test whether their added complexity results in substantially better performance. It is an open question whether the same methods would work as well in localities (or for diseases) with different patterns of variation in incidence, for example, in those with less pronounced seasonal peaks in incidence.

In conclusion, we have shown that simple weekly percentile cutoffs appear to perform well for detecting malaria epidemics in Ethiopia. The ability to identify periods with a higher number of malaria cases by using an early detection method will enable the more rational application of malaria control methods. The comparative technique developed in this study may be useful for testing other proposed alert threshold methods and for application in other populations and other diseases.

A team from the Ethiopian National Malaria Control Program visited different health facilities to look for complete epidemiologic data with consistent malaria case definitions. After the field trip, health facilities with relatively high-quality recording and information systems were chosen. Health personnel from each facility selected for the study and staff from the National Malaria Control Program were given training on compiling data on illness and death from the existing patient logs.

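Raw data for each week were normalized to the average number of daily cases in that week.

Each point in the dataset refers to some measure of malaria prevalence in a given health facility ($h$) in a given week. Weeks are indexed either consecutively over the whole series ($t$) or by week and year ($ij$).

The following symbols are used: $Y_{ht}$ (equivalently $Y_{hij}$), the normalized number of cases; $L_{ht}$, the natural logarithm of the number of normalized cases; $\mu_{hij}$ and $\sigma_{Y_{hij}}$, the weekly mean and SD of the normalized cases (written $m_{hij}$ and $s_{Y_{hij}}$ in the PPC formulas); $T_{phij}$, the threshold based on the $p$th percentile; and $\Phi_{hTs}$, the number of alerts triggered by threshold type $T$ at level $s$ in health facility $h$.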

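Threshold is exceeded when $Y_{hij} > T_{phij}$, where $T_{phij}$ is the $p$th percentile of the normalized cases recorded at health facility $h$ in the same week of all years other than the one under consideration.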

Threshold is exceeded when $Y_{hij} > \mu_{hij} + c\,\sigma_{Y_{hij}}$, where $\mu_{hij}$ and $\sigma_{Y_{hij}}$ are the weekly mean and SD and $c$ is the defined number of SDs.

Normalized counts: the number of normalized weekly cases was used to derive the weekly mean and SD.

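Smoothed normalized counts: To improve data smoothness, moving averages of the normalized weekly counts were used to derive the weekly mean and SD.

Log-transformed series: To obtain data with reduced right skew, log-transformed weekly counts were used to derive the weekly mean and SD.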

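Slide positivity proportion: the proportion of slides examined at health facility $h$ in week $t$ that were positive for malaria parasites. The threshold is exceeded when this proportion exceeds the chosen cutoff.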

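We defined a set of alert thresholds based on the slope of the natural logarithm of the number of normalized cases from one week to the next, $L_{ht} - L_{ht-1}$, where $L_{ht}$ is the natural logarithm of the number of normalized cases at health facility $h$ in week $t$.

The threshold is exceeded when this slope exceeds the chosen cutoff.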

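Potentially prevented cases (PPC) for each alert were defined as a function of the cases in a window of $\varphi$ weeks ($\varphi$ = 8 or 24) starting 2 weeks after the alert. Cases in each week of the window were counted in excess of the weekly mean (in calculating $\mathrm{PPC}^{1}$) or the weekly mean minus the SD (in calculating $\mathrm{PPC}^{2}$), with a minimum of zero for each week:

1) $\mathrm{PPC}^{1}_{hTs} = \sum_{a=1}^{\Phi_{hTs}} \sum_{t=t_a+2}^{t_a+\varphi+1} \max\left(0,\; Y_{ht} - m_{ht}\right)$

2) $\mathrm{PPC}^{2}_{hTs} = \sum_{a=1}^{\Phi_{hTs}} \sum_{t=t_a+2}^{t_a+\varphi+1} \max\left(0,\; Y_{ht} - [m_{ht} - s_{Y_{ht}}]\right)$

where $\Phi_{hTs}$ = number of alerts triggered by threshold type $T$ at level $s$ in health facility $h$, $t_a$ = week of the $a$th alert, and $m_{ht}$ and $s_{Y_{ht}}$ = weekly mean and SD of the normalized cases for the week of the year corresponding to $t$.

For each value of each type of threshold at each health facility, the number of potentially prevented cases was transformed into a proportion (percentage), by adding the number of potentially prevented cases for the alerts obtained and dividing this sum by the sum, over all weeks in the dataset, of the number of potentially prevented cases in that week. Let $\%\mathrm{PPC}_{hTs}$ denote the percent of $\mathrm{PPC}_{hTs}$, and let $\%\mathrm{PPC}_{Ts}$ denote the mean of $\%\mathrm{PPC}_{hTs}$ over the different health facilities:

1) $\%\mathrm{PPC}^{1}_{hTs} = 100 \times \mathrm{PPC}^{1}_{hTs} \Big/ \sum_{t} \max\left(0,\; Y_{ht} - m_{ht}\right)$

2) $\%\mathrm{PPC}^{2}_{hTs} = 100 \times \mathrm{PPC}^{2}_{hTs} \Big/ \sum_{t} \max\left(0,\; Y_{ht} - [m_{ht} - s_{Y_{ht}}]\right)$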

(Note: here the t and ij notations are used interchangeably.)

To compare the performance of dissimilar alert types on a single scale, a curve was plotted for each type of algorithm showing mean %PPC vs. average number of alerts triggered per year, with each point representing a particular threshold value.

To calculate the expected PPC for randomly timed alerts, we used the excess cases under excess case definition $k$ ($k$ = 1, 2); $\mathrm{PPC}^{k}_{hr\Phi}$ represents the expected PPC for $\Phi$ randomly chosen alerts in the dataset from health facility $h$.

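To determine the optimal week, we calculated PPC for a policy of triggering an alert automatically during the same week of every year, for each week of the year in turn, and chose the week with the maximum value of PPC.

To calculate the expected PPC for optimally timed alerts, we followed a recursive procedure. First, we searched through all weeks in the dataset and chose the single week on which an alert would have the maximum PPC under a given excess case definition ($k$). We then went through the remaining weeks and chose the optimal week for a second alert, and so on.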

We also compared the efficiency of the weekly percentile method applied to weekly data vs. the same method applied to monthly data. For this purpose, weekly data were converted into monthly data, percentile-based alert thresholds were built from the monthly series, and a similar procedure was followed, except that an alert was triggered when the observed monthly value exceeded the threshold in any single month. For this comparison, we considered PPC formula 1 (cases in excess of the weekly mean), with φ = 8 weeks. A set of computer programs written in Stata to perform the methods presented in this article is available.

We thank the Ministry of Health of Ethiopia for allowing us to access the information, and Andrew Spielman and Christina Mills for comments.

The Fogarty International Center of the National Institutes of Health funded this study (grant number 5D43TW000918). Financial support for data collection was provided by World Health Organization/RBM. The Ellison Medical Foundation gave support to M.L.

Time series of normalized weekly average daily malaria cases for 10 districts. Years are according to the Ethiopian calendar, in which year y begins on September 11 of year y+7 in the Western calendar.

Percent potentially preventable cases (PPC) by number of alerts per year from all districts for each alert threshold algorithm.

Dr. Teklehaimanot worked as a medical director and zonal malaria control program officer in Ethiopia for 4 years. Since 2001, he has been enrolled in the doctoral program in epidemiology at the Harvard School of Public Health. His primary research interest is in early warning of malaria epidemics.