Syndromic surveillance has been widely adopted as a real-time monitoring tool for timely response to disease outbreaks. During the second wave of the pH1N1 pandemic in Fall 2009, two major universities in Washington, DC collected data that were potentially indicative of influenza-like illness (ILI) cases in students and staff. In this study, our objectives were three-fold. The primary goal of this study was to characterize the impact of pH1N1 on the campuses as clearly as possible given the data available and their likely biases. In addition, we sought to evaluate the strengths and weaknesses of the data series themselves, in order to inform these two universities and other institutions of higher education (IHEs) about real-time surveillance systems that are likely to provide the most utility in future outbreaks (at least to the extent that it is possible to generalize from this analysis).
We collected a wide variety of data covering student ILI cases reported to medical and non-medical staff, employee absenteeism, and hygiene supply distribution records (from University A only). Communication data were retrieved from university broadcasts, university preparedness websites, and H1N1-related on-campus media reports. Regional data from the Centers for Disease Control and Prevention Outpatient Influenza-like Illness Surveillance Network (CDC ILINet), American College Health Association (ACHA) pandemic influenza surveillance data, and local Google Flu Trends were used as external data sets. We employed a "triangulation" approach for data analysis in which multiple contemporary data sources are compared to identify time patterns that are likely to reflect biases as well as those that are more likely to be indicative of actual infection rates.
Medical personnel observed an early peak at both universities immediately after school began in early September and a second peak in early November; only the second peak corresponded to patterns in the community at large. Self-reported illness to university deans' offices also increased during mid-term exam weeks. The overall volume of pH1N1-related communication messages similarly peaked twice, corresponding to the two peaks of student ILI cases.
During the 2009 H1N1 pandemic, both University A and B experienced a peak number of ILI cases at the beginning of the Fall term. This pattern, seen in surveillance systems at these universities and to a lesser extent in data from other IHEs, most likely resulted from students bringing the virus back to campus from their home states coupled with a sudden increase in population density in dormitories and lecture halls. Through comparison of data from different syndromic surveillance data streams, paying attention to the likely biases in each over time, we have determined, at least in the case of the pH1N1 pandemic, that student health center data more accurately depicted disease transmission on campus at both universities during the Fall 2009 pandemic than other available data sources.
In the spring of 2009, a novel H1N1 influenza virus, now denoted pH1N1, emerged in North America and spread to the rest of the world in less than two months [
Concerned about the re-emergence of the virus when students returned at the end of the summer, IHEs realized the need for surveillance systems capable of providing real-time situational awareness to guide the implementation of preventive measures to protect students' health and contingency plans to maintain basic educational functions. During the pH1N1 pandemic, the President's Council of Advisors on Science and Technology (PCAST) recommended using syndromic surveillance, which, by using pre-diagnostic data, is thought to have a distinct advantage over the traditional surveillance method in terms of timeliness [
The validity and utility of syndromic surveillance, however, or of particular types of syndromic data, are not well understood [
We collected a wide variety of data that covered student influenza-like illness (ILI) cases reported to the student health center (SHC) and hospital emergency department (ED) visits at both universities, as well as student ILI cases reported to non-medical staff, employee absenteeism, and hygiene supply distribution records from University A. Unless otherwise noted below, all data series were available on a weekly basis. The sources are described in detail in Additional File
University broadcasts, preparedness website updates, and H1N1-related on-campus media reports were retrieved from emails, web pages and paper prints available on campus. H1N1-related messages were classified into five major categories for University A: information about the advice line, presence of flu cases, vaccination, instructions on voluntary reporting to deans, and the availability of personal hygiene supplies. For University B, seven categories were available, including student health center data, hospital emergency department visits, pH1N1 hotline calls, vaccine and personal hygiene supplies, as well as the requirement for a physician's note for excused absences due to illness. All messages were counted based on their appearance in any of the media sources outlined above. In addition, relevant policies were collected and reviewed by interviewing key staff members.
The total number of ILI visits to the student health service and telephone consultations were collected using the following case definition: fever (> 100 F) AND (cough and/or sore throat) in the absence of a known cause other than influenza. Student identification data were reviewed by SHC staff to ensure that individuals were counted only once. Data were available on a daily basis from August 29, 2009 to April 30, 2010, and were aggregated by adding cases in each 7-day period from Saturday to the following Friday.
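The case definition and the Saturday-to-Friday aggregation described above can be illustrated with a minimal Python sketch. The function names, the record fields, and the example counts are hypothetical and are not taken from the universities' systems; only the case criteria and the week boundaries come from the text.

```python
from collections import Counter
from datetime import date, timedelta


def meets_ili_definition(temp_f: float, cough: bool, sore_throat: bool,
                         other_known_cause: bool) -> bool:
    """Fever (> 100 F) AND (cough and/or sore throat), with no known
    cause other than influenza. Parameter names are illustrative."""
    return temp_f > 100.0 and (cough or sore_throat) and not other_known_cause


def week_start(day: date) -> date:
    """Return the Saturday opening the Saturday-to-Friday week containing `day`."""
    # date.weekday(): Monday=0 ... Friday=4, Saturday=5, Sunday=6
    days_since_saturday = (day.weekday() - 5) % 7
    return day - timedelta(days=days_since_saturday)


def weekly_totals(daily_counts: dict[date, int]) -> dict[date, int]:
    """Sum daily case counts into 7-day periods keyed by their opening Saturday."""
    totals: Counter = Counter()
    for day, n in daily_counts.items():
        totals[week_start(day)] += n
    return dict(sorted(totals.items()))


# Hypothetical daily counts, for illustration only.
example = {
    date(2009, 8, 29): 3,  # Saturday, first day of the first week
    date(2009, 9, 4): 2,   # Friday, last day of the same week
    date(2009, 9, 5): 4,   # Saturday, first day of the next week
}
weekly = weekly_totals(example)
```

Here the two counts from August 29 and September 4 fall in the same week, while September 5 opens the next one. De-duplication of individuals, which SHC staff performed by hand, is not modeled in this sketch.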
The number of clinic visits by ILI patients aged 17-24 years at the EDs of hospitals associated with both universities was obtained from the ED electronic health records in the aggregate (number of cases/week). University A's ED data were retrieved based on the following criteria: age 17-24 years and fever, with causes of fever other than influenza manually filtered out. University B's ED visits for ILI were counted using the following criteria: age 17-24 years and a chief complaint of "flu" or "fever," or a discharge diagnosis of "influenza" or "viral syndrome." Student status was not available from either university ED. Data were available on a daily basis from August 29, 2009 to April 30, 2010, and were aggregated by adding cases in each 7-day period from Saturday to the following Friday.
The number reported includes ILI cases in student athletes (as reported by their team trainers), student ILI cases self-reported to the deans of all four undergraduate schools, and ILI cases reported by resident assistants. Data other than deans' reports were available on a weekly basis from August 29, 2009 to April 30, 2010. Deans' reports were not available from all deans until September 12, 2009.
Employee absenteeism data include real-time reports on ILI-related absences among Facilities Office and Dining Services staff, and employee absenteeism from 2009 and 2008, retrieved retrospectively from a payroll system that tracks employee absences for compensation purposes. The closest available data for non-union employees were "unscheduled leave" days, whereas for unionized employees they were "sick leave" days. In order to simplify the analysis, the two data sources were added together with the awareness that the ILI-related absenteeism for both groups of employees may have been overestimated. No faculty members or student workers are represented in this dataset.
Supply distribution data include the aggregate number of pre-packaged meals, masks and thermometers picked up in student residence halls, based on reports from the residence hall offices (RHO). The data were available from August 28, 2009 to April 10, 2010 on a weekly basis.
For comparison purposes we used data based on the Centers for Disease Control and Prevention's (CDC) ILINet surveillance data [
For the ACHA data series, the "attack rate" is defined by the ACHA as the number of weekly reports of new cases divided by the number of students in the IHEs' reports that week for each state, which were grouped according to CDC's regional categories (note that this is
To make the data from different sources comparable, all data, including the ACHA data (ILI attack rate), CDC ILINet data (percentage of hospital visits with ILI), and Google Flu Trends data (influenza related web queries) were normalized into an activity index by dividing the actual count in each week by the average count for that data series for the period from August 28 through December 18, 2009, a period for which data were available for all series and reflected the height of the Fall 2009 wave of pH1N1. This analysis is intended to identify the timing of the outbreak on each campus, not the absolute level of cases. In this analysis, we have made the assumption that the number of students is constant throughout the semester, at least relative to the fluctuation in the number of cases.
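The normalization described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name, the choice to key each week by its starting date, and the example values are assumptions. Only the baseline period (August 28 through December 18, 2009) and the rule of dividing each weekly value by the baseline average come from the text.

```python
from datetime import date
from statistics import mean

# Baseline period stated in the text.
BASELINE_START = date(2009, 8, 28)
BASELINE_END = date(2009, 12, 18)


def activity_index(series: dict[date, float]) -> dict[date, float]:
    """Divide each weekly value by the series' average over the baseline period.

    `series` maps a week's starting date to its count, rate, or percentage
    (keying by start date is an assumption made for this sketch).
    """
    baseline = [value for week, value in series.items()
                if BASELINE_START <= week <= BASELINE_END]
    if not baseline:
        raise ValueError("series has no weeks inside the baseline period")
    baseline_avg = mean(baseline)
    if baseline_avg == 0:
        raise ValueError("baseline average is zero; index is undefined")
    return {week: value / baseline_avg for week, value in sorted(series.items())}


# Hypothetical weekly counts, for illustration only.
counts = {
    date(2009, 8, 29): 10,
    date(2009, 9, 5): 30,
    date(2009, 9, 12): 20,
    date(2010, 1, 2): 6,   # outside the baseline; still indexed against it
}
index = activity_index(counts)
```

Because every series is divided by its own baseline average, an index of 1.0 means "typical for the Fall 2009 wave in that series", which allows counts, attack rates, and percentages to be plotted on a common scale while preserving the timing of their peaks.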
Because there are no data that describe the actual rates of pH1N1 infection, or its consequences, on the two campuses or the community in which they sit, we adopted a "triangulation" approach in which multiple contemporary data sources, each with different expected biases, are compared to identify time patterns that are likely to reflect biases versus those that are more likely to be indicative of actual infection rates. This public health systems research approach is grounded in the understanding that surveillance data are the result of decisions made by patients, health care providers, and public health professionals about health-care seeking behaviour, provision of health care, and reporting of suspected or confirmed cases to health authorities. Moreover, every element of this decision-making is influenced by the informational environment (e.g. media coverage, implementation of active surveillance), processing and reacting to the information on an individual level (e.g. the health care seeker's self-assessment of risk, incentives for seeking medical attention and self-isolation, the health care provider's ordering of laboratory tests), and technical barriers (e.g. communication infrastructure for data exchange, laboratory capacity), all of which change constantly.
One of the authors (YZ) had access to some identified data in her efforts to compile data for operational purposes at University A, but all of the analyses for this paper were conducted with aggregate data only, and this research was treated as "exempt" by the IRBs of both universities.
In Panel A of Figure
For University B, as shown in Panel B of Figure
As shown in Figure
Figure
University A's employee absenteeism data are shown in Figure
Figure
The primary limitation of this analysis is the lack of definitive knowledge about the actual number of pH1N1 cases at the two universities - a "gold standard." To address this problem we developed an approach that compares ("triangulates") multiple data systems, each with its own expected biases over time, to identify those that most likely mirror actual disease trends. Epidemiologists are typically aware of these potential biases in a qualitative sense, and present their analysis of the available data with appropriate caveats. In our approach, which benefits from hindsight, we attempt to use information about the likely direction and time patterns of these biases to understand the surveillance system and the validity and utility of different syndromic surveillance data sources. This type of analysis is necessarily qualitative and contextual; rather than serving as a recipe for doing this in other settings, this analysis should be seen as an example that illustrates the concept. This analysis also responds to the call in the U.S. National Health Security Strategy Implementation Plan (released for public comment in 2010) for the development, refinement, and wide-spread implementation of quality improvement tools, specifically methods "to collect data ... from real incidents ... to identify gaps, [and] recommend and apply programs to mitigate those gaps [
Another limitation of the data analysis is the uncertainty of whether the ILI cases captured by the surveillance system were pH1N1. As recommended by the CDC interim guidelines [
As described in more detail below, this approach suggests that the peak in cases at both universities at the beginning of the semester, a peak not seen in data for the surrounding community, is probably real and a reflection of expected disease dynamics. The lower peak, especially at University A, when pH1N1 was widespread in the community might reflect the removal of susceptible cases earlier in the semester, or simply surveillance fatigue. This analysis also suggests surveillance artifacts - surveillance fatigue and changing incentives driven by the exam schedule - that are likely to influence surveillance data in future outbreaks, and that should be taken into account in the interpretation of these data.
Both universities experienced the first and the highest peak in student ILI cases immediately after Fall semester classes started in early September 2009, which corresponds to peaks found in other universities and colleges in Region 3 (Delaware, the District of Columbia, Maryland, Pennsylvania, Virginia, and West Virginia). It should be noted, however, that both of these universities contributed to the ACHA reports. The CDC ILINet data for the same region and Google Flu Trends data for Washington, DC, on the other hand, did not peak until late October. University A also experienced a second, lower peak in cases with a two-week delay in early November, according to the SHC and ED data. When comparing the SHC and ED data from University A and University B (as shown in Figure
In the comparison between ACHA and CDC ILINet data across all states, the tendency of an early increase in ILI cases among college students in seven out of ten regions, as shown in Additional File
All of the data analysed in this report are based, to some degree, on students and staff taking action based on their illness. Such behaviour is driven not only by the fact of being sick, but also by the incentives to report, including perceptions of barriers to help-seeking behaviour (e.g. geographic distance, queuing, chance of exposure to other infected patients), the likely benefit to be gained (medical and non-medical) by reporting, the timeliness of the help to be delivered, as well as the informational environment the students and staff are exposed to. In particular, two factors - surveillance fatigue and reporting incentives - seem capable of explaining some of the patterns in the data.
As seen in Figure
Surveillance fatigue is likely to be more obvious in systems that use human resources not primarily designated for disease prevention and health promotion. For instance, the reports from the resident assistants (RAs) at University A increased to their highest level in the first week after classes resumed and dropped dramatically afterwards. Although ILI activity could still be observed from other data sources after the second peak through Spring 2010, the reports from RAs completely stopped at the end of November. The RA reporting system might have been sensitive to student ILI cases in the early stages, considering the relatively low barrier to utilizing the resources (close proximity, no queuing), and the expectation of immediate help (supply distribution, accommodation relocation). However, when the reporters and those receiving the reports are all laypersons to public health practice, fading interest can be magnified in the microenvironment between the two parties.
At University A, undergraduate students were instructed to notify their deans about their influenza-like illness as a substitute for medical proof of illness otherwise required to justify absence from class. This was published on August 28, 2009, and not emphasized afterwards. However, as noted in Figure
To translate these results into recommendations for IHEs regarding the design and implementation of surveillance systems for future disease outbreaks, other factors must also be taken into account. For instance, surveillance activities conducted by trained health care workers are more likely to capture actual ILI cases based on clinical findings. Moreover, well-informed healthcare workers who conduct surveillance as part of their regular responsibilities are more likely to maintain a relatively stable and predictable report triggering threshold, in line with the CDC and WHO (World Health Organization) guidelines [
During the 2009 H1N1 pandemic, University A and B both experienced a peak number of ILI cases at the beginning of the Fall term. This pattern, seen in a variety of surveillance systems at these universities and to a lesser extent in data from other IHEs, most likely results from students bringing the virus back to campus from their home states coupled with a sudden increase in population density in dormitories and lecture halls.
Through comparison of data from different syndromic surveillance data streams, paying attention to the likely biases in each over time, we have determined, at least in the case of the pH1N1 pandemic, that student health center data more accurately depicted transmission on campus at both universities during the Fall 2009 pandemic than other available data sources. Although maintaining an unduplicated list from visits and phone calls was time consuming, it was felt to be necessary to manage the situation. Other systems that were used at University A required major staff efforts to collect the data and were apparently less accurate. Reporting systems based on student reports to their deans may be relatively inflated during examination periods or other times when students need to be formally excused from class, but such systems combined with a liberal excused absence policy (not requiring a physician's note) can help to relieve over-utilization of medical resources for non-medical purposes.
The authors declare that they have no competing interests.
YZ and LM collected data from University A and B, respectively. All authors participated in designing the study, analysing and interpreting the data and drafting the manuscript. All authors read and approved the final manuscript.
We would like to thank the many staff members at the two universities who helped to collect the data used in this analysis.
This article was developed in collaboration with a number of partnering organizations, and with funding support awarded to the Harvard School of Public Health under cooperative agreements with the U.S. Centers for Disease Control and Prevention (CDC) grant number(s) 5P01TP000307-01 (Preparedness and Emergency Response Research Center). The content of these publications as well as the views and discussions expressed in these papers are solely those of the authors and do not necessarily represent the views of any partner organizations, the CDC or the US Department of Health and Human Services nor does mention of trade names, commercial practices, or organizations imply endorsement by the U.S. Government. We also would like to thank the O'Neill Institute for National and Global Health Law at Georgetown University for their support.