We introduce a novel mathematical approach to investigating the spread and control of communicable infections in closed communities.

Mathematical modeling has a rich and growing tradition in epidemiology (

Effective measures to control mycoplasma outbreaks are needed to limit the associated illness and substantial costs. Previous work has addressed candidate strategies, including infection control practices to prevent the exchange of respiratory droplets between patients and caregivers, cohorting members of the community who display symptoms of a respiratory infection, and antibiotic prophylaxis of asymptomatic members of the community (

Using a network model approach, we show how data on interactions in real-world communities can be translated into graphs—mathematical representations of networks—and how to predict the course of an epidemic from the structure of a graph. We found that the assignment of caregivers to patient groups is more critical to the course of an epidemic than the cohorting of patients. Within our models, the most effective interventions are those that reduce the diversity of interactions that caregivers have with patients. For example, an institution with many wards can avoid a large outbreak by confining caregivers to work in only one or very few wards.

Here we model an institution with spatially disjoint wards. Patients are confined to a single ward, and caregivers work in one or more wards. Each person or ward is represented by a “vertex” in the graph. “Edges” connect people to the wards in which they reside or work.

Health-care institution network. Each vertex represents a patient, caregiver, or ward, and edges between person and place vertices indicate that a patient resides in a ward or a caregiver works in a ward.
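The construction described above can be sketched in a few lines of Python. This is only an illustration: the function names and the toy assignment data are hypothetical, not taken from the outbreak investigation. It stores the person-to-ward edges in both directions and tabulates a degree distribution from them.

```python
from collections import Counter

def build_network(patient_ward, caregiver_wards):
    """Return (person -> wards, ward -> people) maps for the bipartite graph."""
    person_to_wards = {p: {w} for p, w in patient_ward.items()}  # one ward per patient
    for c, wards in caregiver_wards.items():
        person_to_wards[c] = set(wards)                          # one or more wards
    ward_to_people = {}
    for person, wards in person_to_wards.items():
        for w in wards:
            ward_to_people.setdefault(w, set()).add(person)
    return person_to_wards, ward_to_people

def degree_distribution(adjacency):
    """Fraction of vertices having each degree (number of neighbours)."""
    counts = Counter(len(nbrs) for nbrs in adjacency.values())
    total = sum(counts.values())
    return {k: n / total for k, n in sorted(counts.items())}

# Hypothetical toy institution: three patients, two caregivers, two wards.
patient_ward = {"p1": "A", "p2": "A", "p3": "B"}
caregiver_wards = {"c1": ["A"], "c2": ["A", "B"]}

person_to_wards, ward_to_people = build_network(patient_ward, caregiver_wards)
caregiver_degrees = degree_distribution(
    {c: person_to_wards[c] for c in caregiver_wards}
)
```

Keeping both directions of the adjacency makes it cheap to ask either "which wards does this person touch?" or "who is present in this ward?", which are the two lookups the transmission model needs.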

A key property of graphs is their degree distribution. The degree of a vertex is the number of other vertices to which it is connected. In

Throughout this model, we allow transmission to occur between people and places. We do not mean that bacteria actually infect a space by residing on inanimate objects or in the air. Rather, we mean that the person has transmitted the bacteria to another person who resides or works in that place. Conversely, when a place transmits to a person, we mean that the bacterium is transmitted to an uninfected person living or working in that place.

We begin by considering only the caregivers and wards. Later we add the patients to the model. (All notation is defined in the table below.)

| Notation | Definition |
|---|---|
| $M$ | Number of wards in the facility |
| $N$ | Number of caregivers working in the facility |
| $\mu_w$ | Average no. of caregivers working in a ward |
| $\mu_c$ | Average no. of wards in which a caregiver works |
| $p$ | Probability that a given caregiver works in a given ward |
| $p_k$ | Probability that a caregiver works in $k$ wards |
| $q_k$ | Probability that a ward has $k$ caregivers |
| $f_0(x)$ | Probability generating function (pgf) for the degree distribution of caregivers |
| $g_0(x)$ | pgf for the degree distribution of wards |
| $f_1(x)$ | First select a random ward, and then select a random caregiver working there. This expression represents the pgf for the number of other wards in which that caregiver works. |
| $g_1(x)$ | First select a random caregiver, and then select a random ward associated with that caregiver. This expression represents the pgf for the number of other caregivers working in that ward. |
| $\tau_w$ | Probability of transmission from a ward to a caregiver |
| $\tau_c$ | Probability of transmission from a caregiver to a ward |
| $\Phi_0(x)$ | pgf for the number of wards affected by transmission from a random caregiver |
| $\Phi_1(x)$ | First select a random ward and assume that it is affected by the bacterium, then select a random caregiver working there. This expression represents the pgf for the number of other wards affected by that caregiver. |
| $\Gamma_0(x)$ | pgf for the number of caregivers affected by transmission from a random ward |
| $\Gamma_1(x)$ | First select a random caregiver and assume he/she is infected, then select a random ward in which that caregiver works. This expression represents the pgf for the number of other caregivers infected by individuals working/living in that ward. |
| $\langle s \rangle$ | Average number of wards affected in an outbreak |
| $S_c$ | The size of the caregiver giant component: the largest set of infected caregivers that are all connected through work in common wards |
| $S_w$ | The size of the ward giant component: the largest set of affected wards that are all connected through common caregivers |
| $\beta_w(x)$ | pgf for the number of patients in an affected ward |
| | pgf for the total number of patients in the facility who are infected during an epidemic |

Pgfs can be mathematically manipulated to give many useful results. For example, the derivative gives the average of the distribution, e.g., the mean number of wards assigned to a caregiver, or the mean number of caregivers working in a ward. We can also answer the following question using pgfs: If an infected caregiver exposes a ward, how many other caregivers, on average, will be vulnerable to infection because they also work in that ward? Appendix A defines our pgfs and describes the derivations that answer this question.
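As a concrete illustration of these manipulations, the sketch below evaluates a pgf and its first derivative for small, made-up degree distributions (the numbers are hypothetical, chosen only so the arithmetic is easy to follow). The last helper answers the question posed above, using the standard edge-following result that the mean number of other neighbours is the second factorial moment divided by the mean.

```python
def pgf(probs, x):
    """Evaluate the pgf sum_k probs[k] * x**k."""
    return sum(p * x**k for k, p in probs.items())

def pgf_derivative(probs, x):
    """First derivative of the pgf with respect to x."""
    return sum(k * p * x ** (k - 1) for k, p in probs.items() if k > 0)

def mean_excess_degree(probs):
    """Mean number of OTHER neighbours of a vertex reached by following a random edge."""
    second_factorial_moment = sum(k * (k - 1) * p for k, p in probs.items())
    return second_factorial_moment / pgf_derivative(probs, 1.0)

# Hypothetical degree distributions, for illustration only.
caregiver_degrees = {1: 0.5, 2: 0.3, 3: 0.2}  # P(caregiver works in k wards)
ward_degrees = {2: 0.5, 4: 0.5}               # P(ward has k caregivers)

# The derivative at x = 1 is the mean of the distribution.
mean_wards_per_caregiver = pgf_derivative(caregiver_degrees, 1.0)

# Other caregivers exposed, on average, when an infected caregiver exposes a ward.
other_caregivers_exposed = mean_excess_degree(ward_degrees)
```

Note that the second quantity (7/3 here) is larger than the mean ward degree minus one (2), because following an edge preferentially lands on wards with many caregivers.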

Transmission of

We derive two complementary estimates for the size of an outbreak. The first is appropriate for conditions not conducive to large outbreaks, such as a pathogen with low transmissibility, or an institution with few interpersonal interactions. The second applies to conditions that favor large outbreaks.

We begin with two questions. If a healthy caregiver works in an infected ward, how many other wards will eventually become infected as a result of that caregiver’s interaction with that ward? Similarly, if an infected caregiver works in a yet uninfected ward, how many other caregivers will eventually become infected as a result of that caregiver’s activity in that ward? Answers to these questions vary from ward to ward and from caregiver to caregiver. Therefore, we calculate probability distributions for the spread, which we represent by using pgfs.

First, consider an edge linking an infected ward to a caregiver.

Future transmission diagram I, summing all possible future transmissions stemming from a caregiver who works in an infected ward.

Next, we start with an edge from an infected caregiver to a ward. As shown in

Future transmission diagram II, summing all possible future transmissions stemming from a ward in which an infected caregiver works.

With these two pgfs, we derive the average size of a small outbreak, starting from a single infection:

$$\langle s \rangle = 1 + \frac{\tau_c \tau_w f'_0(1)\, g'_1(1)}{1 - \tau_c \tau_w f'_1(1)\, g'_1(1)} \qquad (1)$$

where $f'$ denotes the first derivative of $f$ with respect to its argument. This expression depends on the transmission rates ($\tau_c$ and $\tau_w$) and on the average degree of a caregiver ($f'_0(1)$). The term $f'_1(1)$ assumes that we choose any ward at random from the entire network, then choose one of the edges connected to that ward at random, then follow that edge to a caregiver, and finally calculate the number of other wards assigned to the caregiver. On average, that will be $f'_1(1)$. Likewise, $g'_1(1)$ is the average number of other caregivers working in a ward that we reach by first choosing a caregiver at random and then randomly choosing one of the wards in which the caregiver works. These terms contain information not only about the average degrees of caregivers and wards but also about the probability that a given caregiver or ward will become infected in the first place.

The expression for $\langle s \rangle$ diverges when

$$\tau_w \tau_c f'_1(1)\, g'_1(1) = 1. \qquad (2)$$

This expression represents the transition between a regime in which only small, isolated outbreaks of disease can occur and one in which a full-blown community-wide epidemic can occur. A community will cross that transition point if the transmission rates ($\tau_w$ and $\tau_c$) are sufficiently high or if the network is sufficiently densely connected ($f'_1(1)$ and $g'_1(1)$). Equation no. 1 provides an estimate of the epidemic size below the threshold only. It is based on the assumption that interactions are rare enough that a person or a place only encounters the infection once. When interactions are more common and the community lies above the epidemic transition, we must use a different estimate for the size of the outbreak.
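A small numerical sketch may help make the two regimes concrete. It assumes the small-outbreak estimate has the form ⟨s⟩ = 1 + τ_c τ_w f′_0(1) g′_1(1) / [1 − τ_c τ_w f′_1(1) g′_1(1)], as obtained in Appendix B, and it refuses to return a value at or above the threshold, where that estimate no longer applies. The function names and the input values are hypothetical.

```python
def threshold_product(tau_c, tau_w, f1_prime, g1_prime):
    """Left-hand side of the threshold condition; epidemics become possible when it reaches 1."""
    return tau_c * tau_w * f1_prime * g1_prime

def mean_outbreak_size(tau_c, tau_w, f0_prime, f1_prime, g1_prime):
    """Average small-outbreak size; valid only below the epidemic threshold."""
    r = threshold_product(tau_c, tau_w, f1_prime, g1_prime)
    if r >= 1.0:
        raise ValueError("at or above the epidemic threshold: the estimate diverges")
    return 1.0 + tau_c * tau_w * f0_prime * g1_prime / (1.0 - r)

# Hypothetical inputs: caregivers work in 2 wards on average (f0' = f1' = 2),
# and a ward reached by following an edge has 10 other caregivers (g1' = 10).
size = mean_outbreak_size(tau_c=0.2, tau_w=0.1, f0_prime=2.0, f1_prime=2.0, g1_prime=10.0)
```

Raising either transmission probability, or either mean excess degree, pushes the product toward 1 and makes the estimated outbreak size grow without bound.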

The “giant component” of the graph is the largest connected set of vertices that have all been infected. The size of the outbreak above the epidemic transition is exactly equal to the number of vertices in this giant component. We calculate the size of the caregiver giant component $S_c$ as

$$S_c = 1 - \Phi_0(1), \qquad (3)$$

where Ф_{0}(1) is the probability that an infected caregiver will produce no further infections (Appendix B). A similar expression describes the number of wards affected in an epidemic:

$$S_w = 1 - \Gamma_0(1). \qquad (4)$$

These expressions reflect both the fraction of the population infected and the probability that an outbreak will reach epidemic proportions in the first place. Since _{c}_{w}

Equation nos. 3 and 4 allow us to estimate the size of an epidemic on the basis of transmission probabilities and the degree distribution of caregivers to wards. To make specific numerical predictions, we must first calculate pgfs for the degree distributions. Here we make the simple assumption that the degree distributions follow a Poisson distribution for both the number of wards associated with a given caregiver and the number of caregivers associated with a given ward. This assumption is equivalent to requiring that all caregivers have an equal likelihood of working in any ward and that a caregiver is assigned to any given ward independent of his or her other ward assignments. In the absence of more specific information about assignment to wards, this assumption seems a reasonable first step. This distribution assumes an infinite population and is generally applied to very large populations. Although perhaps not the ideal model for small institutions, this distribution is used here because it yields pgfs with convenient mathematical properties (see Appendix C).

Data gathered by the Centers for Disease Control and Prevention (CDC) during a recent mycoplasma outbreak allowed us to extract values for the parameters in our theory. In 1999, an outbreak of mycoplasma pneumonia occurred in a psychiatric institution (

We assumed that each patient was confined to a single ward. While this was not true for all patients at the institution, it simplified the mathematics and allowed us to make a reasonable approximation of the epidemiology. Interactions between patients in separate wards will increase the threat of a full-blown epidemic and make early intervention all the more critical. Including such interactions in the model is possible by adding edges to the graph that connect patients to multiple wards. This scenario can be solved exactly by using techniques similar to those presented here.

If we assume that the degree distributions for wards and caregivers are Poissonian, the epidemic threshold (equation no. 2) is equivalent to τ_{w} τ_{c} μ_{w}μ_{c}=1.

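In other words, when the product of the transmission rates, the average number of caregivers per ward, and the average number of wards per caregiver exceeds 1, epidemics become possible. In the psychiatric institution, $\mu_w$ is tied to $\mu_c$ through the numbers of caregivers and wards (the total number of caregiver-ward assignments equals both the number of caregivers times $\mu_c$ and the number of wards times $\mu_w$), so the threshold becomes a condition on $\tau_w$, $\tau_c$, and $\mu_c$ alone.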

The epidemic threshold can be plotted for several values of the average number of wards per caregiver ($\mu_c = 1, 2, 3, 4, 5$). For the most densely connected case, when each caregiver works in five wards on average, the epidemic threshold is crossed at very low rates of transmission. When the community is less densely connected, it can withstand much higher infectivity without giving rise to epidemics.
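The dependence of the threshold on μ_c can be tabulated directly from the Poisson condition τ_w τ_c μ_w μ_c = 1. The sketch below is illustrative only. It assumes the facility size used in the simulations described later (440 caregivers and 15 wards), so that μ_w = 440 μ_c / 15, and it fixes τ_c at 0.6 purely as an example value. The function names are hypothetical.

```python
def mean_caregivers_per_ward(mu_c, n_caregivers=440, n_wards=15):
    """mu_w implied by mu_c, since both count the same caregiver-ward edges."""
    return n_caregivers * mu_c / n_wards

def critical_tau_w(tau_c, mu_c, n_caregivers=440, n_wards=15):
    """Smallest ward-to-caregiver transmission probability at which epidemics become possible."""
    mu_w = mean_caregivers_per_ward(mu_c, n_caregivers, n_wards)
    return min(1.0, 1.0 / (tau_c * mu_w * mu_c))  # capped at 1 because it is a probability

# Threshold values of tau_w for mu_c = 1..5 at an assumed tau_c = 0.6.
thresholds = {mu_c: critical_tau_w(0.6, mu_c) for mu_c in (1, 2, 3, 4, 5)}
```

Because μ_w grows in proportion to μ_c, the critical transmission probability falls as 1/μ_c², which is why confining caregivers to fewer wards has such a strong effect.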

Epidemic thresholds. Each line assumes a different value for $\mu_c$.

Combining equation no. 2 with equations 5, 6, 7, and 8 from Appendix C, we derived the following:

$$\Phi_0(1) = \exp\Big[\mu_c\Big(1 - \tau_c + \tau_c \exp\big[\mu_w\big(1 - \tau_w + \tau_w \Phi_0(1) - 1\big)\big] - 1\Big)\Big]. \qquad (9)$$

Given values for the demographic and transmission parameters, we find the value of $\Phi_0(1)$ that satisfies equation no. 9 numerically. Then, the predicted fraction of caregivers infected during an epidemic is $S_c = 1 - \Phi_0(1)$. (The number of affected wards is similarly derived.) Since we know neither the exact distribution of caregivers in wards nor the transmission rates between caregivers and wards, we solve for the size of the epidemic outbreak in a range of values of the three independent parameters $\mu_c$, $\tau_c$, and $\tau_w$.
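One simple way to carry out this numerical step is fixed-point iteration. The sketch below is not necessarily the method used for the results reported here. It assumes Poisson degree distributions, so that f_0(x) = f_1(x) = exp[μ_c(x − 1)] and g_0(x) = g_1(x) = exp[μ_w(x − 1)], together with the Appendix B relations Φ_0(1) = f_0(Γ_1(1)), Γ_1(1) = 1 − τ_c + τ_c g_1(Φ_1(1)), and Φ_1(1) = 1 − τ_w + τ_w Φ_0(1). The facility size (440 caregivers, 15 wards) is taken from the simulation setup, and the function name is hypothetical.

```python
import math

def epidemic_fractions(mu_c, tau_c, tau_w, n_caregivers=440, n_wards=15,
                       tol=1e-12, max_iter=100_000):
    """Return (S_c, S_w), the fractions of caregivers and wards in the giant component."""
    mu_w = n_caregivers * mu_c / n_wards  # implied mean number of caregivers per ward
    u = 0.0                               # current guess for Phi_0(1)
    for _ in range(max_iter):
        phi1 = 1.0 - tau_w + tau_w * u
        gamma1 = 1.0 - tau_c + tau_c * math.exp(mu_w * (phi1 - 1.0))
        new_u = math.exp(mu_c * (gamma1 - 1.0))
        converged = abs(new_u - u) < tol
        u = new_u
        if converged:
            break
    phi1 = 1.0 - tau_w + tau_w * u
    s_c = 1.0 - u                                # S_c = 1 - Phi_0(1)
    s_w = 1.0 - math.exp(mu_w * (phi1 - 1.0))    # S_w = 1 - Gamma_0(1) = 1 - g_0(Phi_1(1))
    return s_c, s_w

# Example sweep over mu_c at illustrative transmission probabilities.
sweep = {mu_c: epidemic_fractions(mu_c, tau_c=0.6, tau_w=0.06) for mu_c in (1, 2, 3, 4)}
```

Starting the iteration at 0 matters: Φ_0(1) = 1 always satisfies the equation, and the update map is increasing, so iterating upward from 0 converges to the smallest solution, which is the one that corresponds to an epidemic when the community is above the threshold.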

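The predicted epidemic size is shown as a function of $\mu_c$, assuming $\tau_c = 0.6$ and $\tau_w = 0.06$.

Size of epidemic. Predicted and actual number of caregivers and wards affected in an outbreak. These predictions assume that the transmission rate from caregivers to wards is $\tau_c = 0.6$ and that the transmission rate from wards to caregivers is $\tau_w = 0.06$.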

This analysis suggests that the likelihood of an epidemic and the eventual size of an epidemic, should one occur, are highly sensitive to the degree distribution for caregivers. Transmission of

The derivations given here are exact in the limit of large network size. To assess their accuracy on networks like these with a few hundred vertices, we have constructed specific graphs that realize these distributions and performed computer simulations of the spread of epidemics on them. Each simulation constructs a network with 15 wards and 440 caregivers, where the degree distribution of each caregiver is binomial with mean $\mu_c$. We assume infectious periods of 14 days (for caregivers) and 21 days (for wards) and that contact between a caregiver and a ward occurs independently of any other such contact. Initially a single, randomly chosen caregiver is infected. Every day, transmission occurs from an infected caregiver to a connected ward with a fixed daily probability. Likewise, transmission from an affected ward to a healthy caregiver who works there occurs with a fixed daily probability.
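A minimal simulation along these lines might look as follows. Several details are assumptions made for this sketch and are not confirmed by the text: the daily probability is chosen as 1 − (1 − τ)^(1/d) so that the cumulative probability over d days equals τ, the 14- and 21-day periods are treated as infectious periods, and a caregiver or ward can be infected only once. All names are hypothetical.

```python
import random

def daily_probability(tau, days):
    """Daily probability whose cumulative effect over `days` days equals tau (assumed form)."""
    return 1.0 - (1.0 - tau) ** (1.0 / days)

def simulate_outbreak(mu_c, tau_c, tau_w, n_wards=15, n_caregivers=440,
                      days_c=14, days_w=21, rng=None):
    """Run one outbreak; return (caregivers ever infected, wards ever affected)."""
    rng = rng or random.Random()
    p = mu_c / n_wards  # each caregiver joins each ward independently: binomial degree
    works_in = [[w for w in range(n_wards) if rng.random() < p]
                for _ in range(n_caregivers)]
    staff_of = [[] for _ in range(n_wards)]
    for c, wards in enumerate(works_in):
        for w in wards:
            staff_of[w].append(c)
    t_c = daily_probability(tau_c, days_c)
    t_w = daily_probability(tau_w, days_w)

    care_left = {rng.randrange(n_caregivers): days_c}  # infectious days remaining
    ward_left = {}
    ever_c, ever_w = set(care_left), set()
    while care_left or ward_left:
        new_w, new_c = set(), set()
        for c in care_left:                 # infected caregivers expose their wards
            for w in works_in[c]:
                if w not in ever_w and rng.random() < t_c:
                    new_w.add(w)
        for w in ward_left:                 # affected wards expose their staff
            for c in staff_of[w]:
                if c not in ever_c and rng.random() < t_w:
                    new_c.add(c)
        # Age existing infections by one day, then add today's new infections.
        care_left = {c: d - 1 for c, d in care_left.items() if d > 1}
        ward_left = {w: d - 1 for w, d in ward_left.items() if d > 1}
        care_left.update({c: days_c for c in new_c})
        ward_left.update({w: days_w for w in new_w})
        ever_c |= new_c
        ever_w |= new_w
    return len(ever_c), len(ever_w)

# Example: average over repeated runs with a seeded generator for reproducibility.
rng = random.Random(0)
runs = [simulate_outbreak(2, 0.6, 0.06, rng=rng) for _ in range(20)]
```

Averaging the two components of `runs` over many repetitions gives the kind of simulated outbreak size that can be set against the analytical prediction.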

Simulated outbreak sizes. Frequency distributions of the numbers of wards and caregivers affected in 1,000 epidemic simulations are shown for _{c}

Comparing derivations to simulation. This graph compares the analytical predictions to the size of a simulated outbreak averaged over 1,000 simulations for each value of $\mu_c$.

Our numeric method also allows us to pinpoint transmission rates that are consistent with the empirical observations. Assuming the average caregiver works in one to four wards, we identify transmission rates that predict the observed numbers of affected caregivers and wards. We find that _{c}_{w}

Based on the outbreak data, the probability that a particular patient will become infected if at least one other patient in the ward is infected is 0.15 (0.02) for confirmed cases or 0.23 (0.02) when probable cases are included.

Distribution of transmission rates and ward sizes in the psychiatric institution.

We simulate the spread of

Simulated spread of

Network theory enables epidemiologists to model explicitly and analyze patterns of human interactions that are potential routes for transmission of an infectious disease. The statistical properties of an epidemic graph determine the extent to which an infectious agent can spread. By manipulating the structure of a graph, we can identify interventions that may dramatically alter the course of an epidemic, or even prevent one altogether, and translate them into measures that make sense in a real community. In this paper, we have used network methods to model the spread of a respiratory tract infection in a health-care facility.

How might this be applied to a real outbreak? We have considered data from a recent investigation of an outbreak of

In both the outbreak and our model (assuming parameters based on this particular institution), caregivers are less likely to become infected than are patients. This observation may mislead investigators and lead to inappropriate recommendations. Although caregivers are less likely to become ill, they are the primary vectors of infection in the facility. Our model suggests that transmission rates from patients to caregivers are lower than transmission rates from caregivers to patients. Therefore, once a caregiver is infected with

We suggest two complementary strategies: limit the number of wards with which caregivers interact, and reduce the probability that caregivers become infected through, for example, respiratory droplet precautions. This strategy limits the time and cost of laboratory testing as well as the risks for antibiotic use in uninfected persons. The activity of some ancillary staff (e.g., physical therapists and nutritionists) cannot be limited to a select number of wards. In these cases, alternative precautions against transmission of

We conclude with three caveats. First, the epidemic model includes all infections, even those that do not result in symptoms. Most persons with

Second, for mathematical tractability, our model assumes random (Poissonian) assignment of caregivers to wards. The quantitative (but probably not qualitative) results would differ under different degree distributions. In the future, we hope to analyze distributions taken from actual health-care institutions, when available.

Third, because of the long incubation period of

The theoretical tools are in place for building community-specific networks and analyzing the transmission of infectious diseases on these networks. Our approach enables mathematical experiments, in which the inputs are interventions—structural reorganization, cohorting, treatment, and the like—and the output is predictions about the spread of a disease (or lack thereof) on the network. This approach can both aid the development of general measures and lend insight into specific scenarios in which intervention is still possible.

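Let $p_k$ be the probability that a caregiver works in $k$ wards and $q_k$ the probability that a ward has $k$ caregivers. The pgfs for these degree distributions are

$$f_0(x) = \sum_k p_k x^k, \qquad g_0(x) = \sum_k q_k x^k.$$

Since $p_k$ and $q_k$ are normalized probability distributions, $f_0(1) = 1$ and $g_0(1) = 1$. The generating functions contain all the same information as the probability distributions but in a form that will be more convenient for our purposes. We can always recover the probability distributions again by differentiation: $p_k = \frac{1}{k!} \left. \frac{d^k f_0}{dx^k} \right|_{x=0}$.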

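If we assume that each ward has on average $\mu_w$ caregivers and each caregiver works in on average $\mu_c$ wards, then $f'_0(1) = \mu_c$ and $g'_0(1) = \mu_w$.

Suppose we now choose a caregiver at random and follow an edge to a ward in which the caregiver works. A ward with $k$ caregivers is reached with probability proportional to $k q_k$, so the pgf for the number of caregivers working in this ward is $x g'_0(x) / g'_0(1)$. Hence the distribution of other caregivers working in this ward is generated by $g_1(x) = g'_0(x) / g'_0(1) = g'_0(x) / \mu_w$.

Likewise, if we start from a specific ward and choose a random caregiver working in that ward, then the number of other wards in which that caregiver works is generated by $f_1(x) = f'_0(x) / f'_0(1) = f'_0(x) / \mu_c$.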

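We denote the probability of transmission from a caregiver to a ward as $\tau_c$ and the probability of transmission from a ward to a caregiver as $\tau_w$. Let $\Phi_1(x)$ be the pgf for the cluster of infections that stems from an edge linking an infected ward to a caregiver, and let $\Gamma_1(x)$ be the pgf for the cluster that stems from an edge linking an infected caregiver to a ward. Then

$$\Phi_1(x) = 1 - \tau_w + \tau_w x\, \tilde{p}_0 + \tau_w x\, \tilde{p}_1 \Gamma_1(x) + \tau_w x\, \tilde{p}_2 [\Gamma_1(x)]^2 + \cdots = 1 - \tau_w + \tau_w x\, f_1(\Gamma_1(x)),$$

where $\tilde{p}_i$ is the probability that the caregiver works in, and can therefore transmit the infection to, $i$ other wards, so that $f_1(x) = \sum_i \tilde{p}_i x^i$. For a randomly chosen caregiver, the corresponding pgf is $\Phi_0(x) = x f_0(\Gamma_1(x))$.

Next, the generating function for the cluster of infections arising from a randomly chosen edge from a person to a ward is thus $\Gamma_1(x) = 1 - \tau_c + \tau_c\, g_1(\Phi_1(x))$, and for a randomly chosen ward it is $\Gamma_0(x) = g_0(\Phi_1(x))$.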

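Substituting into the formulas for $\Phi_0(x)$ and $\Phi_1(x)$, we find $\Phi_0(x) = x f_0[1 - \tau_c + \tau_c g_1(\Phi_1(x))]$ and $\Phi_1(x) = 1 - \tau_w + \tau_w x f_1[1 - \tau_c + \tau_c g_1(\Phi_1(x))]$. To calculate the average outbreak size $\langle s \rangle$, we differentiate $\Phi_0(x)$ and evaluate at $x = 1$, where $\Phi_1(1) = 1$ below the epidemic threshold:

$$\langle s \rangle = \Phi'_0(1) = f_0(1 - \tau_c + \tau_c g_1(1)) + f'_0(1 - \tau_c + \tau_c g_1(1))\, \tau_c\, g'_1(1)\, \Phi'_1(1) = 1 + \tau_c f'_0(1)\, g'_1(1)\, \Phi'_1(1).$$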

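Now, solving for $\Phi'_1(x)$, we find $\Phi'_1(x) = \tau_w f_1[1 - \tau_c + \tau_c g_1(\Phi_1(x))] + \tau_w x f'_1[1 - \tau_c + \tau_c g_1(\Phi_1(x))]\, \tau_c\, g'_1(\Phi_1(x))\, \Phi'_1(x)$, which at $x = 1$ gives $\Phi'_1(1) = \tau_w / [1 - \tau_w \tau_c f'_1(1)\, g'_1(1)]$. We thereby arrive at the following expression for average outbreak size:

$$\langle s \rangle = 1 + \frac{\tau_c \tau_w f'_0(1)\, g'_1(1)}{1 - \tau_c \tau_w f'_1(1)\, g'_1(1)}.$$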

Turning next to the size of the giant component, we know that $1 - S_c = \Phi_0(1) = f_0(1 - \tau_c + \tau_c g_1(\Phi_1(1)))$. Hence $S_c = 1 - f_0(1 - \tau_c + \tau_c g_1(\Phi_1(1)))$. Likewise, $1 - S_w = \Gamma_0(1) = g_0(1 - \tau_w + \tau_w f_1(\Gamma_1(1)))$ implies $S_w = 1 - g_0(1 - \tau_w + \tau_w f_1(\Gamma_1(1)))$.

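If the probability that a given caregiver works in some ward is $p$, then the probability that a caregiver works in exactly $k$ of the $M$ wards follows a binomial distribution:

$$p_k = \binom{M}{k} p^k (1 - p)^{M - k},$$

whose generating function is $f_0(x) = (1 - p + px)^M$. Substituting for $p = \mu_c / M$ gives $f_0(x) = [1 + \mu_c (x - 1)/M]^M$. In the limit of a large number of wards, the binomial distribution approaches a Poisson distribution, and the generating function for the Poisson distribution is

$$f_0(x) = e^{\mu_c (x - 1)}.$$

Likewise, in the limit of many caregivers, $g_0(x) = e^{\mu_w (x - 1)}$.

Performing a bit more mathematical legwork, we find that

$$f_1(x) = \frac{f'_0(x)}{f'_0(1)} = e^{\mu_c (x - 1)} = f_0(x),$$

and similarly $g_1(x) = g_0(x)$, so that $f'_1(1) = \mu_c$ and $g'_1(1) = \mu_w$.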
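How good the Poisson approximation is for a facility with only a handful of wards can be checked numerically. The sketch below assumes the standard binomial pgf with success probability μ/n over n trials, namely [1 + μ(x − 1)/n]^n, and compares it with the Poisson pgf exp[μ(x − 1)]. The function names and test values are illustrative.

```python
import math

def binomial_pgf(x, mu, n):
    """pgf of a binomial degree distribution with mean mu over n possible assignments."""
    return (1.0 + mu * (x - 1.0) / n) ** n

def poisson_pgf(x, mu):
    """pgf of a Poisson degree distribution with mean mu."""
    return math.exp(mu * (x - 1.0))

# Gap between the two pgfs at x = 0.5 for a caregiver working in 2 wards on average.
gap_small = abs(binomial_pgf(0.5, 2.0, 15) - poisson_pgf(0.5, 2.0))      # 15 wards
gap_large = abs(binomial_pgf(0.5, 2.0, 10**6) - poisson_pgf(0.5, 2.0))   # very many wards
```

The gap shrinks toward zero as the number of possible assignments grows, which is the sense in which the Poisson form is a large-population limit.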

We calculate these rates by averaging the fraction of infected patients per ward across the 15 wards and compute the error by taking the standard deviation of these fractions, divided by the square root of the sample size.
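The calculation described in this note can be written out as below. The ward-level fractions shown are placeholders, not the outbreak data, and the use of the sample standard deviation (dividing by n − 1) is an assumption, since the text does not say which form was used.

```python
import math
import statistics

def mean_and_standard_error(fractions):
    """Mean of per-ward infected fractions and its standard error."""
    mean = statistics.mean(fractions)
    # Sample standard deviation divided by the square root of the sample size.
    se = statistics.stdev(fractions) / math.sqrt(len(fractions))
    return mean, se

# Placeholder per-ward fractions of infected patients (hypothetical values).
example_fractions = [0.1, 0.2, 0.3]
rate, error = mean_and_standard_error(example_fractions)
```

With 15 wards, the list would simply hold 15 fractions, one per ward.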

We thank Joel Ackelsberg, Rich Besser, Terri Hyde, Catherine Macken, Mary Reynolds, and Deborah Talkington for their valuable insights and their help interpreting data from previous mycoplasma outbreaks.

This work was supported in part by a National Science Foundation Postdoctoral Fellowship in Biological Informatics to L.A.M. and National Science Foundation Grant DMS-0109086 to M.E.J.N.

Dr. Meyers is an assistant professor in the Section of Integrative Biology at the University of Texas at Austin. She uses a combination of theoretical, computational, and experimental approaches to research the evolution and spread of microbial communities.