Cluster randomized trials have been used to evaluate the effectiveness of human immunodeficiency virus (HIV) prevention strategies in reducing incidence. The design of such studies must take into account possible correlation of outcomes within randomized units.

To discuss power and sample size considerations for cluster randomized trials of combination HIV prevention, using an HIV prevention study in Botswana as an illustration.

We introduce a new agent-based model to simulate the community-level impact of a combination prevention strategy and investigate how correlation structure within a community affects the coefficient of variation, an essential parameter in designing a cluster randomized trial.

We construct collections of sexual networks and then propagate HIV on them to simulate the disease epidemic. Increasing the level of sexual mixing between intervention and standard of care communities reduces the difference in cumulative incidence between the two sets of communities. Fifteen clusters per arm and 500 incidence cohort members per community provide 95% power to detect the projected difference in cumulative HIV incidence between standard of care and intervention communities (3.93% and 2.34%) at the end of the third study year, using a coefficient of variation of 0.25. Although available formulas for calculating sample size for cluster randomized trials can be derived by assuming an exchangeable correlation structure within clusters, we show that deviations from this assumption do not generally affect the validity of such formulas.

We construct sexual networks based on data from Likoma Island, Malawi and base disease progression on longitudinal estimates from an incidence cohort in Botswana and in Durban as well as a household survey in Mochudi, Botswana. Network data from Botswana and larger sample sizes to estimate rates of disease progression would be useful in assessing the robustness of our model results.

Epidemic modeling plays a critical role in planning and evaluating interventions for prevention. Simulation studies allow us to take into consideration available information on sexual network characteristics, such as mixing within and between communities as well as coverage levels for different prevention modalities in the combination prevention package.

Individual-level HIV prevention approaches, including antiretroviral treatment as prevention, male circumcision, pre-exposure prophylaxis (in some populations) and preventing mother-to-child transmission, have shown efficacy. Efforts are underway to investigate whether combining them can achieve community-level control of HIV infection [

HIV incidence depends on subject-level factors, like risk behavior, and community-level factors, like sexual network characteristics. To reduce the need for treatment, a modified treatment as prevention approach that targets only high viral load carriers is part of a combination prevention strategy that is under study in a cluster randomized trial in Botswana. About 25% of new HIV-1 subtype C infections in southern Africa (where C is most prevalent) maintain high viral load levels for at least 1–2 years and have faster cluster of differentiation 4 (CD4) cell count decline [

Cluster randomized trials investigate both direct and indirect effects of prevention interventions on infectious diseases [

To address the well-known difficulties inherent in estimating

In HIV prevention studies, sample size depends on the magnitude of intervention effect as well as the HIV incidence in the control group, inaccurate estimates of which threaten power. The Mema Kwa Vijana trial of HIV prevention in Tanzania [

This paper describes sample size considerations for cluster randomized trials of combination HIV prevention, motivated by the design of a study in Botswana. We introduce a new agent-based simulation model to simulate the impact of a combination prevention strategy and the resulting coefficient of variation, taking into account different levels of the contamination effect. We also investigate how correlation structure within a community affects

The Botswana study investigates whether implementation of a combination of prevention interventions reduces HIV incidence. Villages in Botswana will be randomized to one of two arms:

(1) “standard of care”, with antiretroviral therapy for HIV-infected individuals with CD4<350 cells/mm^{3} or AIDS;

(2) intervention, with antiretroviral therapy for the subjects above and for those with high viral load (>10,000 copies/ml), enhanced HIV testing and counseling, prevention of mother-to-child transmission, enhanced linkage of testing to care, and male circumcision.

HIV incidence will be estimated from a cohort identified through a random sample of 20% of households in each community. The cohort includes eligible HIV-negative household members who are citizens (or their spouses), are between the ages of 16 and 64, and are able to provide informed consent. Incidence cohort subjects are tested annually for HIV. Households rather than individuals are sampled for ease of logistics. The choice of a 20% sample represents a trade-off between adequacy of power and restriction of the attenuating effect of home-based testing in standard of care communities. To improve efficiency, communities in the Botswana study are qualitatively matched on population size, nature of health facilities, age structure, and geographic location; no information is available for matching on predicted incidence, which might be ideal.

Sample size was calculated from a formula developed for matched cluster randomized trials [ π_{0} and π_{1} are the true proportions of individuals who reach the endpoint in the two arms; z_{α/2} and z_{β} are the usual upper tail normal probabilities. k_{m}
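For reference, a widely used sample size formula for pair-matched cluster randomized trials with a binary endpoint (the Hayes and Bennett formula, assumed here to be the one intended) can be written as follows. The symbols c (number of clusters per arm) and n (number of individuals per cluster) are our notation; k_{m} is taken to be the coefficient of variation of the true proportions between clusters within matched pairs.

```latex
c = 2 + \left(z_{\alpha/2} + z_{\beta}\right)^2
    \frac{\pi_0(1-\pi_0)/n + \pi_1(1-\pi_1)/n + k_m^2\left(\pi_0^2 + \pi_1^2\right)}
         {\left(\pi_0 - \pi_1\right)^2}
```

The first two terms in the numerator reflect binomial sampling variation within clusters and shrink as n grows; the third term reflects between-cluster variation and does not, which is why k_{m} dominates the calculation for large clusters.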

To predict cumulative incidence over the study period in communities, we used an agent-based epidemic model (a simulation of the actions and interactions of autonomous agents to assess their effects on an entire system) to simulate the spread of HIV on collections of generated sexual networks. Parameter values in the model (see

In our models, the evolution of sexual relationships is represented as a dynamic network, in which each node represents an individual (male or female), and each edge represents a sexual relationship between nodes. The networks are bipartite and only represent relationships between opposite genders, reflecting the fact that in Botswana heterosexual contact is believed to be the principal mode of transmission [

In a sexual contact network, the number of edges adjacent to a particular node is called its degree, and the degree distribution can be obtained by the collection of nodal degrees [
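As an illustration of how a degree distribution can drive network construction, the sketch below draws nodal degrees from a negative binomial distribution and pairs men with women at random. It is a simplified static stand-in, not the method used in the paper: the parameterization (number of failures before the r-th success), the handling of the cutoff by redrawing, the random stub matching, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter

def sample_degree(rng, r=5, p=0.7, cutoff=7):
    """Draw a degree as the number of failures before the r-th success
    (one common negative binomial parameterization; assumed here).
    Values above `cutoff` are redrawn (also an assumption)."""
    while True:
        successes = failures = 0
        while successes < r:
            if rng.random() < p:
                successes += 1
            else:
                failures += 1
        if failures <= cutoff:
            return failures

def build_bipartite_network(n_men, n_women, seed=0):
    """Return a set of (man, woman) edges from random stub matching."""
    rng = random.Random(seed)
    # One "stub" per unit of degree; a node with degree 3 appears 3 times.
    men_stubs = [m for m in range(n_men) for _ in range(sample_degree(rng))]
    women_stubs = [w for w in range(n_women) for _ in range(sample_degree(rng))]
    rng.shuffle(men_stubs)
    rng.shuffle(women_stubs)
    # zip drops unmatched stubs from the longer list; the set removes
    # duplicate pairs, so realized degrees never exceed sampled degrees.
    return set(zip(men_stubs, women_stubs))

edges = build_bipartite_network(500, 500)
male_degree = Counter(m for m, _ in edges)
female_degree = Counter(w for _, w in edges)
```

Because every edge joins one man and one woman, the resulting graph is bipartite by construction, and the realized degree of each node is bounded by the cutoff.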

Using the methods proposed in Goyal et al. [

In addition to data from the Mochudi study and the Botswana/Durban cohort, our model takes into account community characteristics including population size, varying coverage levels for different prevention modalities, as well as individual characteristics including transmission risk, disease progression, condom use, linkage to care, and circumcision status.

At time 0, the start of the simulation, we set the initial condition for each community. Each eligible individual is assigned an initial HIV infection status based on the current prevalence in Botswana, estimated to be 24.8%, independently of partnership characteristics or position in the network. Each infected individual is assigned to a viral load category (<400, 400–3,499, 3,500–9,999, 10,000–49,999, or 50,000+ copies/ml) as well as an initial CD4 count based on estimates of their distributions from the household survey in Mochudi. Subjects with CD4 counts below the treatment threshold are modeled as receiving antiretroviral therapy according to estimates from Mochudi. Background antiretroviral therapy coverage for CD4<350 cells/mm^{3} is set at 60.9% at the start, based on a 2011 survey of the Mochudi district. Condom use is set at 40%, and the male circumcision rate at the start is set at 12.7%, the estimated rate for Botswana [
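The initialization step described above might be sketched as follows. This is a minimal illustration rather than the model's actual code: the function and field names are hypothetical, the alternating assignment of sex is a convenience, CD4 counts and treatment status are omitted, and the viral load category weights must be supplied by the caller because the Mochudi estimates are not reproduced here.

```python
import random

PREVALENCE = 0.248            # initial HIV prevalence (from the text)
CONDOM_USE = 0.40             # proportion using condoms (from the text)
CIRCUMCISED_AT_START = 0.127  # male circumcision rate at time 0 (from the text)
VL_CATEGORIES = ["<400", "400-3,499", "3,500-9,999", "10,000-49,999", "50,000+"]

def init_community(n, vl_weights, seed=0):
    """Assign initial states independently of network position."""
    rng = random.Random(seed)
    people = []
    for i in range(n):
        sex = "M" if i % 2 == 0 else "F"  # illustrative 50/50 split
        infected = rng.random() < PREVALENCE
        people.append({
            "sex": sex,
            "infected": infected,
            # Only infected individuals carry a viral load category.
            "vl_category": (rng.choices(VL_CATEGORIES, weights=vl_weights)[0]
                            if infected else None),
            "uses_condoms": rng.random() < CONDOM_USE,
            "circumcised": sex == "M" and rng.random() < CIRCUMCISED_AT_START,
        })
    return people

# Equal weights are a placeholder, NOT the Mochudi distribution.
people = init_community(20000, vl_weights=[1, 1, 1, 1, 1])
```

Drawing infection status independently for each person mirrors the assumption stated above that initial status does not depend on partnership characteristics.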

Although the sample size formula we used can be derived from models assuming an exchangeable correlation structure within clusters, we find that deviations from this assumption do not affect the validity of the sample size formula. When this assumption is violated, the intraclass correlation ρ does not represent the correlation between any two subjects in the same cluster, but instead represents the average correlation of observations from the same cluster. Even with an arbitrary variance-covariance structure within clusters, the increase in variance resulting from cluster sampling, commonly measured by the design effect [
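The role of the average correlation can be seen from a short calculation, sketched here under the simplifying assumption (ours) that the n observations Y_{1}, …, Y_{n} in a cluster share a common variance σ^{2} and have pairwise correlations ρ_{ij}; the notation is introduced for this illustration only.

```latex
\operatorname{Var}(\bar{Y})
  = \frac{1}{n^2}\Big[\, n\sigma^2 + \sum_{i \neq j} \rho_{ij}\,\sigma^2 \Big]
  = \frac{\sigma^2}{n}\,\big[\, 1 + (n-1)\,\bar{\rho} \,\big],
\qquad
\bar{\rho} = \frac{1}{n(n-1)} \sum_{i \neq j} \rho_{ij}.
```

The variance inflation factor 1 + (n−1)ρ̄ has the same form as the familiar design effect under exchangeability, with the single correlation replaced by the average of the pairwise correlations.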

When departure from exchangeable correlation structure is expected, it is important that the studies used to estimate _{ij}

The data generating process for a continuous outcome _{ijk}_{j}_{i}, and _{ij}, and _{ij1}, …, _{ijbj}_{ij}).

Under this model,
_{j}_{j}_{1} < _{2} if any of the _{j}

Sexual mixing between intervention and standard of care communities will tend to increase incidence in intervention and decrease it in standard of care communities.

Simulation of the impact of the combination prevention is based on input parameters listed in

To obtain a simulated value of

Fifteen clusters per arm and 500 incidence cohort members per community yield 95% power, assuming a coefficient of variation of 0.25, to detect the anticipated difference in model-projected cumulative HIV incidence between standard of care and intervention communities (3.93% vs. 2.34%; see
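The power figure can be checked with a few lines of arithmetic. The sketch below assumes the Hayes and Bennett formula for pair-matched designs, c = 2 + (z_{α/2} + z_{β})^{2} [π_{0}(1−π_{0})/n + π_{1}(1−π_{1})/n + k^{2}(π_{0}^{2} + π_{1}^{2})] / (π_{0} − π_{1})^{2}, solved for power, with a two-sided 5% significance level; the function names are ours.

```python
from math import erf, sqrt

Z_ALPHA_2 = 1.959964  # upper 2.5% point of the standard normal (two-sided 5% test)

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def matched_crt_power(c, n, pi0, pi1, k):
    """Power for c clusters per arm of size n, assuming the
    Hayes-Bennett matched-pair formula with coefficient of variation k."""
    variance_term = (pi0 * (1 - pi0) + pi1 * (1 - pi1)) / n \
        + k ** 2 * (pi0 ** 2 + pi1 ** 2)
    # Solve c = 2 + (z_alpha/2 + z_beta)^2 * variance_term / (pi0 - pi1)^2
    # for z_alpha/2 + z_beta, then convert z_beta to power.
    z_total = sqrt((c - 2) * (pi0 - pi1) ** 2 / variance_term)
    return normal_cdf(z_total - Z_ALPHA_2)

power_main = matched_crt_power(c=15, n=500, pi0=0.0393, pi1=0.0234, k=0.25)
power_k03 = matched_crt_power(c=15, n=500, pi0=0.0407, pi1=0.0242, k=0.30)
```

Under these assumptions `power_main` is approximately 0.95, and `power_k03`, which uses the Setting 1 incidences from the sensitivity analysis with k = 0.3, is approximately 0.91.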

We perform sensitivity analyses for scenarios associated with varying model input parameters that differ between standard of care and intervention communities, such as rates of male circumcision, HIV testing and counseling, and/or linkage to care.

Additional sensitivity analyses for scenarios associated with lower than projected treatment effects and varying rates of losses to follow-up (see

Mathematical modeling plays a critical role in planning and evaluating treatment for prevention [

The data on relationship duration exhibit “heaping”, i.e., grouping around certain values (e.g. integers) because subjects may round their responses. We know of no systematic tendency to round responses up or down, but even if one exists, we expect no substantial effect of heaping because the transmission probability per day is small. Patterns of sexual behavior and networking vary across populations. Because sexual network structure information for the communities under study is not available, we allow for considerably greater than observed variation in network structures by sampling degree distribution from a negative binomial distribution whose parameters were estimated from Likoma Island network data.
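To give a sense of scale for the per-day transmission probability, the sketch below converts the model's transmission inputs (per 100 person-years, by viral load category) to daily probabilities. The constant-hazard conversion, the multiplicative combination of risk reductions, and the function name are our assumptions for illustration, not a description of the model's internals.

```python
from math import exp

# Transmissions per 100 person-years by viral load category (model inputs).
RATE_PER_100PY = {
    "<400": 1.0,
    "400-3,499": 4.8,
    "3,500-9,999": 12.0,
    "10,000-49,999": 14.0,
    "50,000+": 23.0,
}

def daily_transmission_prob(vl_category, condom=False, circumcised_partner=False,
                            knows_status=False):
    """Per-day transmission probability under a constant-hazard assumption."""
    rate = RATE_PER_100PY[vl_category] / 100.0  # per person-year
    if condom:
        rate *= 1 - 0.85   # reduction in transmission risk for condoms
    if circumcised_partner:
        rate *= 1 - 0.60   # reduction in acquisition risk from circumcision
    if knows_status:
        rate *= 1 - 0.30   # reduction from knowledge of serostatus
    return 1 - exp(-rate / 365.0)

p_high = daily_transmission_prob("50,000+")
p_low = daily_transmission_prob("<400", condom=True)
```

Even in the highest viral load category the daily probability is well below one in a thousand, so shifting a relationship's recorded duration by a few days or weeks through rounding changes the expected number of transmissions very little.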

Our model did not incorporate different types of sexual relationships, e.g., regular and casual, with different frequencies of sex and probabilities of condom use; the assumption that variation in these factors does not greatly affect outcomes reflects limited available information. The impact of the intervention could be affected by differential rates of treatment uptake for people engaged in various types of relationships. The model also does not specifically target concurrency metrics, about which little relevant data are available. Some mathematical models imply an important role for concurrency, but correlation of concurrency and incidence was not observed in rural South Africa [

Although our simulation study assigns initial infection status randomly among the population, correlation may exist between HIV status and network properties. Further work is necessary to properly account for this potential correlation. Data currently available from Botswana are ego-centric, precluding estimation of the correlation. Using only partnerships within the same household may produce biased estimates, as multiple partnerships are common in Botswana and many partners are not co-habiting. Ego-centric data also limit our ability to estimate parameters associated with mixing by activity level. Our model also assumes independence of knowledge of HIV infection status and sexual practice due to lack of available information.

Our simulation model randomly samples individuals, but the Botswana study will enroll all eligible members of randomly selected households. We expect the difference between the two sampling strategies to be small because in Botswana, many sexual partners do not live together, implying that correlation in HIV infection rates within household members may not be higher than that between households. If this does not hold, the treatment effect estimate from our model would not be affected, but the

All HIV incident cases are modeled to arise from within the simulated pair of communities. In the Botswana study, communities outside of the trial will receive standard of care. As it is possible that there will be a greater uptake of services in the control arm compared to the communities outside of the trial, sexual contacts with communities outside of the trial may modestly increase incidence in the control arm. For the intervention communities, the effect of mixing with outside communities should be mostly captured through our model of mixing with the control communities, though the effect of this mixing could be slightly greater if incidence is higher in the outside than in the control communities. We would expect only modest effects of mixing with outside communities above and beyond the mixing across study communities randomized to different conditions. Any increase in HIV incidence in control communities will result in a larger treatment effect and greater power than projected.

The Botswana study is one of two large HIV prevention trials commissioned by the President's Emergency Plan for AIDS Relief that are currently underway. The other is HPTN 071 [

This research was supported by R01 AI24643, R01 AI51164, and R01 AI083036 from the National Institutes of Health, and U01 GH000447 from the Centers for Disease Control and Prevention. We thank the Editor, Associate Editor, and three reviewers for their comments, which improved the paper.

A schematic illustration of a static network of 2 communities. Solid circles and open circles represent individuals in different communities. Within each community, the location of circles does not represent their geographical locations.

Histogram of relationship durations and the corresponding Kaplan-Meier estimates in Mochudi.

Cumulative incidence of intervention and standard of care (SOC) communities over the 3-year period with varying levels of mixing, based on input parameters listed in

Number of clusters per arm versus cluster size needed to ensure >90% power to detect anticipated differences in 3-year cumulative HIV incidence between standard of care (3.93%) and intervention arms (2.34%), for varying coefficient of variation k.

Power to detect varying potential reductions of intervention effect in 3-year cumulative HIV incidence with varying rates of losses to follow-up.

Model input parameters to estimate impact of combination prevention package scale-up in intervention communities versus standard of care communities over 3 years

Parameters common to both communities:

Parameter | Value |
---|---|
Duration of Partnerships | See |
Degree Distribution | Negative Binomial (r = 5, p = 0.7, cutoff = 7) |
Probability of Transmission per 100 person-years | |
Viral Load < 400 copies/ml | 1 |
Viral Load 400–3,499 copies/ml | 4.8 |
Viral Load 3,500–9,999 copies/ml | 12 |
Viral Load 10,000–49,999 copies/ml | 14 |
Viral Load 50,000+ copies/ml | 23 |
HIV prevalence | 24.8% |
Percent on treatment at time 0 among those eligible (CD4 < 350 cells/mm^{3}) | 60.9% |
Reduction in transmission risk from knowledge of serostatus | 30% |
Duration of high viral load after infection | Estimates from the Botswana/Durban cohort |
Rate of CD4 decline | Estimates from the Botswana/Durban cohort |
Reduction in acquisition risk from circumcision | 60% |
Reduction in transmission risk for condoms | 85% |
Percent of individuals using condoms | 40% |

Parameters that differ by treatment arm:

Time point | Standard of Care Arm: HTC | Standard of Care Arm: MC | Standard of Care Arm: Linkage to Care | Intervention Arm: HTC | Intervention Arm: MC | Intervention Arm: Linkage to Care |
---|---|---|---|---|---|---|
Baseline | 37% | 12.7% | 80% | 37% | 12.7% | 80% |
End of Year 1 | 37% | 31.4% | 80% | 81% | 46.4% | 90% |
End of Year 2 | 45% | 50.0% | 80% | 90% | 80% | 90% |
End of Year 3 | 52% | 60.0% | 80% | 90% | 80% | 90% |

HTC: HIV testing and counseling.

MC: Male circumcision.

The Botswana HIV/AIDS impact survey III results, 2008.

Male circumcision campaigns in standard of care communities will be ongoing, and may reach 60% coverage by the end of year 3 post randomization, if Ministry of Health targets are met.

Assume that the project aims to increase HIV testing and counseling coverage to ≥90% in intervention communities by the end of the second study year and maintain this thereafter.

Assume that the project aims to reach 80% male circumcision coverage in intervention communities by the end of the second study year and maintain this thereafter.

Projected cumulative HIV incidence in standard of care versus intervention communities over 3 years of study follow-up, based on results from 1500 pairs of communities.

Time point | Standard of Care: Cumulative Incidence | Intervention: Cumulative Incidence | % Reduction |
---|---|---|---|
End of Year 1 | 1.74% | 1.42% | 18.4% |
End of Year 2 | 2.98% | 1.99% | 33.2% |
End of Year 3 | 3.93% | 2.34% | 40.5% |

Model input parameters, projected 3-year cumulative incidences and power associated with selected settings of sensitivity analyses, based on results from 1500 pairs of communities.

Time point | Setting 1 (MC): SOC | Setting 1 (MC): Intervention | Setting 2 (HTC): SOC | Setting 2 (HTC): Intervention | Setting 3 (Linkage to Care): SOC | Setting 3 (Linkage to Care): Intervention | Setting 4 (Varying all three): SOC | Setting 4 (Varying all three): Intervention |
---|---|---|---|---|---|---|---|---|
Baseline | 12.7% | 12.7% | 37% | 37% | 60% | 70% | MC | |
End of Year 1 | 31.4% | 46.4% | 37% | 70% | 60% | 70% | | |
End of Year 2 | 31.4% | 46.4% | 37% | 70% | 60% | 70% | | |
End of Year 3 | 31.4% | 46.4% | 37% | 70% | 60% | 70% | | |
3-Year Cumulative Incidence | 4.07% | 2.42% | 4.06% | 2.59% | 3.89% | 2.34% | 4.28% | 2.65% |

Power | Setting 1 | Setting 2 | Setting 3 | Setting 4 |
---|---|---|---|---|
k=0.3 | 91% | 82% | 89% | 87% |

MC: Male circumcision.

HTC: HIV testing and counseling.

SOC: Standard of care.