The parametric g-formula can be used to estimate the effect of a policy, intervention, or treatment. Unlike standard regression approaches, the parametric g-formula can be used to adjust for time-varying confounders that are affected by prior exposures. To date, there are few published examples in which the method has been applied.
We provide a simple introduction to the parametric g-formula and illustrate its application in analysis of a small cohort study of bone marrow transplant patients in which the effect of treatment on mortality is subject to time-varying confounding.
Standard regression adjustment yields a biased estimate of the effect of treatment on mortality relative to the estimate obtained by the g-formula.
The g-formula allows estimation of a relevant parameter for public health officials: the change in the hazard of mortality under a hypothetical intervention, such as reduction of exposure to a harmful agent or introduction of a beneficial new treatment. We present a simple approach to implement the parametric g-formula that is sufficiently general to allow easy adaptation to many settings of public health relevance.
Imagine an oncologist knocks on your door with the following problem: she wants to know how much she could reduce mortality among her bone marrow transplant patients by prescribing a new drug that prevents graft-versus-host disease, a side effect of allogeneic marrow transplantation.
The g-formula is an analytic tool for estimating standardized outcome distributions using covariate (exposure and confounders) specific estimates of the outcome distribution.
Epidemiologists often use regression models (for example, the Cox proportional hazards model) to adjust for confounding; this is equivalent to estimating stratum-specific hazard ratios and then averaging the information-weighted hazard ratios. When some of those confounders are also causal intermediates, this amounts to adjusting away some of the effect of exposure.
We show how the g-formula can be used with standard software tools that many epidemiologists already employ, and we illustrate it using publicly-available data from a small cohort study with accompanying SAS code in an
Using regression methods to control confounding requires making the assumption that the effect measure is constant across levels of confounders included in the model. Alternatively, standardization allows us to obtain an unconfounded summary effect measure without requiring this assumption. The g-formula is a generalization of standardization and can be expressed similarly. For example, the 10-year risk of death for a group of individuals, standardized across some dichotomous (1,0) risk factor
The g-formula relies on this same set of calculations. Suppose now that we have a cohort of 50-year-old men and we wish to estimate the change in 10-year risk of death from some exposure,
Assume 50% are exposed at the start. As shown in the calculations from
Use of the g-formula for effect estimation proceeds using the standardization formula above and substituting new values for probabilities of exposure defined by a hypothetical intervention. For example, if we wished to estimate the effect of exposure on the 10-year risk of death, we would use the g-formula to calculate the risk of death if we intervened to make all individuals “always exposed “ (i.e. set Pr(
While standardization using the g-formula is simple enough to do by hand in our hypothetical example, epidemiologic studies often have richer covariate sets and longer periods of follow-up. In such studies, stratification by many covariates can quickly lead to strata in which there are not enough data to calculate the conditional probabilities, and use of continuous covariates does not allow this stratification approach. Instead of directly calculating every conditional probability, we can use parametric regression modeling to calculate the conditional probabilities used to carry out computations shown above. Further, we can simulate data from the conditional probability distributions to approximate standardization. Robins referred to this approach of using modeled conditional probabilities to estimate standardized effect measures as the parametric g-formula.
We use the parametric g-formula to estimate the hazard ratio comparing mortality in a cohort of bone marrow transplant recipients (described in detail by Copelan etal
The study population arose from a multicenter trial of leukemia patients and comprises 137 individuals prepared for bone marrow transplants under a radiation-free regimen at four medical centers. Allogeneic bone marrow transplants were performed between 1 March 1984 and 30 June 1989, and patients were followed until death or administrative censoring at 5 years following transplant. Baseline covariates at time of transplant included age, sex, leukemia type (acute lymphocytic or acute myeloid leukemia), wait time from leukemia diagnosis to transplantation, and cytomegalovirus immune status (yes or no). Patients were followed to assess when, if at all, platelets returned to the normal range (as a measure of immune function) and the patient experienced leukemia relapse.
We illustrate how to apply the parametric g-formula to the cohort of bone marrow transplant recipients to estimate the effect of a hypothetical intervention that prevents graft-versus-host disease from occurring. While the cohort is small, our example can be easily adapted to larger observational studies with long-term follow-up.
We use a 3-step algorithm for the parametric g-formula to estimate hazard ratios comparing: a new drug that prevents graft-versus-host disease from occurring during follow up (“prevented”) versus no intervention (“natural course”). We compare “natural course” to “prevented” as a measure of how effective a drug that prevents graft-versus-host disease would have been at preventing mortality in our cohort (or similar populations). We present the algorithm in compact form in
We start with a person-period data set in which each record corresponds to one person-day and time-varying covariates are represented as (0,1) dichotomous variables for each person day. Within this person-period data, we fit a pooled logistic model (i.e. a logistic model fit to all person periods) to estimate the log-odds of each of the time-varying covariates (graft-versus-host disease, platelet level, relapse, death), for each person period (i.e. the conditional log-odds of the covariate taking on the value “1”).
From our original data of N=137 patients, we re-sampled with replacement M=137,000 “pseudo-patients,” retaining only baseline covariates. The sample should be as large as practical to minimize simulation error. This simulation is carried out as follows.
We simulated time-varying covariate and outcome data for each of the pseudo-patients using conditional probabilities generated in Step 1 and baseline covariates (with time-varying covariates set to 0 at baseline). Each pseudo-patient received values for these covariates on each day based on a draw from a Bernoulli distribution with the conditional mean given by the probability modeled in Step 1, above. Essentially, on each day
The dataset from this step is known as the “natural course” because there is no intervention. To ensure that covariate distributions from the “natural course” closely follow distributions in the observed data, we examined the Kaplan-Meier survival curve for death (
We repeated Step 2 using an intervention “prevented,” set graft-versus-host disease to 0 and generate all other covariates, forcing graft-versus-host disease to remain 0. By setting the values of graft-versus-host disease in the dataset, the proportion of deaths in the simulated dataset in which we set graft-versus-host disease=0 approximates the solution to the explicit formulas shown in our hypothetical example (in which the conditional probabilities of graft-versus-host disease onset are all set to 0).
We concatenated the datasets from Step 2, estimating the hazard ratio by comparing the hazards in the “natural course” dataset with those in the “prevented” dataset. This was done by using an indicator variable for the dataset (1=“natural course,” 0=“prevented”) and using that indicator as the exposure variable in a Cox model.
To estimate confidence intervals for the hazard ratio, we repeated Steps 1-3 on 4000 samples of size 137 taken at random with replacement from the original data. The standard deviation (SD) of the 4000 log-hazard ratios approximates the standard error of the log-hazard ratio, and was used to calculate 95% confidence intervals (CIs) using the normal approximation: log-hazard ratio ±1.96*SD(log-hazard ratio).
To compare the g-formula with standard methods, we estimated crude and covariate conditional hazard ratios (and 95% confidence intervals) for the effect of graft-versus-host disease on mortality using a Cox proportional hazards model for time-varying data (with observed data).
Patients had a median age of 28 years (interquartile range=21-35), and 60% were male (
The crude hazard ratio comparing the hazard of all-cause mortality among patients with and without graft-versus-host disease was 1.2 (95% CI=0.77-2.0;
In the simulated data under the natural course, the cumulative distribution functions for platelet count, relapse, death, and graft-versus-host disease closely followed those in the observed data (
In our analysis of survival data using a simple and yet easily adaptable application of the g-formula, we estimated the effects on mortality of an intervention to prevent graft-versus-host disease immediately after bone marrow transplant. Almost 60% of the bone marrow transplant patients in this cohort experienced graft-versus-host disease under the natural course; however, compared with a hypothetical intervention to prevent graft-versus-host disease, under the natural course we observed an increase in the relative mortality hazard of only 10%. Preventing graft-versus-host disease does not appear to markedly reduce mortality risk because other factors that influence the risk of subsequent mortality, such as leukemia relapse, are not decreased by the hypothetical intervention to prevent graft-versus-host disease. Rather, preventing graft-versus-host disease may increase the rate of relapse: 42% of “pseudo-individuals” experienced relapse under the “prevented” intervention, whereas 33% experienced relapse under the “natural course.” These observations agree with the typically coincident occurrences of graft-versus-host disease and graft-versus-leukemia, in which donor cells attack residual leukemia cells in the transplant recipient and may help to prevent relapse. Graft-versus-host disease has been observed to correlate with a lower rate of leukemia relapse and is hypothesized to reduce the probability of relapse through immune mediation processes, of which normal platelet count is an indicator.
Cox regression analysis with adjustment for time-varying covariates suggested a substantially higher mortality hazard for persons who have graft-versus-host disease when compared with those who do not have this disease. Because the Cox model cannot appropriately control time-varying confounding when the confounders are causal intermediates, we hypothesize that the Cox model overestimates the total effect of graft-versus host disease on mortality. Therefore, the difference between the g-formula estimate and the regression adjustment method may be due, in part, to over-adjustment, which introduces bias into our estimate of the total effect of graft-versus-host disease.
The g-formula can be used to estimate risk ratios or differences, which are easier to interpret and less subject to some biases than are hazard ratios.
There is little knowledge about how potential time-varying confounding by relapse and platelet levels affects estimates of excess mortality due to graft-versus-host disease. G-methods (the body of methods that derive from the g-formula, including structural nested models and marginal structural models) may provide an avenue toward a clearer understanding. G-methods have previously been shown to yield results congruent with clinical trial data (where time-varying confounding may be minimized) in situations where conventional analyses of observational data yield incongruous estimates.
In all of these studes except the one by Ahern et al., the potential for time-varying confounding precluded the use of conventional regression approaches. In all cases, the g-formula naturally lent itself to estimating the effects of potential interventions. Rather than estimating the more familiar contrast in measures of disease occurrence for a unit change in the exposure, the g-formula provides an estimate of outcome occurrence under a specific treatment regime under study. While regression adjustment and the g-formula allow for estimation of “etiologic” hazard ratios, only the g-formula easily allows estimation of the effect of preventing graft-versus-host disease in the population, which may be more useful for informing population-level interventions.
We must make several assumptions when using of the g-formula to estimate the effects of exposure on an outcome in observational data.
As epidemiologists, we may strive to measure and control for strong confounders of the exposure-outcome relationship to avoid making inference based on spurious relationships, and we often assume we have been successful – this is the assumption of conditional exchangeability. In our g-formula example, we impute the potential outcomes for each person based on an evolving covariate history generated by predictive models for the exposure, covariates and the outcome. If there were a strong, unmeasured, baseline confounder, we would have omitted an important variable from both the exposure model and the outcome model. However, as Robins et al
In more complex settings, such as if the omitted confounder were also a cause of another time-varying covariate, the g-formula may be subject to more bias than conventional models due to unmeasured confounding. Intuitively, this is because we could be accumulating bias over multiple models, rather than a single outcome model (as in the standard Cox model). However, we should note that, in complex time-varying settings, conventional models require stratification over time-varying confounders, which may be problematic for reasons discussed above, and conventional models may not be able to estimate useful effect measures, such as the impact of an intervention on a population. The g-formula is not subject to either of these shortcomings. Ultimately, the impact of unmeasured confounding in the g-formula (versus conventional models) is an open question; future work will provide guidance to epidemiologists working in settings where unmeasured confounding is expected to be problematic.
While sensitivity analyses exist for examining robustness to the assumption of no unmeasured confounding,
We assume the effect of exposure is the same whether we set it, as in the case of Step 5 of our algorithm or a clinical trial, or if it occurs naturally.
As an anonymous referee pointed out, our analysis can be used to guide future clinical trials by informing on the effects of a drug that completely prevents graft-versus-host disease – our estimate of a 10% reduction in 5-year mortality could used to plan such a study. If one were to study the effect of, say, a population intervention that dictated an instantaneous shift in body-mass-index (BMI) category, then g-formula results would not estimate the effect we could observe in any clinical trial.
The parametric g-formula is especially vulnerable to the assumption of correct model specification, due to the use of multiple models. As an example from our study, if the true effect of wait time to transplant had a quadratic association with exposure and a cubic association with mortality, then we would have misspecified two models and would likely have introduced bias into the hazard ratio. Our g-formula algorithm allows informal checking of this assumption by comparing the observed data to the data simulated under the natural course (e.g.
The g-formula does not require more information than standard methods to estimate effects of exposures or interventions, and its is not subject to bias due to stratifying on variables affected by exposure. The g-formula does not assume that the hazard ratio is homogenous over the levels of the confounders – that is, we are not restricted to estimating a summary hazard ratio that conditions on the covariates. Unlike methods that use stratification of the data by covariates, and then obtain a summary hazard ratio under the assumption of a constant hazard ratio over levels of the confounder, the g-formula permits us to obtain a summary hazard ratio without stratifying the effect measure over covariates. This relaxed assumption comes at the cost of wider confidence intervals, so there is a bias-variance trade-off to consider when deciding if the g-formula is an appropriate statistical tool for an epidemiologic analysis. If one knows that time-varying confounding is not present, or that exposure could not affect subsequent confounders, then standard regression methods may be more appropriate, due mainly to concerns about model misspecification in the g-formula. However, if time-varying confounding is possible, use of the g-formula provides some assurance that one is not introducing bias by inappropriately stratifying on effects of exposure, and it is useful to check results from standard regression models against those from the g-formula, as we have shown.
We could have estimated the hazard ratio for the marginal effect of graft-versus-host disease on mortality using inverse-probability weighted marginal structural models or adaptations of structural nested models.
Though the g-formula was first described in 1986 by Robins,
Directed acyclic graph showing hypothesized causal relationships among study variables for days
Parametric g-formula algorithm for bone marrow transplant data.
Incidence curves for (A.) death. (B.) return to normal platelet levels, (C.) relapse and (D.) graft-versus-host-disease (GvHD) from both observed (gray line) and g-formula natural course Monte Carlo (black line) data.
Survival functions: observed from the bone marrow transplant data; from the natural-course intervention in the g-formula; and from the hypothetical intervention “prevented” graft-versus-host disease (top line) after bone marrow transplants using the g-formula. The gray line indicates the observed survival curve, while the solid black lines indicate the survival curves from the Monte Carlo data for the g-formula interventions.
Estimating risk under interventions using the g-formula with the hypothetical example given in methods section of the main text.
| G-formula component | Value |
|---|---|
|
| |
| PR(exposed): 0.5 | |
| Risk among exposed: 0.025 | |
| Risk among unexposed: 0.025 | |
| Pr(exposed): 0.5 | |
| Risk among exposed: 0.060 | |
| Risk among unexposed: 0.051 | |
| 0.051*(1/2)*(1-0.025*(1/2)) + 0.025*(1/2) + | |
| 0.060*(1/2)*(1-0.025*(1/2)) + 0.025*(1/2) = 0.080 | |
| 0.051*(0)*(1-0.025*(0)) + 0.025*(0) + | |
| 0.060*(1)*(1-0.025*(1)) + 0.025*(1) = 0.084 | |
| 0.051*(1)*(1-0.025*(1)) + 0.025*(1) + | |
| 0.060*(0)*(1-0.025*(0)) + 0.025*(0) = 0.075 | |
| 0.051*(0)*(1-0.025*(0)) + 0.025*(1/2) + | |
| 0.060*(1)*(1-0.025*(1)) + 0.025*(1/2) = 0.084 | |
Characteristics
| Male | 80 | (58) | |
| Female | 57 | (42) | |
| No | 65 | (47) | |
| Yes | 72 | (53) | |
| No | 94 | (69) | |
| Yes | 43 | (31) | |
| No | 17 | (12) | |
| Yes | 120 | (88) | |
| ALL | 38 | (28) | |
| Low-risk AML | 54 | (39) | |
| High-risk AML | 45 | (33) | |
| Negative | 69 | (50) | |
| Positive | 68 | (50) | |
| Alive or censored | 57 | (42) | |
| Dead | 80 | (58) | |
| Age (years); median (IQR) | 28 | (21-35) | |
| Days to transplant; median (IQR) | 6 | (4-8) | |
| Follow up; median (IQR) | 547 | (183-1377) | |
| Days to relapse; median (IQR) | 467 | (122-1363) | |
| Days to normal platelets; median (IQR) | 18 | (14-27) |
No. (%), unless otherwise specified
ALL indicates acute lymphocyte leukemia; AML, acute myeloid leukemia; IQR, interquartile range
Hazard ratios comparing the hazard of all-cause mortality between patients with and without graft-versus-host disease (regression) or comparing cohorts with and without hypothetical intervention to prevent graft-versus-host disease (g-formula)
| Method | HR | (95% CI) | |
|---|---|---|---|
| Crude | 1.2 | (0.77-2.0) | |
| Baseline adjusted | 1.2 | (0.71-1.9) | |
| Fully adjusted | 2.3 | (1.4-3.9) | |
| Natural course vs. Prevent | 1.1 | (0.91-1.3) |
Baseline covariates include age at date of bone marrow transplant, wait time until transplant, sex, and cytomegalovirus status at baseline.
Adjusted for baseline covariates above and time-varying covariates, including days during which platelets had not returned to normal, cumulative days the patient had not experienced relapse, and indicators for relapse and platelets returning to normal on a given day.
Comparing the hazard of all-cause mortality between the entire cohort simulated under no intervention and the entire cohort of simulated to be unexposed (referent) at all time points.