Am J Ind Med (American Journal of Industrial Medicine). ISSN: 0271-3586, 1097-0274. doi:10.1002/ajim.23471. NIHMS1878275. PMID: 36938776. PMCID: PMC10164112.

Article

A tutorial on a marginal structural modeling approach to mediation analysis in occupational health research: investigating education, employment quality, and mortality

Jerzy Eisenberg-Guyot, PhD (a,b); Kieran Blaikie, MPH (b); Sarah B. Andrea, PhD (c); Vanessa Oddo, PhD (d); Trevor Peckham, PhD (e); Anita Minh, PhD (b,f); Shanise Owens, MS (g); Anjum Hajat, PhD (b)

(a) Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA
(b) Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, USA
(c) School of Public Health, Oregon Health & Science University-Portland State University, Portland, OR, USA
(d) Department of Kinesiology and Nutrition, College of Applied Health Sciences, University of Illinois at Chicago, Chicago, IL, USA
(e) Department of Environmental and Occupational Health Services, School of Public Health, University of Washington, Seattle, WA, USA
(f) School of Population and Public Health, University of British Columbia, Vancouver, BC, CA
(g) Department of Health Systems and Population Health, School of Public Health, University of Washington, Seattle, WA, USA

Author contributions:

JEG conceived and designed the study, acquired the data (jointly with KB), conducted the analyses, interpreted the results, and drafted the initial version of the manuscript. All other authors advised JEG on study conceptualization and design and results interpretation, and provided feedback on subsequent drafts of the manuscript.

Institution at which the work was performed:

Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA

Corresponding author: Jerzy Eisenberg-Guyot, PhD, Department of Epidemiology, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY 10032, je2433@cumc.columbia.edu
Am J Ind Med. 2023;66(6):472–483.

Life expectancy inequities between more- and less-educated groups have grown 1–2 years over the last several decades in the U.S. Simultaneously, employment conditions for many workers have deteriorated. Researchers hypothesize these adverse conditions mediate educational inequities in mortality. However, methodological barriers have impeded research on the role of employment conditions and other hazards as mediating factors in health inequities. Indeed, traditional mediation analysis methods are often biased in occupational health settings, including in those with exposure-mediator interactions and mediator-outcome confounders that are caused by exposure. In this paper, we outline – and provide code for – a marginal structural modeling (MSM) approach for estimating total effects and controlled direct effects originally proposed elsewhere, which can be applied to common mediation analysis settings in occupational health research. As an example, we apply our approach to assess the extent to which disparities in employment quality (EQ) – a multidimensional construct characterizing the terms and conditions of the worker-employer relationship – explained educational inequities in mortality in a 1999–2015 U.S. Panel Study of Income Dynamics sample of workers with mortality follow-up through 2017. Under certain strong assumptions described in the text, our estimates suggest over 70% of the educational inequity in mortality would have been eliminated if EQ had been at the 80th percentile (100th=best) across exposure groups.

Keywords: controlled direct effect; mediation analysis; health disparities; health inequities; occupational health; social epidemiology; precarious employment

INTRODUCTION

Overview

Over the last several decades, U.S. life expectancy inequities across educational groups have grown 1–2 years.1 Concomitantly, employment conditions have deteriorated as employers and states have eroded worker power and security, contributing to union-membership losses, stagnant wages, and debased benefits and hours.2 Researchers hypothesize these adverse employment conditions mediate the education-mortality relationship (and mediate the relationship between other social exposures and mortality).3,4 However, pathways linking employment conditions and health are intertwined with other social determinants,3 complicating analysis.5–7 For example, occupation often confounds the relationship between employment conditions and health and is caused by social exposures like education. Moreover, social exposures may modify the health effects of employment conditions. This exposure-induced mediator-outcome confounding and exposure-mediator interaction, respectively, can compromise the validity of traditional mediation methods, like the difference or product methods.8,9 Thus, although limited research has been conducted,4,10 methodological barriers have hindered research on the role of employment conditions and hazards in mortality inequities, impeding identification of mediating mechanisms and the development of policy and organizing solutions.

In this paper, we outline and provide code for a marginal structural modeling (MSM) approach to mediation analysis that can be used to estimate policy-relevant effects in occupational health research, including in settings with exposure-induced mediator-outcome confounding and exposure-mediator interaction. We also cover the approach’s challenges, including the strong assumptions required for causal inference. Due to their apparent complexity, MSM approaches are rarely applied to occupational health research, particularly to mediation analyses.5 Thus, by outlining the approach in a didactic manner accessible to applied occupational health researchers, our paper makes an important methodological contribution. Our worked example assesses the extent to which disparities in employment quality (EQ) explained educational inequities in mortality.

Total effects and controlled direct effects

This section describes an MSM mediation approach proposed elsewhere.8,9 The approach can be used to estimate: 1) the total effect (TE) the exposure had on the outcome, and 2) the controlled direct effect (CDE) the exposure would have had on the outcome if the mediator had been constant across exposure groups.

Under the potential outcomes framework, the TE of a binary exposure E on a continuous outcome Y can be defined as:8,9,11

$$TE = E(Y_{e=1}) - E(Y_{e=0})$$

where $E(Y_{e=1})$ is the expected value of Y in a sample if all respondents – possibly contrary to fact – had been exposed (e=1), while $E(Y_{e=0})$ is the expected value of Y in a sample if all respondents – again, possibly contrary to fact – had been unexposed (e=0). Contrasting these quantities captures every pathway through which the exposure affected the outcome, whether through the mediator or otherwise.8,9

Meanwhile, the CDE of a binary exposure E on a binary outcome Y can be defined as:8,9

$$CDE = E(Y_{e=1,\,m=m^*}) - E(Y_{e=0,\,m=m^*})$$

where $E(Y_{e=1,\,m=m^*})$ is the expected value of Y in a sample if all respondents – possibly contrary to fact – had been exposed (e=1) and their mediator M had been at level m*, while $E(Y_{e=0,\,m=m^*})$ is the expected value of Y in a sample if all respondents – again, possibly contrary to fact – had been unexposed (e=0) and their mediator M had been at level m*. Contrasting these quantities captures the effect the exposure would have had on the outcome if all respondents, regardless of their exposure, had the same mediator value, such as an occupational hazard compliant with a regulatory standard.8,9

As defined above, the TE and CDE are counterfactual contrasts.8,11 Because outcomes for respondents are only observable under one exposure-mediator combination at a given time, counterfactual contrasts cannot be measured directly.8,11 Rather, they must be estimated using study design and modeling approaches.8,11 In observational settings, such contrasts can be consistently estimated if one sufficiently controls for confounding of the exposure-outcome and mediator-outcome relationships such that the distribution of confounders is equal across exposure-mediator subgroups. In such settings, differences in observed outcomes across subgroups can be attributed to effects of the exposure and mediator rather than to effects of confounders.8,11 The MSM approach described below is one method that can be used to consistently estimate TEs and CDEs in observational settings, given the validity of the requisite assumptions.
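
To make the two counterfactual contrasts concrete, the following minimal Python sketch computes a TE and a CDE from a toy table of potential outcomes. The four respondents and all outcome values are hypothetical, invented purely for illustration; in real data only one potential outcome per respondent is ever observed, which is why the estimation steps described in the next section are needed.

```python
# Hypothetical potential outcomes for four respondents (illustrative values only).
# y_e1 / y_e0: outcome if exposed / unexposed, mediator left at its natural value.
# y_e1_mstar / y_e0_mstar: outcome if exposed / unexposed with the mediator set to m*.
potential_outcomes = [
    {"y_e1": 10.0, "y_e0": 6.0, "y_e1_mstar": 7.0, "y_e0_mstar": 6.0},
    {"y_e1": 8.0, "y_e0": 5.0, "y_e1_mstar": 6.0, "y_e0_mstar": 5.0},
    {"y_e1": 9.0, "y_e0": 7.0, "y_e1_mstar": 8.0, "y_e0_mstar": 7.0},
    {"y_e1": 13.0, "y_e0": 6.0, "y_e1_mstar": 7.0, "y_e0_mstar": 6.0},
]


def expected_value(key):
    """Sample mean of one potential outcome across all respondents."""
    return sum(row[key] for row in potential_outcomes) / len(potential_outcomes)


# TE = E(Y_{e=1}) - E(Y_{e=0})
te = expected_value("y_e1") - expected_value("y_e0")

# CDE = E(Y_{e=1, m=m*}) - E(Y_{e=0, m=m*})
cde = expected_value("y_e1_mstar") - expected_value("y_e0_mstar")

print(f"TE = {te}, CDE at m* = {cde}")
```

In this toy table the CDE is smaller than the TE, which is the pattern one would expect when part of the exposure's effect operates through the mediator.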

Mediation analysis using marginal structural modeling

MSM mediation analyses proceed in three primary steps: 1) estimating exposure inverse probability weights (IPW), 2) estimating mediator IPW, and 3) estimating TEs and CDEs using weighted regression models.8,9 eAppendix 1 contains R code and Table I contains a high-level overview of the approach.

The first step involves estimating exposure IPW, which can be used to address confounding of the exposure-outcome relationship by creating a weighted pseudo-population in which measured confounders are unassociated with exposure.8,9 For a binary exposure E, the exposure IPW $w_i^E$ for each respondent i can be defined as:

$$w_i^E = \frac{P(E = e_i)}{P(E = e_i \mid X = x_i)}$$

where the denominator is the probability the respondent experienced their observed exposure value (E = e_i), given their values of any measured exposure-outcome confounders (X = x_i) suggested by theory and one’s directed acyclic graph (DAG).8,9 Brookhart et al. provide further guidance on confounder/covariate selection.12 The numerator can be the exposure’s marginal (unconditional) probability in the sample [P(E = e_i)] (which “stabilizes” the weight) or one (less common, since stabilization increases efficiency).8,9 For binary exposures, numerator and denominator probabilities can be estimated using logistic models, while for ordinal or categorical exposures, probabilities can be estimated using ordinal or multinomial logistic models.8,13 For continuous exposures, probabilities can be estimated using quantile-binning approaches or by replacing the probabilities with probability density functions estimated by linear or gamma models.8,13 The final IPW should have a mean near one and moderate range.14 Using stabilized weights or truncating extreme weights can improve precision of the estimated TE and CDE, although truncation can increase residual confounding.8,9,14 Confounder balance across exposure values after weighting can be examined using balance statistics; imbalance should be minimal post-weighting to mitigate residual confounding.15 Imbalance can be addressed by modifying one’s weighting model (e.g., by altering continuous-variable specification or including interactions).14,15
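
As an illustration of the stabilized exposure weight, the Python sketch below computes it for a toy sample with one binary confounder. It is a simplification under stated assumptions: the cell counts are hypothetical, and probabilities are taken as simple stratum frequencies rather than fitted with the logistic models described above. It then checks two properties mentioned in the text: the weights average one, and the confounder is unassociated with exposure in the weighted pseudo-population.

```python
from collections import Counter


def stabilized_exposure_weights(rows):
    """rows: list of (e, x) pairs. Returns P(E=e_i) / P(E=e_i | X=x_i) per row,
    with both probabilities estimated as simple sample frequencies."""
    n = len(rows)
    n_e = Counter(e for e, _ in rows)
    n_x = Counter(x for _, x in rows)
    n_ex = Counter(rows)
    weights = []
    for e, x in rows:
        numerator = n_e[e] / n              # marginal P(E = e_i)
        denominator = n_ex[(e, x)] / n_x[x]  # conditional P(E = e_i | X = x_i)
        weights.append(numerator / denominator)
    return weights


# Hypothetical sample: the confounder (x=1) is more common among the exposed.
rows = [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 20 + [(0, 0)] * 40
w = stabilized_exposure_weights(rows)
mean_weight = sum(w) / len(w)


def weighted_share_x1(e_value):
    """Weighted proportion with x=1 among respondents with exposure e_value."""
    num = sum(wi for wi, (e, x) in zip(w, rows) if e == e_value and x == 1)
    den = sum(wi for wi, (e, x) in zip(w, rows) if e == e_value)
    return num / den


print(mean_weight, weighted_share_x1(1), weighted_share_x1(0))
```

With model-based probabilities and many confounders, balance will generally be approximate rather than exact, which is why the text recommends checking balance statistics after weighting.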

The second step involves estimating mediator IPW to address confounding of the mediator-outcome relationship.8,9 For a binary mediator M, the mediator IPW $w_i^M$ for each respondent i can be defined as:

$$w_i^M = \frac{P(M = m_i)}{P(M = m_i \mid E = e_i,\ X = x_i,\ Z = z_i)}$$

where the denominator is the probability the respondent experienced their observed mediator value (M = m_i), given their exposure value (E = e_i), their values of exposure-outcome confounders (X = x_i), and their values of mediator-outcome confounders (Z = z_i). Again, the numerator can be the mediator’s marginal probability in the sample or one,8,9 extreme weights can be truncated,14 weights should have a mean near one,14 and confounder balance can be assessed using balance statistics.15 Weights for ordinal, categorical, and continuous mediators can be estimated as described earlier.8,13
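
The mediator weight can be sketched the same way. In the Python example below, the denominator conditions on the exposure and on a single binary mediator-outcome confounder z; the exposure-outcome confounders X are omitted for brevity, the cell counts are hypothetical, and frequencies stand in for fitted model probabilities. Every exposure-confounder-mediator cell is non-empty, so positivity holds in this toy sample.

```python
from collections import Counter


def stabilized_weights(values, strata):
    """Marginal probability of each observed value divided by its probability
    within the respondent's stratum, both estimated as sample frequencies."""
    n = len(values)
    n_value = Counter(values)
    n_stratum = Counter(strata)
    n_cell = Counter(zip(values, strata))
    return [
        (n_value[v] / n) / (n_cell[(v, s)] / n_stratum[s])
        for v, s in zip(values, strata)
    ]


# Hypothetical number of respondents in each (e, z, m) cell.
cell_counts = {
    (1, 1, 1): 5, (1, 1, 0): 15, (1, 0, 1): 10, (1, 0, 0): 10,
    (0, 1, 1): 12, (0, 1, 0): 8, (0, 0, 1): 30, (0, 0, 0): 10,
}
records = [cell for cell, count in cell_counts.items() for _ in range(count)]
exposure = [e for e, _, _ in records]
confounder = [z for _, z, _ in records]
mediator = [m for _, _, m in records]

# Denominator conditions on exposure and the mediator-outcome confounder.
w_mediator = stabilized_weights(mediator, list(zip(exposure, confounder)))
mean_w_mediator = sum(w_mediator) / len(w_mediator)

print(round(mean_w_mediator, 6), round(min(w_mediator), 3), round(max(w_mediator), 3))
```

Inspecting the minimum and maximum alongside the mean is one simple way to spot the wide weight ranges that the text associates with near-violations of positivity.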

The third step involves estimating the TE and the CDE. TEs can be estimated by fitting a regression model of the outcome as a function of the exposure, weighted by the vector of exposure IPW ($w^E$).16 For a continuous outcome Y and binary exposure E, this model can be written as:

$$E(Y \mid E = e) = \beta_0 + \beta_1 e$$

where $\beta_1$ is the TE of the exposure on the outcome.16 Meanwhile, CDEs can be estimated by fitting a regression model of the outcome as a function of the exposure, mediator, and an exposure-mediator interaction term, weighted by the product of the exposure and mediator weights ($w^E \times w^M$).8,9 For a continuous outcome Y, binary exposure E, and binary mediator M, this model can be written as:

$$E(Y \mid E = e, M = m) = \beta_0 + \beta_1 e + \beta_2 m + \beta_3 e m$$

where $\beta_1$ is the CDE of the exposure on the outcome when M is held at its reference level of 0, so that the $\beta_2$ and $\beta_3$ terms drop from the model.8,9 If there is exposure-mediator interaction, such as if EQ more strongly affected mortality among less-educated people than among more-educated people, the estimated CDE will vary with M’s reference level (0) – such as “high EQ” or “low EQ”.17 If exposure-mediator interaction is anticipated, one can calculate multiple CDEs with varying reference levels. Alternatively, one can choose the reference level based on real-world relevance. For example, if seeking to estimate whether reducing exposure to a chemical hazard could mitigate occupational inequities in lung-cancer mortality among workers in the manufacturing sector, one could choose the reference level based on a hypothetical or proposed regulatory standard for the hazard (e.g., elimination of the hazard or exposure below a given threshold). Likewise, if seeking to estimate whether increasing wages could reduce gender inequities in depression among workers in the service sector, one could choose the reference level based on proposed minimum-wage levels, such as the $15 minimum-wage level proposed by social movements.
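
For a binary exposure and binary mediator, the weighted model with an exposure-mediator interaction is saturated, so its coefficients can be recovered directly from weighted cell means. The Python sketch below uses that shortcut in place of a regression routine; the eight records and their combined weights are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Each record: (exposure e, mediator m, outcome y, combined weight wE * wM).
# All values are hypothetical and chosen only for illustration.
records = [
    (0, 0, 2.0, 1.0), (0, 0, 4.0, 1.0),
    (1, 0, 5.0, 0.5), (1, 0, 8.0, 1.5),
    (0, 1, 1.0, 1.0), (0, 1, 3.0, 1.0),
    (1, 1, 2.0, 2.0), (1, 1, 5.0, 1.0),
]


def weighted_cell_mean(e_value, m_value):
    """Weighted mean outcome among records with the given exposure and mediator."""
    cell = [(y, w) for e, m, y, w in records if e == e_value and m == m_value]
    return sum(y * w for y, w in cell) / sum(w for _, w in cell)


# With binary E and M the interaction model is saturated, so the coefficients
# equal contrasts of the weighted cell means.
beta0 = weighted_cell_mean(0, 0)
beta1 = weighted_cell_mean(1, 0) - beta0
beta2 = weighted_cell_mean(0, 1) - beta0
beta3 = weighted_cell_mean(1, 1) - beta0 - beta1 - beta2

cde_at_m0 = beta1          # mediator held at its reference level (m = 0)
cde_at_m1 = beta1 + beta3  # mediator held at m = 1

print(cde_at_m0, cde_at_m1)
```

The two CDEs differ here because the toy data contain an exposure-mediator interaction, illustrating why the choice of reference level matters.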

Finally, the proportion of the exposure’s effect on the outcome that would have been eliminated if the mediator had been held at a certain value (i.e., “proportion eliminated”) for absolute measures of effect (e.g., risk differences) can be defined as:9,18

$$\text{Proportion eliminated (absolute measures of effect)} = \frac{TE - CDE}{TE}$$

Meanwhile, for relative measures of effect (e.g., risk ratios), the proportion eliminated can be defined as:18

$$\text{Proportion eliminated (relative measures of effect)} = \frac{TE - CDE}{TE - 1}$$
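
As a worked example of the two formulas, the Python sketch below plugs in the point estimates reported in the Results of this paper (TE hazard ratio 1.67; CDE hazard ratios 1.15 and 1.66 with EQ held at the 80th and 20th percentiles, respectively). Applying the relative-measure formula to hazard ratios is an assumption made here for illustration, and the function names are hypothetical.

```python
def proportion_eliminated_absolute(te, cde):
    """Proportion eliminated for absolute measures (e.g., risk differences)."""
    return (te - cde) / te


def proportion_eliminated_relative(te, cde):
    """Proportion eliminated for relative measures (e.g., risk ratios)."""
    return (te - cde) / (te - 1)


# Point estimates reported in the Results (hazard ratios).
te_hr = 1.67
cde_hr_p80 = 1.15  # EQ held at the 80th percentile
cde_hr_p20 = 1.66  # EQ held at the 20th percentile

pe_p80 = proportion_eliminated_relative(te_hr, cde_hr_p80)
pe_p20 = proportion_eliminated_relative(te_hr, cde_hr_p20)

print(f"80th percentile: {pe_p80:.1%}; 20th percentile: {pe_p20:.1%}")
```

The 80th-percentile value is consistent with the "over 70%" summary given in the abstract; these are point estimates only and do not reflect the width of the confidence intervals.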

Robust or bootstrap standard errors should be used to calculate confidence intervals for TEs and CDEs.9 The proposed approach can accommodate common outcome types, including continuous, binary, and survival (provided the outcome is rare) via weighted linear, logistic, and Cox proportional hazards regression, respectively,8,9,16,19,20 and can extend to time-varying settings.8 Additionally, if desired, missing data can be addressed via multiple imputation by chained equations (MICE). The IPW, TE, and CDE should be estimated on each of the multiply imputed datasets and the TE and CDE estimates from each of the datasets pooled using Rubin’s Rules.21
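
The pooling step can be sketched in a few lines of Python. The example below applies Rubin's Rules to hypothetical log hazard ratios and squared standard errors from three imputed datasets (far fewer than one would use in practice); pooling on the log scale and the function name `pool_rubin` are assumptions of this sketch.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's Rules.
    Returns the pooled estimate and its standard error."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log hazard ratios and squared standard errors from 3 imputations.
log_hrs = [0.50, 0.52, 0.48]
squared_ses = [0.010, 0.012, 0.011]

pooled_log_hr, pooled_se = pool_rubin(log_hrs, squared_ses)
pooled_hr = math.exp(pooled_log_hr)

print(round(pooled_hr, 3), round(pooled_se, 4))
```

A confidence interval would then be built from the pooled estimate and standard error, typically using a t reference distribution whose degrees of freedom depend on the between- and within-imputation variances; that step is omitted here.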

Assumptions

As in non-mediation settings, consistently estimating TEs using the approach requires no uncontrolled exposure-outcome confounding (among other assumptions, including no selection bias and no information bias).8,9 Consistently estimating CDEs additionally requires no uncontrolled mediator-outcome confounding.8,9 However, unlike traditional methods, mediator-outcome confounders can be exposure-induced (Figure I). This is because the approach does not condition on confounders via covariate adjustment or stratification and thus does not inherently induce overadjustment and collider bias.8,9 Nonetheless, the no-unmeasured-confounding assumptions are strong and their validity cannot be directly tested; rather, researchers must assess their validity (or near-validity) using theory and background knowledge.8,9,17 Additionally, if desired, researchers can conduct sensitivity analyses that assess how strong unmeasured confounding would need to be to meaningfully alter conclusions drawn from one’s TE and CDE estimates.20,22,23 Consistently estimating TEs and CDEs additionally requires no model misspecification and positivity.8,9 No model misspecification requires that the IPW models are adequately specified to address confounding of the exposure-outcome and mediator-outcome relationships.9,14,15 Weights with a mean far from one or large range can indicate model misspecification, as can confounder imbalance after weighting.9,14,15 Meanwhile, positivity assumes each respondent has a nonzero probability of receiving each exposure-mediator combination, given their covariates.8,9,14,15 Positivity violations – or near violations, which occur if there are rare exposure-mediator-covariate combinations and which can produce weights with a large range – can cause imprecision and bias.9,14 Additionally, as in all mediation analyses, consistently estimating CDEs requires the exposure preceded the mediator, which preceded the outcome.24

APPLIED EXAMPLE

We applied the MSM mediation approach to assess whether EQ disparities explained educational inequities in mortality in a sample of employed workers ages 45–64. Our code is on Open Science Framework (https://osf.io/d5s24/?view_only=2d8401617ad8479db4ab75b6a0ee5b51).

Methods

Data and sample

Data are from the Panel Study of Income Dynamics (PSID), a U.S.-based nationally representative survey.25 We used data from the biennial 1999–2015 waves. First, we restricted to reference persons and their partners ages 45–64 in those waves. Next, we restricted to the first wave (if any) in which such respondents were employed as employees (i.e., not self-employed). Mortality follow-up occurred through 2017.

Measures

Our exposure was respondents’ highest level of education, which we dichotomized as high school (HS) degree or less (≤HS) versus some college or more (>HS).

Our outcome was all-cause mortality, available along with death year in PSID’s mortality file.25 PSID provided the precise death year for 98% of deaths and a 1–2 year range for 1% of deaths (e.g., 2000–2001 or 2000–2002). For the latter, we assigned the death year to the range’s latter year. We did not assign death years to respondents associated with the remaining deaths.

Our mediator was respondents’ employment quality (EQ). As detailed elsewhere,26–28 EQ characterizes the terms and conditions of the worker-employer relationship using seven dimensions: 1) employment stability, 2) material rewards, 3) workers’ rights, 4) working-time arrangements, 5) training and employment opportunities, 6) collective organization, and 7) interpersonal power relations. We analyzed nine variables to capture these dimensions: employment tenure, prior-year unemployment duration, labor income, employer-based health insurance, pension access, waged/salaried, overtime pay, annual hours worked, and union membership. Using these variables, we constructed a continuous EQ score for each respondent with complete EQ data using principal components analysis (PCA).29,30 eAppendix 2 contains details.

Potential measured exposure-outcome confounders included respondents’ gender, race, nativity, parents’ educational attainment, division of residence, parental wealth, age, year, and disability status (Figure II). Potential measured mediator-outcome confounders – some of which may have been exposure-induced – included those variables, plus respondents’ education, occupation, industry, business ownership, family income (excluding respondents’ labor income, which was part of the EQ measure), and marital status interacted with partner’s employment status. Potential unmeasured exposure-outcome and mediator-outcome confounders included additional factors related to respondents’ family backgrounds or preexisting health statuses.

Statistical analyses

First, we addressed missingness in variables of interest. To this end, we excluded respondents whose deaths were only known to have occurred within a range of ≥3 years (<1%). Next, we carried forwards and backwards respondents’ educational values where possible to eliminate missingness (<3%). We then created our analytic sample, excluding those with remaining missing age, employment-status, self-employment, and educational data (<1%). Subsequently, within educational strata,31 we performed multiple imputation by chained equations (MICE) with 15 replications and 25 iterations using R’s ‘mice’ package,32 directly imputing the EQ score and other variables with missingness (<5% per variable) using available EQ, confounder, and outcome data (a death indicator and the Nelson-Aalen hazard function33). Finally, we merged the education-specific datasets.

Second, we estimated exposure IPW on each of the imputed datasets21 using R’s “WeightIt” package.34 To estimate the denominators, we fit logistic models with a binary ≤HS indicator as the outcome and predictors of the measured exposure-outcome confounders,8 with continuous variables specified as 3-knot restricted cubic splines35 to increase the likelihood of confounder balance post-weighting.14 To estimate the numerators, we fit intercept-only logistic models to generate the exposure’s marginal probability in the sample.8 We truncated the IPW at the 1st and 99th percentiles to obtain a mean near one and small range.14
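
Percentile truncation of the kind described above can be illustrated with Python's standard library. The sketch below is not the analysis code (which used R's “WeightIt”); the weights are hypothetical, and the percentile definition (the "inclusive" interpolation method of `statistics.quantiles`) is an assumption that may differ from the definition used by other software.

```python
from statistics import quantiles


def truncate_weights(weights, lower_pct=1, upper_pct=99):
    """Clip weights to the given lower and upper percentiles.
    Returns the truncated weights plus the two cut-offs."""
    cuts = quantiles(weights, n=100, method="inclusive")  # 99 percentile cut points
    lower, upper = cuts[lower_pct - 1], cuts[upper_pct - 1]
    truncated = [min(max(w, lower), upper) for w in weights]
    return truncated, lower, upper


# Hypothetical weights: mostly near one, with one very small and one very large value.
weights = [0.01] + [1.0] * 98 + [50.0]
truncated, lower, upper = truncate_weights(weights)

print(f"mean before: {sum(weights) / len(weights):.3f}")
print(f"mean after:  {sum(truncated) / len(truncated):.3f}")
print(f"range after: {min(truncated):.4f} to {max(truncated):.2f}")
```

Truncation pulls the mean back toward one and narrows the range, at the cost of possibly leaving some residual confounding, as noted earlier in the text.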

Third, we estimated mediator IPW on each of the imputed datasets using “WeightIt”34 and a quantile-binning approach.36 To this end, we transformed the continuous EQ score into a categorical variable by cutting it into ten equal-sized bins (deciles), with the decile cut points calculated by pooling across imputed datasets.13 Next, to estimate the denominators, we fit pairwise logistic models with categorical EQ (EQ deciles) as the outcome and predictors of the mediator-outcome confounders, with continuous variables specified as 3-knot restricted cubic splines. We used these models to generate predicted probabilities of respondents’ observed mediator categories, given measured confounders.13 To estimate the numerators, we fit intercept-only pairwise logistic models to generate marginal probabilities of respondents’ observed mediator categories.13 We truncated the IPW at the 1st and 99th percentiles.14
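
The binning step of the quantile-binning approach can be sketched as follows. This Python example cuts a hypothetical continuous score into deciles and computes the marginal probability of each bin, which plays the role of the stabilized-weight numerator; the scores are invented, and the quantile definition is the default of `statistics.quantiles`, which may differ from the one used in the analysis.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

# Hypothetical continuous scores for 100 respondents.
scores = list(range(1, 101))

# Nine cut points split the scores into ten equal-sized bins (deciles).
cut_points = quantiles(scores, n=10)

# Assign each respondent a decile from 1 (lowest scores) to 10 (highest scores).
deciles = [bisect_right(cut_points, s) + 1 for s in scores]

# The marginal probability of each decile is the numerator of a stabilized weight.
bin_counts = Counter(deciles)
marginal_prob = {d: count / len(scores) for d, count in bin_counts.items()}

print(sorted(marginal_prob.items()))
```

The denominators would come from a model for decile membership given exposure and confounders, which is beyond what this short sketch covers.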

Fourth, we examined confounder balance using R’s “cobalt” package.37 For the exposure, we assessed balance by calculating mean differences (MDs, standardized for continuous variables) in confounder values across education levels before and after weighting;15 we also calculated Kolmogorov-Smirnov (KS) statistics comparing the distribution of confounders across education levels before and after weighting.15 For the mediator, we assessed balance by calculating Pearson correlation coefficients between EQ and confounders before and after weighting;38 we also calculated KS statistics comparing the distribution of confounders in the unweighted and weighted samples to assess if the weighted sample was representative of the unweighted.39 We calculated each of the statistics within imputed datasets,40 then took the mean across imputations,40 targeting values ≤0.10.15,38
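
To show what a correlation-based balance check does, the Python sketch below computes a weighted Pearson correlation between a binary mediator and a binary confounder before and after weighting. It is a simplified stand-in for the “cobalt” output: the cell counts and weights are hypothetical, with the weights set to the stabilized values implied by those counts.

```python
import math


def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y using weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cells: (mediator m, confounder z, respondents, stabilized weight).
cells = [(1, 1, 30, 2 / 3), (1, 0, 10, 2.0), (0, 1, 20, 1.5), (0, 0, 40, 0.75)]
mediator = [m for m, _, n, _ in cells for _ in range(n)]
confounder = [z for _, z, n, _ in cells for _ in range(n)]
ipw = [w for _, _, n, w in cells for _ in range(n)]
unit_weights = [1.0] * len(ipw)

corr_before = weighted_corr(mediator, confounder, unit_weights)
corr_after = weighted_corr(mediator, confounder, ipw)

THRESHOLD = 0.10  # balance target stated in the text
print(round(corr_before, 3), abs(corr_after) <= THRESHOLD)
```

In this toy sample the unweighted correlation is well above the 0.10 target, while the weighted correlation is essentially zero.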

Fifth, we estimated the TE that having ≤HS degree had on mortality. To this end, we fit Cox proportional hazards models on each of the imputed datasets using R’s “survival” package,41 with incident mortality as the outcome, a binary ≤HS indicator as the exposure, weights of the exposure IPW, and a years-since-baseline timescale.8,9

Finally, we estimated the CDE that having ≤HS degree would have had on mortality if EQ had been constant across education groups. To this end, we again fit Cox models41 on each of the imputed datasets, with incident mortality as the outcome, weights of the product of the exposure and mediator IPW, and a years-since-baseline timescale. As predictors, the models contained a binary ≤HS indicator (exposure), the continuous EQ score (mediator), and their interaction.8,9 Because we anticipated exposure-mediator interaction – specifically, EQ more strongly affecting mortality for less-educated people than for more-educated people, since the former may depend more heavily on their jobs for survival – we ran models centering the continuous EQ score at the 20th, 50th, and 80th percentiles (i.e., making such percentiles the reference value of 0),17 with EQ percentiles calculated by pooling across imputed datasets. We chose the 20th, 50th, and 80th percentiles because they encompassed a wide range of the EQ score’s distribution, while not being so extreme that they severely compromised estimates’ precision.
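
The effect of centering can be checked algebraically. The Python sketch below uses hypothetical log-hazard coefficients and a hypothetical 80th-percentile value to show that, after subtracting a constant c from the mediator, the exposure coefficient becomes the log hazard ratio for exposure with the mediator held at c, while the linear predictor changes only by a constant that a Cox model's baseline hazard would absorb.

```python
import math

# Hypothetical log-hazard coefficients from a model with the uncentered mediator:
# log-hazard = beta_e * e + beta_m * m + beta_em * e * m (plus the baseline hazard).
beta_e, beta_m, beta_em = 0.50, -0.20, -0.30
p80 = 1.2  # hypothetical 80th-percentile mediator value


def lp_original(e, m):
    """Linear predictor with the mediator on its original scale."""
    return beta_e * e + beta_m * m + beta_em * e * m


# After centering (m_c = m - p80), the exposure coefficient shifts by beta_em * p80.
beta_e_centered = beta_e + beta_em * p80


def lp_centered(e, m):
    """Linear predictor with the mediator centered at the 80th percentile."""
    m_c = m - p80
    return beta_e_centered * e + beta_m * m_c + beta_em * e * m_c


# Exposure hazard ratio with the mediator held at the 80th percentile.
cde_hr_p80 = math.exp(beta_e_centered)
cde_hr_direct = math.exp(lp_original(1, p80) - lp_original(0, p80))

print(round(cde_hr_p80, 3), round(cde_hr_direct, 3))
```

Repeating the calculation with other centering values gives the CDE at other percentiles without changing the underlying model.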

We calculated confidence intervals for the TE and CDE using robust standard errors clustered at the family-clan level (with “clans” composed of related families), pooling estimates from each of the imputed datasets using Rubin’s Rules.42

Sensitivity analyses

Sensitivity analyses included: 1) addressing possible overadjustment and collider bias induced by conditioning sample selection on employment status using inverse probability of selection weights, 2) estimating mediator IPW using a normal probability density function, and 3) not adjusting for disability status or division of residence when estimating exposure IPW (eAppendices 7–9).

Institutional review board approval

The University of Washington Institutional Review Board determined this study to be exempt from review because it used publicly available, deidentified data. Nonetheless, the University of Washington Institutional Review Board reviewed and approved the study because PSID requires such approval to access the restricted-use mortality data.

Results

Our sample included 6,507 respondents, followed for a median and maximum of 12 and 18 years, respectively, and a total of 78,282 person-years. There were 380 deaths (≤HS: 235; >HS: 145); the Kaplan-Meier survival probability at 18 years was 90%. See Figure III for a flow diagram.

Forty-six percent of respondents had ≤HS degree at baseline (Table II). Less-educated respondents had similar distributions of age, gender, marital status, and division to more-educated respondents (Table II). However, they were less often White and more often low income and employed in industries of “manufacturing” and occupations of “operators, fabricators, and laborers”, “precision production, craft, and repair”, and “services” (Table II). Less-educated respondents had lower median EQ than more-educated respondents, driven by less-educated respondents’ greater likelihood of being uninsured, waged, pension-less, short-tenured, and low-income (Table II). Lower-EQ respondents were disproportionately Black or “other”, low-income, less-educated, disabled, and employed in industries of “services” and “wholesale and retail trade” and occupations of “operators, fabricators, and laborers” and “services” (eAppendix 3).

eAppendix 4 displays IPW distributions, which had means near one and moderate ranges. After weighting, differences in confounder distributions across exposure and mediator values were minimal (eAppendix 5).

Regarding the TE, the mortality hazard was 67% greater (HR: 1.67, 95% CI: 1.34, 2.09) among those with ≤HS degree than among those with >HS degree (Figure IV and eAppendix 6).

Regarding the CDE, when holding EQ at the 80th percentile (100th=best) across educational groups, the mortality hazard was 15% greater (HR: 1.15; 95% CI: 0.81, 1.64) among those with ≤HS degree than among those with >HS degree (Figure IV and eAppendix 6). End-of-follow-up survival also increased across subgroups.

Holding EQ at lower percentiles increased the CDE, indicating an exposure-mediator interaction in which EQ decreases were more strongly related to increased mortality among the less-educated than among the more-educated, although estimates were imprecise. Indeed, when holding EQ at the 20th percentile (1st=worst), the CDE nearly equaled the TE (HR: 1.66, 95% CI: 1.19, 2.31) (Figure IV and eAppendix 6).

Sensitivity analyses yielded similar estimates (eAppendices 7–9).

DISCUSSION

We outlined an MSM approach to estimating TEs and CDEs that can be applied to common mediation settings in occupational health research. Such settings include those with exposure-induced mediator-outcome confounding and exposure-mediator interaction, both of which may compromise the validity of other approaches8,9 and which are common in occupational health settings.5,6 We applied the approach to assess the extent to which EQ disparities explained educational inequities in mortality in a sample of workers ages 45–64. Given the requisite strong assumptions described in the text, our estimates suggest over 70% of the educational inequity in mortality would have been eliminated if EQ had been at the 80th percentile (100th=best) across exposure groups, i.e., if everyone – education aside – had high EQ, such as salutary material rewards, employment stability, organization, and power, employment characteristics that could be fostered by worker organizing and government policy to bolster labor rights and standards. Estimates diminished holding EQ at lower percentiles.

The CDE estimated by the approach may be of particular interest to applied researchers because it can correspond to the exposure effect one would have observed if a mediator had been at a policy-relevant level, such as income above a living wage or occupational hazards compliant with regulatory standards.8 Other mediation effects, such as natural direct and indirect effects, may be more useful for etiologic research, and can be estimated using a variety of approaches, including marginal structural modeling and inverse odds weighting.8,43,44 However, natural effects are challenging to estimate in settings with exposure-induced mediator-outcome confounding.8,43,44

Despite its advantages, the approach has limitations. First, it forces mediator values to be constant throughout a sample. This is unrealistic for certain mediators, including ours, as it is unlikely policy or organizing could equalize EQ throughout a population. Alternative CDE estimation approaches, including stochastic mediation contrasts, allow mediator values to vary across respondents, although they can be more difficult to implement using standard software.45

Second, estimating consistent effects requires correctly specifying the exposure and mediator weighting models; misspecification of either can cause residual confounding.9 Model misspecification is particularly likely in settings like ours with continuous mediators (or exposures), which require specifying the variables’ correct distributional form.13 Nonetheless, in our example, the estimates’ similarity when using quantile-binning and normal-weighting approaches mitigates concerns about misspecification. Additionally, other common approaches also require strong model-specification assumptions, although doubly-robust methods can lessen the assumptions’ strength.9

Third, like many approaches, estimating causal effects requires no uncontrolled exposure-outcome or mediator-outcome confounding; uncontrolled confounding of either can cause bias.8,9 In our example, possible unmeasured confounders included additional factors related to respondents’ family backgrounds or preexisting health statuses, as well as their prior EQ. Due to the latter, our CDE estimates may not capture the effects of modifying only current EQ, but may also capture the effects of modifying prior EQ, to the extent that prior EQ is associated with current EQ and mortality. Unlike other approaches, however, such as those often used to estimate natural effects, this approach does not require no uncontrolled exposure-mediator confounding.8

Fourth, also like many approaches, estimating consistent effects requires positivity in the exposure-outcome and mediator-outcome models.8,9 Random positivity violations are common in settings with continuous mediators (or exposures), as rare mediator-covariate combinations are guaranteed when the number of possible mediator values approaches infinity.9 Indeed, we used truncation to reduce weight variability, a problem often caused by rare cells.9,14 Alternative approaches, such as structural nested modeling, may be more robust in settings with likely positivity violations.8

Other assumptions frequently invoked in causal-inference settings are: 1) consistency: the exposure and mediator have the same effect on the outcome for each respondent, regardless of how each respondent received their exposure and mediator values; and 2) no interference: respondents’ outcomes are not affected by other respondents’ exposures or mediators.46 These assumptions, which are required for interpreting estimates as intervention effects46 – the effect the exposure would have in the future if a hypothetical intervention fixed the mediator to a certain value across exposure groups – are unlikely to hold in our example. For example, consistency is likely violated for EQ because EQ’s mortality effects might differ depending on the type of EQ-modifying intervention (e.g., organizing vs. policy change) and the specific EQ components that are modified (e.g., union membership vs income). Meanwhile, no interference may be violated because an EQ-modifying intervention could have spillover effects modifying the EQ-mortality relationship. When consistency and no interference are violated, it may be more appropriate to interpret estimates in terms of the past – the effect the exposure would have had on outcomes in the sample if the mediator had been constant across exposure groups – analogous to “realized effects”.47,48 Nonetheless, although realized-effect interpretations have been applied to exposure-focused analyses, they are rarer in mediation settings, and thus should be the focus of future research.

CONCLUSION

We outlined an MSM mediation approach for estimating TEs and CDEs in settings with exposure-induced mediator-outcome confounding and exposure-mediator interaction, phenomena that often plague occupational health settings. Given the requisite assumptions, the approach can be used to investigate policy- and practice-relevant topics in occupational health research, including the factors driving mortality inequities and the health effects of bringing mediators into compliance with regulatory standards.

Supplementary Material

Funding:

Grant sponsor: National Institute on Aging; Grant number: R01AG06001 (all authors)

Grant sponsor: National Institute of Mental Health; Grant number: T32MH013043 (Jerzy Eisenberg-Guyot)

Grant sponsor: National Institute on Minority Health and Health Disparities; Grant number: R00MD012807 (Vanessa Oddo)

Grant sponsor: National Institute for Occupational Safety and Health; Grant number: T42OH008433 (Shanise Owens)

Grant sponsor: Canadian Institutes of Health Research; Grant number: MFE-320293 (Anita Minh)

Grant sponsor: Canadian Institutes of Health Research; Grant number: PJT-178101 (Anita Minh)

Institution and ethics approval and informed consent:

The University of Washington Institutional Review Board determined this study to be exempt from review because it used publicly available, deidentified data. Nonetheless, the University of Washington Institutional Review Board reviewed and approved the study because PSID requires such approval to access the restricted-use mortality data.

Conflict of interest disclosure:

None

REFERENCES

1. Bor J, Cohen GH, Galea S. Population health in an era of rising income inequality: USA, 1980–2015. The Lancet. 2017;389(10077):1475-1490. doi:10.1016/S0140-6736(17)30571-8
2. Benach J, Vives A, Amable M, Vanroelen C, Tarafa G, Muntaner C. Precarious employment: understanding an emerging social determinant of health. Annu Rev Public Health. 2014;35(1):229-253. doi:10.1146/annurev-publhealth-032013-182500. PMID: 24641559
3. Ahonen EQ, Fujishiro K, Cunningham T, Flynn M. Work as an inclusive part of population health inequities research and prevention. Am J Public Health. 2018;108(3):306-311. doi:10.2105/AJPH.2017.304214. PMID: 29345994
4. Fujishiro K, MacDonald LA, Howard VJ. Job complexity and hazardous working conditions: how do they explain educational gradient in mortality? J Occup Health Psychol. 2020;25(3):176-186. doi:10.1037/ocp0000171. PMID: 31566401
5. Oude Groeniger J, Burdorf A. Advancing mediation analysis in occupational health research. Scand J Work Environ Health. 2020;46(2):113-116. doi:10.5271/sjweh.3886. PMID: 31950195
6. Kristensen P, Aalen OO. Understanding mechanisms: opening the “black box” in observational studies. Scand J Work Environ Health. 2013;39(2):121-124. doi:10.5271/sjweh.3343. PMID: 23319154
7. Landsbergis PA. Assessing the contribution of working conditions to socioeconomic disparities in health: a commentary. Am J Ind Med. 2010;53(2):95-103. doi:10.1002/ajim.20766. PMID: 19852020
8. VanderWeele TJ. Marginal structural models for the estimation of direct and indirect effects. Epidemiology. 2009;20(1):18-26. doi:10.1097/EDE.0b013e31818f69ce. PMID: 19234398
9. Naimi AI, Schnitzer ME, Moodie EEM, Bodnar LM. Mediation analysis for health disparities research. Am J Epidemiol. 2016;184(4):315-324. doi:10.1093/aje/kwv329. PMID: 27489089
10. Fujishiro K, Hajat A, Landsbergis PA, Meyer JD, Schreiner PJ, Kaufman JD. Explaining racial/ethnic differences in all-cause mortality in the Multi-Ethnic Study of Atherosclerosis (MESA): substantive complexity and hazardous working conditions as mediating factors. SSM - Popul Health. 2017;3:497-505. doi:10.1016/j.ssmph.2017.05.010. PMID: 29349240
11. Hernán MA. A definition of causal effect for epidemiological research. J Epidemiol Community Health. 2004;58(4):265-271. doi:10.1136/jech.2002.006361. PMID: 15026432
12. Brookhart MA, Schneeweiss S, Rothman KJ, Glynn RJ, Avorn J, Stürmer T. Variable selection for propensity score models. Am J Epidemiol. 2006;163(12):1149-1156. doi:10.1093/aje/kwj149. PMID: 16624967
13. Naimi AI, Moodie EEM, Auger N, Kaufman JS. Constructing inverse probability weights for continuous exposures: a comparison of methods. Epidemiology. 2014;25(2):292-299. doi:10.1097/EDE.0000000000000053. PMID: 24487212
14. Cole SR, Hernan MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol. 2008;168(6):656-664. doi:10.1093/aje/kwn164. PMID: 18682488
15. Austin PC, Stuart EA. Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Stat Med. 2015;34(28):3661-3679. doi:10.1002/sim.6607. PMID: 26238958
16. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11(5):550-560. doi:10.1097/00001648-200009000-00011. PMID: 10955408
17. Richiardi L, Bellocco R, Zugna D. Mediation analysis in epidemiology: methods, interpretation and bias. Int J Epidemiol. 2013;42(5):1511-1519. doi:10.1093/ije/dyt127. PMID: 24019424
18. Suzuki E, Evans D, Chaix B, VanderWeele TJ. On the “proportion eliminated” for risk differences versus excess relative risks. Epidemiology. 2014;25(2):309-310. doi:10.1097/EDE.0000000000000060. PMID: 24487217
19. Guo J, Naimi AI, Brooks MM, Muldoon MF, Orchard TJ, Costacou T. Mediation analysis for estimating cardioprotection of longitudinal RAS inhibition beyond lowering blood pressure and albuminuria in type 1 diabetes. Ann Epidemiol. 2020;41:7-13. doi:10.1016/j.annepidem.2019.12.004. PMID: 31928894
20. VanderWeele TJ. Unmeasured confounding and hazard scales: sensitivity analysis for total, direct, and indirect effects. Eur J Epidemiol. 2013;28(2):113-117. doi:10.1007/s10654-013-9770-6. PMID: 23371044
21. Leyrat C, Carpenter JR, Bailly S, Williamson EJ. Common methods for handling missing data in marginal structural models: what works and why. Am J Epidemiol. 2021;190(4):663-672. doi:10.1093/aje/kwaa225. PMID: 33057574
22. VanderWeele TJ. Mediation analysis: a practitioner’s guide. Annu Rev Public Health. 2016;37(1):17-32. doi:10.1146/annurev-publhealth-032315-021402. PMID: 26653405
23. VanderWeele TJ. Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology. 2010;21(4):540-551. doi:10.1097/EDE.0b013e3181df191c. PMID: 20479643
24. MacKinnon DP, Fairchild AJ, Fritz MS. Mediation Analysis. Annu Rev Psychol. 2007;58(1):593-614. doi:10.1146/annurev.psych.58.110405.085542. PMID: 16968208
25. McGonagle KA, Schoeni RF, Sastry N, Freedman VA. The Panel Study of Income Dynamics: overview, recent innovations, and potential for life course research. Longitud Life Course Stud. 2012;3(2):188. doi:10.14301/llcs.v3i2.188
26. Peckham, Fujishiro, Hajat, Flaherty, Seixas. Evaluating employment quality as a determinant of health in a changing labor market. RSF Russell Sage Found J Soc Sci. 2019;5(4):258. doi:10.7758/rsf.2019.5.4.09
27. Eisenberg-Guyot J, Peckham T, Andrea SB, Oddo V, Seixas N, Hajat A. Life-course trajectories of employment quality and health in the U.S.: A multichannel sequence analysis. Soc Sci Med. 2020;264:113327. doi:10.1016/j.socscimed.2020.113327. PMID: 32919256
28. Van Aerden K, Moors G, Levecque K, Vanroelen C. Measuring employment arrangements in the European labour force: a typological approach. Soc Indic Res. 2014;116(3):771-791. doi:10.1007/s11205-013-0312-0
29. Andrea SB, Eisenberg-Guyot J, Oddo VM, Peckham T, Jacoby D, Hajat A. Beyond hours worked and dollars earned: multidimensional EQ, retirement trajectories and health in later life. Work Aging Retire. 2022;8(1):51-73. doi:10.1093/workar/waab012. PMID: 35035984
30. Widaman KF. Common factors versus components: principals and principles, errors and misconceptions. In: Cudeck R, MacCallum RC, eds. Factor Analysis at 100: Historical Developments and Future Directions. Lawrence Erlbaum Associates Publishers; 2007:177-203.
31. von Hippel PT. How to impute interactions, squares, and other transformed variables. Sociol Methodol. 2009;39(1):265-291. doi:10.1111/j.1467-9531.2009.01215.x
32. van Buuren S. Package “mice” (version 3.11.0). Published online 2021. Accessed March 18, 2021. https://cran.r-project.org/web/packages/mice/mice.pdf
33. White IR, Royston P. Imputing missing covariate values for the Cox model. Stat Med. 2009;28(15):1982-1998. doi:10.1002/sim.3618. PMID: 19452569
34. Greifer N. R package “WeightIt” (version 0.13.1). Published online 2022. Accessed August 17, 2022. https://CRAN.R-project.org/package=WeightIt
35. Rosario HD. R package “pwr” (version 6.2–0). Published online 2020. Accessed April 29, 2022. https://cran.r-project.org/package=pwr
36. Geskus RB, van der Wal WM. R package “ipw” (version 1.0–11). Published online 2015. Accessed August 11, 2021. https://cran.r-project.org/package=ipw
37. Greifer N. R package “cobalt” (version 4.3.1). Published online 2021. Accessed August 11, 2021. https://cran.r-project.org/package=cobalt
38. Zhu Y, Coffman DL, Ghosh D. A boosting algorithm for estimating generalized propensity scores with continuous treatments. J Causal Inference. 2015;3(1):25-40. doi:10.1515/jci-2014-0022. PMID: 26877909
39. Greifer N. Covariate balance tables and plots: a guide to the cobalt package. cobalt: Covariate Balance Tables and Plots. Published November 3, 2022. Accessed December 27, 2022. https://cran.r-project.org/web/packages/cobalt/vignettes/cobalt.html
40. Greifer N. Appendix 2: Using cobalt with clustered, multiply imputed, and other segmented data. cobalt: Covariate Balance Tables and Plots. Published August 13, 2022. Accessed August 17, 2022. https://cran.r-project.org/web/packages/cobalt/vignettes/cobalt_A2_segmented_data.html
41. Therneau TM. R package “survival” (version 3.4–0). Published online 2022. Accessed August 11, 2022. https://cran.r-project.org/package=survival
42. Rubin DB. Multiple Imputation for Nonresponse in Surveys. John Wiley and Sons; 2004.
43. Schram JL, Oude Groeniger J, Schuring M, et al. Working conditions and health behavior as causes of educational inequalities in self-rated health: an inverse odds weighting approach. Scand J Work Environ Health. 2021;47(2):127-135. doi:10.5271/sjweh.3918. PMID: 32815549
44. Tchetgen Tchetgen EJ. Inverse odds ratio-weighted estimation for causal mediation analysis. Stat Med. 2013;32(26):4567-4580. doi:10.1002/sim.5864. PMID: 23744517
45. Naimi AI, Moodie EEM, Auger N, Kaufman JS. Stochastic mediation contrasts in epidemiologic research: interpregnancy interval and the educational disparity in preterm delivery. Am J Epidemiol. 2014;180(4):436-445. doi:10.1093/aje/kwu138. PMID: 25038216
46. Schwartz S, Gatto NM, Campbell UB. Extending the sufficient component cause model to describe the Stable Unit Treatment Value Assumption (SUTVA). Epidemiol Perspect Innov. 2012;9(1):3. doi:10.1186/1742-5573-9-3. PMID: 22472125
47. Prins SJ, McKetta S, Platt J, Muntaner C, Keyes KM, Bates LM. “The serpent of their agonies”: exploitation as structural determinant of mental illness. Epidemiology. 2021;32(2):303-309. doi:10.1097/EDE.0000000000001304. PMID: 33252438
48. Schwartz S, Gatto NM, Campbell UB. Causal identification: a charge of epidemiology in danger of marginalization. Ann Epidemiol. 2016;26:669-673. doi:10.1016/j.annepidem.2016.03.013. PMID: 27237595

Directed acyclic graph depicting a common confounding structure observed in occupational health settings, including exposure-outcome confounding (EY) and mediator-outcome confounding (MY), which may be exposure-induced (depicted by the dashed line).

Pseudo directed acyclic graph depicting hypothesized confounders of exposure-outcome (education-mortality) and mediator-outcome (employment-quality-mortality) relationships in Panel Study of Income Dynamics analysis. Confounders outlined in light grey are not hypothesized to be induced by exposure, while confounders outlined in dark grey are hypothesized to be induced by exposure.

Notes:

a Parents’ educational attainment

b Partner’s labor force status

Flow diagram depicting construction of analytic sample.

Hazard of mortality among those with a high-school degree or less (≤HS) versus some college or more from models estimating the total effect of education on mortality and the controlled direct effect (CDE) of education on mortality, holding the employment-quality (EQ) score at various percentiles (100%=best).

Notes:

Estimates from inverse-probability-weighted Cox proportional hazards models run on 1999–2015 Panel Study of Income Dynamics sample (n=6,507) with mortality follow-up through 2017. Models specified as described in the main text, with confidence intervals calculated using robust standard errors clustered at the family-clan level. Estimates from multiply imputed datasets pooled using Rubin’s Rules.
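
The pooling step mentioned in these notes can be sketched as follows. This is a minimal plain-Python illustration of Rubin's Rules applied to log hazard ratios, not the R code used for the analyses; the function name and the example inputs are hypothetical, and the sketch returns only the pooled point estimate and standard error (it omits the degrees-of-freedom adjustment typically used for confidence intervals).

```python
import math


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates (e.g., log hazard ratios) with Rubin's Rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching inputs from at least two imputations")
    pooled = sum(estimates) / m                      # mean of the estimates
    within = sum(variances) / m                      # average within-imputation variance
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between           # total variance
    return pooled, math.sqrt(total)


# Hypothetical log hazard ratios and variances from three imputed datasets.
log_hr, se = pool_rubin([0.40, 0.45, 0.38], [0.010, 0.012, 0.011])
pooled_hr = math.exp(log_hr)  # back-transform to the hazard-ratio scale
```

Pooling is done on the log scale because the sampling distribution of a log hazard ratio is closer to normal than that of the hazard ratio itself.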

Simplified core analysis steps of marginal structural modeling mediation analysis approach.

| Estimate… | Purpose | Correspondence with step in applied example |
|---|---|---|
| 1. …exposure IPW[a] | Address confounding of exposure-outcome relationship | Estimate IPW for binary education exposure |
| 2. …mediator IPW[a] | Address confounding of mediator-outcome relationship | Estimate IPW for continuous employment-quality mediator |
| 3. …confounder balance | Assess confounder balance across exposure and mediator values in weighted sample to ensure imbalance, and thus confounding by measured variables, is minimal in weighted sample | Examine balance in exposure-outcome confounders across education values and in mediator-outcome confounders across employment-quality values |
| 4. …total effect | Contrast expected value of outcome in sample if all (versus none) of respondents in sample had been exposed, which captures every pathway through which the exposure affected the outcome | Estimate hazard of mortality if all (versus none) respondents in sample had an education of a high-school degree or less |
| 5. …controlled direct effect | Contrast expected value of outcome in sample if all (versus none) of respondents in sample had been exposed and mediator had been held at level m*, which captures the effect the exposure would have had on the outcome if all respondents, regardless of their exposure, had the same mediator value | Estimate hazard of mortality if all (versus none) respondents in sample had an education of a high-school degree or less and employment quality had been held at given level across education groups |
| 6. …proportion eliminated | Capture proportion of the exposure’s effect on the outcome that would have been eliminated if the mediator had been held at level m* across exposure groups | Capture proportion of education’s effect on mortality that would have been eliminated if employment quality had been held at given level across education groups |

Notes:

[a] Inverse probability weights
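
Two of the steps in this table reduce to simple arithmetic once the models are fit, and can be sketched in plain Python. The sketch assumes, following the general MSM mediation approach in reference 8, that the controlled-direct-effect model is weighted by the product of the exposure and mediator weights, and that the proportion eliminated is computed on the excess-relative-risk scale with hazard ratios standing in for risk ratios. Both are assumptions of this illustration, and the function names and input values are hypothetical rather than the paper's estimates.

```python
def cde_weights(exposure_weights, mediator_weights):
    """Combine step 1 and step 2 weights for the controlled-direct-effect model.

    Assumes one weight of each kind per respondent, multiplied together.
    """
    if len(exposure_weights) != len(mediator_weights):
        raise ValueError("weights must be the same length")
    return [we * wm for we, wm in zip(exposure_weights, mediator_weights)]


def proportion_eliminated(hr_total, hr_cde):
    """Step 6: (HR_TE - HR_CDE(m*)) / (HR_TE - 1) on the excess-relative-risk scale."""
    if hr_total == 1.0:
        raise ValueError("undefined when the total effect is null (HR = 1)")
    return (hr_total - hr_cde) / (hr_total - 1.0)


# Hypothetical inputs: a total-effect HR of 1.5 and a CDE HR of 1.1
# with the mediator held at some level m*.
combined = cde_weights([2.0, 0.5], [1.5, 2.0])
pe = proportion_eliminated(hr_total=1.5, hr_cde=1.1)  # about 0.8
```

Because the denominator is the excess hazard (HR minus 1), the proportion eliminated becomes unstable when the total effect is close to null, and it can fall outside the 0 to 1 range when the controlled direct effect is larger than the total effect or in the opposite direction.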

Descriptive statistics of 1999–2015 Panel Study of Income Dynamics sample at baseline stratified by education level (high-school degree or less [≤HS] vs some college or more [>HS]). Multiply imputed datasets pooled prior to calculating statistics.

| Characteristic | Overall | ≤HS | >HS |
|---|---|---|---|
| n per imputed dataset | 6507 | 3005 | 3502 |
| Age[a] | 46 [45, 50] | 46 [45, 50] | 46 [45, 50] |
| Year[a] | 2003 [1999, 2009] | 2003 [1999, 2009] | 2003 [1999, 2009] |
| Other income[a,b] | 3.9 [1.3, 7.1] | 3.1 [1.0, 5.7] | 4.7 [1.7, 8.4] |
| Male (%) | 48 | 49 | 47 |
| Race (%) | | | |
| Black | 30 | 37 | 24 |
| Other | 9 | 11 | 7 |
| White | 62 | 53 | 69 |
| Born in US (%) | 91 | 89 | 93 |
| Marital by partner employment (%)[c] | | | |
| Married/cohabiting & employed | 62 | 57 | 65 |
| Married/cohabiting & unempl/NILF | 15 | 18 | 12 |
| Not married/cohabiting | 24 | 25 | 23 |
| Division (%)[d] | | | |
| East North Central | 17 | 17 | 16 |
| East South Central | 8 | 9 | 7 |
| Middle Atlantic | 11 | 10 | 12 |
| Mountain | 5 | 4 | 5 |
| New England | 4 | 3 | 5 |
| Pacific | 14 | 14 | 14 |
| South Atlantic | 24 | 26 | 22 |
| West North Central | 8 | 8 | 9 |
| West South Central | 10 | 10 | 10 |
| Parental wealth (%)[e] | | | |
| Poor | 30 | 39 | 23 |
| Average/Varied | 46 | 41 | 49 |
| Pretty well off | 24 | 20 | 28 |
| Father’s education (%)[f] | | | |
| <HS | 31 | 37 | 25 |
| HS | 32 | 30 | 34 |
| >HS | 18 | 6 | 28 |
| Other | 20 | 28 | 14 |
| Mother’s education (%)[f] | | | |
| <HS | 26 | 33 | 19 |
| HS | 42 | 39 | 44 |
| >HS | 16 | 6 | 25 |
| Other | 17 | 23 | 12 |
| Family-owned business (%) | 13 | 8 | 17 |
| Work disability (%)[g] | 10 | 10 | 10 |
| Occupation (%) | | | |
| Farming, forestry, & fishing | 2 | 3 | 0 |
| Managerial | 11 | 5 | 15 |
| Military | 0 | 0 | 1 |
| Operators, fabricators, & laborers | 15 | 25 | 7 |
| Production | 10 | 14 | 6 |
| Professional | 19 | 5 | 31 |
| Services | 16 | 24 | 10 |
| Technical, sales, & admin support | 28 | 24 | 31 |
| Industry (%) | | | |
| Agriculture, forestry, and fisheries | 2 | 3 | 1 |
| Construction | 5 | 7 | 3 |
| Finance, insurance, & real estate | 6 | 4 | 7 |
| Manufacturing | 17 | 21 | 13 |
| Military | 1 | 0 | 1 |
| Mining | 1 | 1 | 0 |
| Public administration | 7 | 5 | 9 |
| Services | 39 | 31 | 46 |
| Transport, communications, & utilities | 10 | 11 | 9 |
| Wholesale & retail trade | 14 | 18 | 11 |
| Union membership (%) | 17 | 18 | 17 |
| Employer-based health insurance (%) | 80 | 72 | 86 |
| Salaried (%) | 40 | 21 | 55 |
| Paid extra for overtime (%)[h] | 89 | 88 | 90 |
| Pension or retirement plan access (%) | 56 | 48 | 64 |
| Annual hours worked[a] | 2025 [1764, 2340] | 2016 [1748, 2250] | 2040 [1788, 2410] |
| Past-year unempl. duration (months)[a] | 0.0 [0.0, 0.0] | 0.0 [0.0, 0.0] | 0.0 [0.0, 0.0] |
| Total labor income[a,i] | 4.3 [2.5, 6.6] | 3.3 [1.9, 5.0] | 5.3 [3.2, 8.3] |
| Employment tenure (months)[a] | 84 [24, 204] | 84 [24, 192] | 96 [30, 204] |
| Employment quality score[a,j] | 0.1 [−0.3, 0.3] | −0.1 [−0.5, 0.2] | 0.2 [−0.2, 0.4] |

Notes:

[a] Median [quartile 1, quartile 3]

[b] Family income minus respondents’ labor income in tens of thousands of 2017 dollars

[c] Marital status interacted with partner’s employment status

[d] Pacific includes those living in U.S. territories or foreign countries

[e] Parental wealth when respondent was growing up

[f] “Other” category includes those: 1) whose parents were educated outside the US only or whose parents received no education, 2) who had no parent of given type (father or mother), or 3) who had missingness for the variable

[g] Disability limited type or amount of work respondent could do

[h] Paid extra for overtime hours worked (see appendix for details on variable coding)

[i] In tens of thousands of 2017 dollars

[j] Principal-components-analysis-derived employment quality score (see appendix for details)