<!DOCTYPE article
PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD with MathML3 v1.2 20190208//EN" "JATS-archivearticle1-mathml3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" article-type="research-article"><?properties manuscript?><front><journal-meta><journal-id journal-id-type="nlm-journal-id">0147621</journal-id><journal-id journal-id-type="pubmed-jr-id">3548</journal-id><journal-id journal-id-type="nlm-ta">Environ Res</journal-id><journal-id journal-id-type="iso-abbrev">Environ Res</journal-id><journal-title-group><journal-title>Environmental research</journal-title></journal-title-group><issn pub-type="ppub">0013-9351</issn><issn pub-type="epub">1096-0953</issn></journal-meta><article-meta><article-id pub-id-type="pmid">33737076</article-id><article-id pub-id-type="pmc">8187296</article-id><article-id pub-id-type="doi">10.1016/j.envres.2021.111019</article-id><article-id pub-id-type="manuscript">NIHMS1685472</article-id><article-categories><subj-group subj-group-type="heading"><subject>Article</subject></subj-group></article-categories><title-group><article-title>Interdisciplinary Data Science to Advance Environmental Health
Research and Improve Birth Outcomes</article-title></title-group><contrib-group><contrib contrib-type="author"><name><surname>Stingone</surname><given-names>Jeanette A.</given-names></name><xref ref-type="aff" rid="A1">a</xref><xref rid="CR1" ref-type="corresp">*</xref></contrib><contrib contrib-type="author"><name><surname>Triantafillou</surname><given-names>Sofia</given-names></name><xref ref-type="aff" rid="A2">b</xref></contrib><contrib contrib-type="author"><name><surname>Larsen</surname><given-names>Alexandra</given-names></name><xref ref-type="aff" rid="A3">c</xref><xref rid="FN1" ref-type="author-notes">1</xref></contrib><contrib contrib-type="author"><name><surname>Kitt</surname><given-names>Jay P.</given-names></name><xref ref-type="aff" rid="A4">d</xref></contrib><contrib contrib-type="author"><name><surname>Shaw</surname><given-names>Gary M.</given-names></name><xref ref-type="aff" rid="A5">e</xref></contrib><contrib contrib-type="author"><name><surname>Marsillach</surname><given-names>Judit</given-names></name><xref ref-type="aff" rid="A6">f</xref></contrib></contrib-group><aff id="A1"><label>a</label>Department of Epidemiology, Columbia University&#x02019;s
Mailman School of Public Health, 722 West 168th St, Room 1608, New York, NY 10032,
USA</aff><aff id="A2"><label>b</label>Department of Biomedical Informatics, University of
Pittsburgh, Pittsburgh, PA, USA</aff><aff id="A3"><label>c</label>Department of Biostatistics and Bioinformatics, Duke
University, Durham, NC, USA</aff><aff id="A4"><label>d</label>Departments of Chemistry and Biomedical Informatics,
University of Utah, Salt Lake City, UT, USA</aff><aff id="A5"><label>e</label>Department of Pediatrics, Stanford University School of
Medicine, Stanford, CA, USA</aff><aff id="A6"><label>f</label>Department of Environmental and Occupational Health
Sciences, University of Washington, Seattle, WA, USA</aff><author-notes><fn fn-type="present-address" id="FN1"><label>1</label><p id="P1">Present address: US Environmental Protection Agency, Office of
Research and Development, Center for Public Health and Environmental
Assessment, Research Triangle Park, NC</p></fn><fn fn-type="con" id="FN2"><p id="P2">CRediT Author statement</p><p id="P3">Jeanette A Stingone: Conceptualization, Data Curation,
Writing-Original Draft, Sofia Triantafillou: Conceptualization, Formal
Analysis, Writing-Original Draft, Jay P. Kitt: Conceptualization,
Writing-Review &#x00026; Editing, Alexandra Larsen: Conceptualization,
Writing-Review &#x00026; Editing, Gary M Shaw: Conceptualization, Writing-Review
&#x00026; Editing, Judit Marsillach: Conceptualization, Writing-Original Draft,
Project Administration</p></fn><corresp id="CR1"><label>*</label>Corresponding author at: Department of
Epidemiology, Columbia University&#x02019;s Mailman School of Public Health, 722
West 168th St, Room 1608, New York, NY 10032, USA.
<email>j.stingone@columbia.edu</email> (J. A. Stingone)</corresp></author-notes><pub-date pub-type="nihms-submitted"><day>22</day><month>4</month><year>2021</year></pub-date><pub-date pub-type="epub"><day>15</day><month>3</month><year>2021</year></pub-date><pub-date pub-type="ppub"><month>6</month><year>2021</year></pub-date><pub-date pub-type="pmc-release"><day>01</day><month>6</month><year>2022</year></pub-date><volume>197</volume><fpage>111019</fpage><lpage>111019</lpage><!--elocation-id from pubmed: 10.1016/j.envres.2021.111019--><abstract id="ABS1"><p id="P4">Rates of preterm birth and low birthweight continue to rise in the United
States and pose a significant public health problem. Although a variety of
environmental exposures are known to contribute to these and other adverse birth
outcomes, there has been limited success in developing policies to prevent
these outcomes. A better characterization of the complexities between multiple
exposures and their biological responses can provide the evidence needed to
inform public health policy and strengthen preventative population-level
interventions. In order to achieve this, we encourage the establishment of an
interdisciplinary data science framework that integrates epidemiology,
toxicology and bioinformatics with biomarker-based research to better define how
population-level exposures contribute to these adverse birth outcomes. The
proposed interdisciplinary research framework would 1) facilitate data-driven
analyses using existing data from health registries and environmental monitoring
programs; 2) develop novel algorithms with the ability to predict which
exposures are driving, in this case, adverse birth outcomes in the context of simultaneous
exposures; and 3) refine biomarker-based research, ultimately leading to new
policies and interventions to reduce the incidence of adverse birth
outcomes.</p></abstract><kwd-group><kwd>Preterm Birth</kwd><kwd>Environmental Mixtures</kwd><kwd>Multiple Exposures</kwd><kwd>Public Health Data Science</kwd></kwd-group></article-meta></front><body><sec id="S1"><label>1.</label><title>Introduction</title><p id="P5">In the United States, rates of preterm birth (deliveries before 37 weeks
gestation) have risen for the past four years, to 10.02% in 2018<sup><xref rid="R1" ref-type="bibr">1</xref></sup>; and indeed the frequency of preterm birth
has increased in the US for four decades.<sup><xref rid="R2" ref-type="bibr">2</xref></sup> Incidence of low birthweight has also increased 3% since
2014.<sup><xref rid="R1" ref-type="bibr">1</xref></sup> There are
well-documented disparities in these rates, with Black women having almost double
the risk of preterm birth compared with White women.<sup><xref rid="R1" ref-type="bibr">1</xref>,<xref rid="R3" ref-type="bibr">3</xref></sup>
Preterm birth and low birthweight remain a substantial public health problem as they
have a significant impact on an infant&#x02019;s survival, development and long-term
health.<sup><xref rid="R4" ref-type="bibr">4</xref>&#x02013;<xref rid="R6" ref-type="bibr">6</xref></sup> Decades of research have provided evidence
that environmental exposures contribute to both preterm birth and low
birthweight.<sup><xref rid="R7" ref-type="bibr">7</xref>&#x02013;<xref rid="R14" ref-type="bibr">14</xref></sup> These exposures include air
pollution, pesticides, extreme temperatures and aspects of the built
environment.</p><p id="P6">Despite the breadth of the exposures investigated for associations with
preterm birth and low birthweight, there has been limited translation to public
health policies and interventions that reduce the frequencies of these outcomes,
particularly in vulnerable communities. Evidence for prevention would be
strengthened by simultaneously investigating the manifold exposures experienced by
women during pregnancy and then employing analytic techniques that have the ability
to disentangle effects and identify targeted points for intervention. Characterizing
the complex exposure-response relationships that contribute to preterm birth and low
birthweight requires both epidemiologic and toxicologic studies that include the
examination of multiple exposures simultaneously, account for temporal variability
in exposure throughout pregnancy and explore the potential for interaction between
environmental exposures, socioeconomic contexts and genetics. Current study designs
and analytic methods often fail to capture this complexity.<sup><xref rid="R15" ref-type="bibr">15</xref></sup> In addition, there has been limited research
that combines epidemiology of ambient environmental pollutants with biomarker-based
research of biological responses to fully characterize the pathways that define how
population-level exposures can contribute to these adverse birth outcomes.<sup><xref rid="R16" ref-type="bibr">16</xref>,<xref rid="R17" ref-type="bibr">17</xref></sup> This knowledge is critical for informing successful public
health policy and population-level interventions aimed at prevention. These
approaches require collaborations between computational scientists skilled at the
analysis of large, complex data and biomarker-based laboratory researchers with the
knowledge and techniques to determine the mechanistic paths that connect external
exposures to adverse birth outcomes. An interdisciplinary data science framework
enables these collaborations and allows for the holistic analysis of complex
environmental data to extract primary risk drivers and to guide biological,
mechanistic and biomarker-based research, enabling reduction and prevention of
preterm birth, low birthweight and other adverse birth outcomes.</p><p id="P7">Much has been written about the promise of data science in advancing the
understanding of public health.<sup><xref rid="R18" ref-type="bibr">18</xref></sup>
There have been numerous commentaries arguing for the integration of data science
methods into environmental health research.<sup><xref rid="R19" ref-type="bibr">19</xref>,<xref rid="R20" ref-type="bibr">20</xref></sup> Many of these
focus on examining the complexity of the exposome, a paradigm describing the
totality of endogenous and exogenous exposures that occur throughout a
lifetime.<sup><xref rid="R21" ref-type="bibr">21</xref>&#x02013;<xref rid="R26" ref-type="bibr">26</xref></sup> We are beginning to see examples
of research that have successfully adapted data science methods to address the
challenges of characterizing the exposome and documenting its effects on adverse
birth outcomes.<sup><xref rid="R27" ref-type="bibr">27</xref>,<xref rid="R28" ref-type="bibr">28</xref></sup> While this is a clear sign of progress,
applying more complex analytics to ever larger datasets can lead to more questions
than answers. This is because data-driven analyses, including those that have marked
this initial phase of environmental health data science, are limited by the
assumptions required of purely statistical approaches and often fail to include the
existing knowledge generated within other fields such as known biophysical pathways
underlying disease pathology or toxicological knowledge of environmental
pollutants.<sup><xref rid="R27" ref-type="bibr">27</xref></sup> The causal
inference needed to inform public health policy and interventions requires an
interdisciplinary data science, one that enables the integration of knowledge across
research fields including epidemiologic and toxicologic knowledge with environmental
health data on large populations.</p><p id="P8">The objective of this commentary is to propose an interdisciplinary research
framework that integrates data science across the multiple disciplines within
environmental health. The proposed framework utilizes multiple lines of scientific
inquiry, such as epidemiology, data-driven analytics and biomarker-based
investigations, in order to connect the external exposures, modifiable through
policy and interventions, to the internal measures of biologic response that may
serve as more proximal causes of preterm birth and low birthweight. In addition, we
provide recommendations aimed at the environmental health community to support the
interdisciplinary collaborations needed to advance this work.</p></sec><sec id="S2"><label>2.</label><title>Rationale for an Interdisciplinary Data Science Framework</title><p id="P9">There are two primary reasons to support an interdisciplinary data science
framework for environmental health: the first is to address the limitations of
evidence produced by existing approaches and the second is to advance the field of
data-driven analytic approaches to navigate the space of hypotheses more
efficiently.</p><p id="P10">As discussed above, epidemiologic studies of environmental contributors to
adverse birth outcomes have traditionally focused on a single exposure. This fails
to account for the complexities of exposure, pregnancy and fetal
development.<sup><xref rid="R29" ref-type="bibr">29</xref></sup> These
complexities include epidemiologic analysis of simultaneous exposure to multiple
chemical and non-chemical stressors, temporal variability in both exposure and
vulnerability throughout pregnancy and the potential for interaction between
environmental exposures, socioeconomic contexts and genetics. Recent studies have
attempted to tackle many of these complexities. Previous work in California took a
comprehensive approach to assess both spatial and temporal variation in pesticides
to explore associations with pregnancy outcomes.<sup><xref rid="R30" ref-type="bibr">30</xref></sup> A number of studies have applied complex analytic
techniques, including distributed lag models and others, to identify critical
windows of exposure during pregnancy.<sup><xref rid="R30" ref-type="bibr">30</xref>&#x02013;<xref rid="R33" ref-type="bibr">33</xref></sup> Smaller
cohort studies have begun to utilize biological specimens, such as maternal urine
collected during pregnancy and cord blood collected at delivery, to assess more
proximal exposures to the fetus.<sup><xref rid="R34" ref-type="bibr">34</xref>&#x02013;<xref rid="R36" ref-type="bibr">36</xref></sup> While
making considerable advances in our knowledge, these studies have each tackled
individual limitations of previous studies. In many instances, they remain subject
to other challenges that could be addressed by a more interdisciplinary
data-science approach. For example, a study to identify critical exposure windows of
a single exposure using statistical approaches remains prone to the confounding
influences of unmeasured co-exposures and often only considers biological knowledge
about fetal development when interpreting results, not in the analysis itself. Thus,
each individual study may be limited in its ability to generate causal inference at
the level needed for policy and intervention planning.</p><p id="P11">Additionally, data science methods have traditionally focused on building
robust predictive statistical models. State-of-the-art feature selection algorithms
have the ability to select, among large sets of covariates, a set of variables that
are maximally predictive for the target variable, and discard the rest as
non-significant contributors. This omission is particularly important in
environmental health data sets, where exposures can often co-occur or have a
spurious association with the outcome due to their correlation with other causal
exposures, and therefore carry similar information for the outcome.<sup><xref rid="R37" ref-type="bibr">37</xref></sup> For example, in <xref rid="F1" ref-type="fig">Figure 1</xref> we illustrate the challenge of outcome-equivalent
co-exposure sets and how to detect them on a county-level data set on preterm birth
from California. In this example, the chemicals chloroform and ethylene oxide were
found to be interchangeable in a linear model that included 8 other air toxics as
covariates. <xref rid="F1" ref-type="fig">Figure 1</xref> shows the corresponding
residuals for preterm birth rates, using chloroform and ethylene oxide. The
residuals are almost identical, suggesting that the variables are interchangeable in
the model. In regression analysis methods used for feature selection, such as LASSO
(least absolute shrinkage and selection operator), chloroform would be discarded as
a non-significant contributor, without any consideration of toxicologic knowledge
about which exposure may be more relevant to preterm birth. This example illustrates
a common issue in environmental health research: the presence of variables that
carry similar outcome information can result in some of the variables being
overlooked as potential risk factors for the outcome of interest.</p><p id="P12">Solutions to these limitations require not just expertise in complex
analytics, but the ability to integrate data across a variety of fields within the
broader domain of environmental health research including epidemiology, toxicology
and the omics technologies (e.g. genomics, proteomics and metabolomics).
Approaches that incorporate previous knowledge on toxicity of exposures while
adapting study designs and analytic methods to address multiple limitations
simultaneously can advance our knowledge of environmental contributors to adverse
birth outcomes and accelerate efforts to translate research into prevention.</p></sec><sec id="S3"><label>3.</label><title>An interdisciplinary data science framework to address challenges within
environmental health research</title><p id="P13">We propose an interdisciplinary data science framework that integrates
epidemiologic data with toxicological knowledge of exposures when applying complex
analytic methods and then using generated results to inform targeted biomarker-based
studies of biological mechanisms.</p><p id="P14">The proposed framework includes multiple techniques from data science that
span the disciplines of epidemiology, bioinformatics and laboratory science: i) the
re-use of existing public health data, environmental monitoring and biospecimens to
efficiently access information on large, representative populations; ii) the use of
bioinformatics and toxicological knowledge within quantitative and statistical
approaches; and iii) the translation of results to inform biomarker-based
investigations of biological mechanisms. Our proposed framework is not meant to
serve as a &#x0201c;how-to&#x0201d; for conducting complex analytics related to
environmental contributors to adverse birth outcomes. Rather, we aim to encourage a
more holistic interpretation of data science, marked by the integration of
interdisciplinary approaches and tailored to the research question of interest.
<xref rid="F2" ref-type="fig">Figure 2</xref> illustrates an example of how this
framework could be applied to investigations of adverse birth outcomes, such as
preterm birth, using publicly available public health data. We will refer to this
example throughout the rest of this commentary. However, we encourage readers to
consider the study populations, exposure data sources, analytic techniques and
biomarker-based investigations that best fit the scientific complexities of their
research question and context.</p><sec id="S4"><label>3.1</label><title>Conduct efficient data-driven analyses using big public health data</title><p id="P15">As stated earlier, much work has been done to incorporate data science
into the fields of medicine and healthcare to better utilize electronic health
record data to promote precision medicine.<sup><xref rid="R38" ref-type="bibr">38</xref>,<xref rid="R39" ref-type="bibr">39</xref></sup> A parallel
challenge is to utilize the vast amounts of data in health registries,
environmental monitoring programs and administrative records to catalyze
improvements in public health.<sup><xref rid="R40" ref-type="bibr">40</xref></sup> The need is echoed in the NIEHS Strategic Plan, which
specifically calls for work that effectively uses data to generate and translate
knowledge into actionable policies to improve public health.<sup><xref rid="R41" ref-type="bibr">41</xref></sup> Environmental epidemiology has a long
history of linking administrative birth records with place-based exposure
metrics to investigate environmental contributors to a variety of health
outcomes including preterm birth and low birthweight. Data quality, however,
remains a limitation as birth records often lack high-quality information on
maternal conditions during pregnancy, pre-pregnancy health and other important
factors. The use of data integration, to combine features across multiple
administrative systems, has the potential to yield a more accurate and holistic
picture of women and infants within populations.<sup><xref rid="R42" ref-type="bibr">42</xref></sup> Research from Northern European
countries with integrated health systems routinely uses public health data to
investigate novel questions related to perinatal environmental health that are
not possible relying on birth records alone.<sup><xref rid="R43" ref-type="bibr">43</xref>&#x02013;<xref rid="R45" ref-type="bibr">45</xref></sup> While
the US does not have automatically integrated systems, many municipalities now
routinely link birth and delivery hospitalization data to provide higher quality
information, enabling more comprehensive investigations into birth
outcomes.<sup><xref rid="R46" ref-type="bibr">46</xref></sup></p><p id="P16">The breadth and depth of data resources for environmental exposures that
can be linked to these richer health data records will vary based on
availability and the spatial and temporal context of an individual study.
However, the ability to capture the complexities of exposure during pregnancy
will depend upon using a broad definition of environment, and including
resources that capture exposures across all domains of the external
exposome.<sup><xref rid="R47" ref-type="bibr">47</xref></sup> This could
include both traditional resources, such as air monitoring databases and
pesticide registries maintained by municipal governments for regulatory
purposes, as well as data resources built from newer technologies such as
crowd-sourced traffic data and citizen-science initiatives.</p><p id="P17">As shown in <xref rid="F2" ref-type="fig">Figure 2</xref>, the selection
and integration of multiple data sources is the first step in our proposed
framework. In our example examining preterm birth using publicly available data,
the use of birth registries linked with hospital discharge data would provide
higher quality data on maternal conditions before and during pregnancy, as well
as allow directed investigations of the different phenotypes of preterm birth.
One would use the environmental contributors identified in previous research as
a starting point, but expand to include all domains of the external exposome
including measures related to built environment. For example, previous research
linking historical red-lining<sup><xref rid="R48" ref-type="bibr">48</xref>,<xref rid="R49" ref-type="bibr">49</xref></sup> and housing
insecurity<sup><xref rid="R50" ref-type="bibr">50</xref></sup> to
preterm birth would support the inclusion of data resources containing metrics
of residential segregation, evictions and gentrification. The resulting big
public health data facilitates use of data-driven discovery approaches to
characterize complex exposure patterns experienced in pregnancy that span the
domains of the exposome.</p><p id="P18">Another component that can be linked to birth records, and therefore to
this big public health data, is newborn dried blood spots (NDBS).<sup><xref rid="R51" ref-type="bibr">51</xref></sup> In the US and many other
countries, heel-stick blood samples are collected from newborns at birth to
determine inborn errors of metabolism that could be detrimental for the
infant&#x02019;s postnatal development if not treated immediately. This screening
utilizes only a few of the NDBS collected using a filter paper Guthrie card. The
remaining NDBS are typically stored by a state&#x02019;s Newborn Screening
program for a particular number of years and under safe storage conditions,
defined by state policies, until they are discarded.<sup><xref rid="R52" ref-type="bibr">52</xref></sup> A number of these programs make these
stored residual NDBS available, with appropriate human subjects protection, for
research purposes. NDBS offer a unique opportunity to assess external exposures
from samples representative of <italic>in utero</italic> conditions and
investigate their effects on adverse birth outcomes, childhood developmental
disorders and susceptibility to certain diseases during aging.<sup><xref rid="R53" ref-type="bibr">53</xref>&#x02013;<xref rid="R60" ref-type="bibr">60</xref></sup> Given the limited availability of these
valuable stored residual NDBS, secondary research proposals using residual NDBS
should prioritize scientific rigour to ensure the outcomes will have a
significant impact in improving human health. This imperative supports the need
for analytic methods that can identify the exposures most likely to be causal
contributors to adverse birth outcomes.</p><p id="P19">Linkages between administrative data, birth registries and NDBS
repositories are not the only source of big public health data that would fit
within our proposed interdisciplinary data science framework. There are a
growing number of consortia and &#x0201c;big science&#x0201d; initiatives that
seek to pool and harmonize existing data from individual scientific studies, as
well as implement shared protocols moving forward. Examples include the HELIX
project in Europe<sup><xref rid="R28" ref-type="bibr">28</xref></sup> and the
ECHO program in the United States<sup><xref rid="R61" ref-type="bibr">61</xref></sup>. These initiatives are actively implementing analytic
pipelines to investigate the pregnancy exposome and also provide a complementary
resource to replicate findings observed in studies that use publicly available
data.</p></sec><sec id="S5"><label>3.2.</label><title>Disentangle correlated exposures</title><p id="P20">Data science often focuses on prediction: popular machine learning
methods mine the data for the most informative features and learn a model that
predicts a target outcome. Such models can exploit complex correlation patterns
in the data, and have greatly improved the precision of target prediction,
compared to traditional statistical models. However, this improved capacity does
not necessarily translate to increased knowledge. For example, in epidemiology,
there is great interest in understanding the treatment-outcome mechanism,
especially modifiable factors that can influence outcomes, rather than simply
discovering the features which allow statistical predictions of those outcomes.
To tackle this problem, algorithms that are customized for domain-specific
problems, and which incorporate expert knowledge in guiding feature selection
while mining information from the data are needed.</p><p id="P21">Again, if we consider the purely data-driven approach, the reason
becomes clear. When investigating adverse health outcomes due to environmental
factors, simultaneous exposures to multiple pollutants are common; while some of
these will lead to poor outcomes with a clear underlying pathophysiology, others
are associated only through co-occurrence with the biophysically important
compound. Even state-of-the-art statistical algorithms, as part of a statistical
software package with no modification, may return either the important or
unimportant variable as the most significant predictor, and discard the other.
This is because highly-correlated co-exposures, no matter the basis of that
correlation (e.g. pollutant source, location, or time period), carry the same
statistical information and are thus mathematically indistinguishable to the
algorithm. Thus, the information underlying the true source of the adverse
outcome, in pathophysiological terms, may be carried in any one of the discarded
variables. In the simplest terms, most current algorithms fail to account for
the mechanistic nature of a variable and lack a means of discriminating among
them.</p><p id="P22">The challenges of analyzing such highly correlated mixtures of exposures
are known, and addressing them is an active area of research.<sup><xref rid="R62" ref-type="bibr">62</xref>,<xref rid="R63" ref-type="bibr">63</xref></sup> Many of these approaches include some form of data
reduction before statistical analysis, with the goal of building the best
predictive model for the outcome of interest. Although epidemiologists often aim
to identify potential drivers of the outcome, it is known that such
data-reduction techniques hinder the biological interpretation of the
results.<sup><xref rid="R64" ref-type="bibr">64</xref></sup> Some
statistical methods are specifically designed for mixtures, but also aim to
estimate interaction effects between agents<sup><xref rid="R65" ref-type="bibr">65</xref></sup> or to assess the overall effect of combined
exposures.<sup><xref rid="R66" ref-type="bibr">66</xref></sup> When it
comes to variable selection, state-of-the-art methods are known to often be
sensitive to even small perturbations of the data when the features are highly
correlated.<sup><xref rid="R67" ref-type="bibr">67</xref></sup> We
propose combining such methods with accumulated prior knowledge stored in
biochemical databases to improve the data-driven generation of scientific
hypotheses.</p><p id="P23">Our interdisciplinary framework proposes the development of novel
algorithms specifically designed for discriminating mechanistic variables based
on <italic>a priori</italic> knowledge from broader scientific study. As
illustrated in <xref rid="F2" ref-type="fig">Figure 2</xref>, a possible
implementation would be the development of an algorithm that returns sets of
statistically equivalent features (as opposed to discarding one in favour of
another) and that carries out a ranking analysis based on the biochemical and
toxicological properties for each feature allowing stratification by likelihood
of biological impact. Some methods for identifying multiple molecular
&#x0201c;signatures&#x0201d; for a target outcome have been developed in the area
of molecular biology, but focus on predictive performance and do not incorporate
domain knowledge. For environmental exposures, for example, indications for the
likelihood of toxicity can be assessed using a variety of chemical, biological,
and toxicological databases.<sup><xref rid="R68" ref-type="bibr">68</xref>,<xref rid="R69" ref-type="bibr">69</xref></sup> As an example, consider again
the data presented in <xref rid="F1" ref-type="fig">Fig. 1</xref>. A
state-of-the-art feature selection algorithm would return either chloroform or
ethylene oxide as risk factors for preterm birth, based on noise in the
measurement or even chance. Prior knowledge on the toxicity and action of these
compounds, mined from toxicology databases or from the literature could rank
them in terms of their likelihood for participating in the mechanism of preterm
birth. This type of &#x0201c;ranking&#x0201d; is manually done by researchers to
formulate plausible testable hypotheses. Using data science to automatically
include this information can help the methods navigate the space of hypotheses
more efficiently.</p></sec><sec id="S6"><label>3.3</label><title>Inform biomarker-based research at the population-level</title><p id="P24">Biomarker-based research complements large-scale epidemiologic
investigations by allowing a targeted and potentially more biologically proximal
understanding of the influence of environmental factors. This can potentially
lead to better prevention, diagnosis and treatment of adverse birth outcomes. It
is practically infeasible, fiscally prohibitive and inefficient to
simultaneously measure biomarkers of every single exposure from conception to
death when attempting to study the exposome in relation to a particular adverse
birth outcome. Application of interdisciplinary data science, as proposed within
this commentary, is fundamental in addressing this challenge and will be
essential in informing and guiding targeted biomarker-based research of the
exposome. As illustrated in <xref rid="F2" ref-type="fig">Figure 2</xref>,
hypotheses generated from integrated analysis of epidemiologic research can
inform the selection of exposures to be investigated in mechanistic studies
using available biospecimens.</p><p id="P25">Advances in laboratory research are allowing the simultaneous
characterization and accurate quantification of a large number of individual
biological components such as proteins, metabolites, lipids, etc. These omics
technologies offer an ideal strategy for characterizing the internal components
of the exposome. However, without information about external and lifestyle
components, the omics approaches will yield limited knowledge of the exposome
and population-based prevention strategies.<sup><xref rid="R22" ref-type="bibr">22</xref></sup> Data science can provide an ideal platform to connect
all three components of the exposome and refine measurements that together can
lead to improved prevention of adverse birth outcomes and disease. For example,
in a cohort with well-characterized external exposures, using a data science
approach to identify the exposures that are driving a specific biological effect or
disease of interest can help biomarker-based research stratify subjects by
those exposures and apply omics technologies. This type of study has
the potential to provide mechanistic input linking certain exposures and their
effects on adverse birth outcomes or diseases, providing evidence in support of
prevention strategies aimed at specific external exposures.</p><p id="P26">Direct characterization of the <italic>in utero</italic> exposome is
unattainable, and to date, it has been mostly estimated from exposures measured
in the mother&#x02019;s blood. However, recent improvements in omics technologies
are allowing the use of archived residual NDBS for assessing the etiology of
diseases and certain environmental exposures that occur during the fetal
stage.<sup><xref rid="R55" ref-type="bibr">55</xref>,<xref rid="R57" ref-type="bibr">57</xref>&#x02013;<xref rid="R59" ref-type="bibr">59</xref>,<xref rid="R70" ref-type="bibr">70</xref>&#x02013;<xref rid="R74" ref-type="bibr">74</xref></sup> In our example study of
environmental contributors to preterm birth (<xref rid="F2" ref-type="fig">Figure 2</xref>), we propose to integrate the data science-provided
hypotheses on the specific environmental features most relevant to preterm birth
with the use of proteomics to analyze archived residual NDBS and ascertain the
biological signatures of <italic>in utero</italic> exposures. By allowing for
much more targeted omics analysis, the interdisciplinary framework described
herein can limit the breadth of patient samples required for laboratory
processing and potentially allow more rapid discovery of biomarkers of exposure,
furthering the omics field and advancing NDBS-based omics analyses.</p></sec></sec><sec id="S7"><label>4.</label><title>Recommendations</title><p id="P27">We present three recommendations to support and advance the use of the
proposed interdisciplinary framework. First, we encourage researchers across the
multiple disciplines within environmental health to establish interdisciplinary
research teams. The challenges facing the field of environmental health are complex
and require a broad set of skills to maximize the knowledge generated by the diverse
types of environmental health data. The expansion of research teams should not be limited
to the inclusion of data scientists with expertise in analytic approaches, but should also include
chemists knowledgeable in the use of bioinformatics resources, clinicians and
biologists with training in physiological pathways, and public health professionals
who can advise on optimal translation strategies. The formation and function of these
interdisciplinary research teams will need to be actively supported by funding
agencies, professional organizations and academic centers, through grants and
programs that support collaboration and consortia development. An excellent example
is the Data Science Innovation Labs, currently supported by NIH, which led to the
creation of the authors&#x02019; interdisciplinary group.<sup><xref rid="R75" ref-type="bibr">75</xref></sup> Initially supported by the NIH Big Data to
Knowledge Initiative (BD2K), the Data Science Innovation Labs are an annual
intensive workshop, created specifically to foster the development of
interdisciplinary teams focused on specific biomedical challenges that could benefit
from increased use of data science techniques. Guided by professional facilitators
and experienced mentors, investigators from diverse disciplines form teams to solve
a data science challenge related to a broad biomedical theme.</p><p id="P28">Second, to encourage efficient and timely interdisciplinary research
designs, we recommend greater use of existing biorepositories for research into
adverse birth outcomes. In our proposed framework, we discussed the use of archived
residual NDBS as a potential source for biomarker-based research. New technologies
have enhanced our ability to re-use these public health resources for multi-omics
research. Many states have demonstrated the ability to share these resources with
researchers, while still protecting individuals&#x02019; privacy and ensuring data
security. A greater commitment to sharing data across birth cohorts could also
provide access to larger resources of biospecimens linked with extensive
epidemiologic data. The HELIX study in Europe provides a blueprint for constructing
novel study designs aimed at integrating interdisciplinary data science within
existing birth cohort studies.</p><p id="P29">As a third recommendation, we encourage the cross-training of both
environmental health and data science investigators to improve communication and
integration of skills within interdisciplinary research teams. NIEHS has
acknowledged this need through an approved concept to advance workforce development
for environmental health data science. Their proposed concept included initiatives
to support environmental health training of data scientists and educational
resources for skill development to enhance data-intensive environmental health
research.</p></sec><sec id="S8"><label>5.</label><title>Conclusion</title><p id="P30">Incorporating interdisciplinary data science techniques and greater use of
big public health data into environmental health research can address limitations of
prior research. We call for the creation and support of interdisciplinary research
teams to implement our proposed framework, connecting population-level environmental
exposures to biomarker-based targeted interventions of biological mechanisms. By
leveraging expertise across the domains of environmental health research and
utilizing diverse techniques of data science, we will enable integrative research to
generate translatable knowledge on environmental contributors to the etiologies and
prevention of adverse birth outcomes.</p></sec></body><back><ack id="S9"><title>Acknowledgements</title><p id="P31">The authors acknowledge Dr. Lynda R. Hardy for her contributions to early
discussions of this paper, and the 2019 Data Science Innovation Lab led by Dr. John
Van Horn.</p><p id="P32">Funding sources</p><p id="P33">JAS&#x02019; work was supported in part by NIH/NIEHS (ES027022). JPK&#x02019;s
work was supported in part by the Utah Center for Clinical and Translational Science
funded by NCATS award (1ULTR002538) and the NIH/NLM Training grant (LM007124).
AL&#x02019;s work was supported in part by NIH/NCI (CA220693). JM&#x02019;s work was
supported in part by NIH/NIEHS (ES04696).</p></ack><fn-group><fn id="FN3"><p id="P35">Declaration of interests</p><p id="P36">The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence the
work reported in this paper.</p></fn></fn-group><glossary><title>Abbreviations</title><def-list><def-item><term>NDBS</term><def><p id="P34">newborn dried blood spots</p></def></def-item></def-list></glossary><ref-list><title>References</title><ref id="R1"><label>1.</label><mixed-citation publication-type="book"><name><surname>Hamilton</surname><given-names>B</given-names></name>, <name><surname>Martin</surname><given-names>J</given-names></name>, <name><surname>Osterman</surname><given-names>M</given-names></name>, <name><surname>Rossen</surname><given-names>L</given-names></name>. <source>Births: Provisional Data for 2018</source>.
<publisher-name>National Center for Health Statistics</publisher-name>;
<year>2019</year>.</mixed-citation></ref><ref id="R2"><label>2.</label><mixed-citation publication-type="journal"><name><surname>Goldenberg</surname><given-names>RL</given-names></name>, <name><surname>Culhane</surname><given-names>JF</given-names></name>, <name><surname>Iams</surname><given-names>JD</given-names></name>, <name><surname>Romero</surname><given-names>R</given-names></name>. <article-title>Epidemiology and causes of preterm
birth</article-title>. <source>Lancet Lond Engl</source>.
<year>2008</year>;<volume>371</volume>(<issue>9606</issue>):<fpage>75</fpage>&#x02013;<lpage>84</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/S0140-6736(08)60074-4</pub-id></mixed-citation></ref><ref id="R3"><label>3.</label><mixed-citation publication-type="journal"><name><surname>Burris</surname><given-names>HH</given-names></name>, <name><surname>Hacker</surname><given-names>MR</given-names></name>
<article-title>Birth outcome racial disparities: A result of intersecting social
and environmental factors</article-title>. <source>Semin Perinatol</source>.
<year>2017</year>;<volume>41</volume>(<issue>6</issue>):<fpage>360</fpage>&#x02013;<lpage>366</lpage>.
doi:<pub-id pub-id-type="doi">10.1053/j.semperi.2017.07.002</pub-id><pub-id pub-id-type="pmid">28818300</pub-id></mixed-citation></ref><ref id="R4"><label>4.</label><mixed-citation publication-type="journal"><name><surname>Twilhaar</surname><given-names>ES</given-names></name>, <name><surname>Wade</surname><given-names>RM</given-names></name>, <name><surname>de Kieviet</surname><given-names>JF</given-names></name>, <name><surname>van Goudoever</surname><given-names>JB</given-names></name>, <name><surname>van Elburg</surname><given-names>RM</given-names></name>, <name><surname>Oosterlaan</surname><given-names>J</given-names></name>. <article-title>Cognitive Outcomes of Children Born Extremely or Very
Preterm Since the 1990s and Associated Risk Factors: A Meta-analysis and
Meta-regression</article-title>. <source>JAMA Pediatr</source>.
<year>2018</year>;<volume>172</volume>(<issue>4</issue>):<fpage>361</fpage>&#x02013;<lpage>367</lpage>.
doi:<pub-id pub-id-type="doi">10.1001/jamapediatrics.2017.5323</pub-id><pub-id pub-id-type="pmid">29459939</pub-id></mixed-citation></ref><ref id="R5"><label>5.</label><mixed-citation publication-type="journal"><name><surname>Frey</surname><given-names>HA</given-names></name>, <name><surname>Klebanoff</surname><given-names>MA</given-names></name>. <article-title>The epidemiology, etiology, and costs of preterm
birth</article-title>. <source>Semin Fetal Neonatal Med</source>.
<year>2016</year>;<volume>21</volume>(<issue>2</issue>):<fpage>68</fpage>&#x02013;<lpage>73</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.siny.2015.12.011</pub-id><pub-id pub-id-type="pmid">26794420</pub-id></mixed-citation></ref><ref id="R6"><label>6.</label><mixed-citation publication-type="journal"><name><surname>Petrou</surname><given-names>S</given-names></name>, <name><surname>Sach</surname><given-names>T</given-names></name>, <name><surname>Davidson</surname><given-names>L</given-names></name>. <article-title>The long-term costs of preterm birth and low birth
weight: results of a systematic review</article-title>. <source>Child Care
Health Dev</source>.
<year>2001</year>;<volume>27</volume>(<issue>2</issue>):<fpage>97</fpage>&#x02013;<lpage>115</lpage>.
doi:<pub-id pub-id-type="doi">10.1046/j.1365-2214.2001.00203.x</pub-id><pub-id pub-id-type="pmid">11251610</pub-id></mixed-citation></ref><ref id="R7"><label>7.</label><mixed-citation publication-type="journal"><name><surname>Nieuwenhuijsen</surname><given-names>MJ</given-names></name>, <name><surname>Dadvand</surname><given-names>P</given-names></name>, <name><surname>Grellier</surname><given-names>J</given-names></name>, <name><surname>Martinez</surname><given-names>D</given-names></name>, <name><surname>Vrijheid</surname><given-names>M</given-names></name>. <article-title>Environmental risk factors of pregnancy outcomes: a
summary of recent meta-analyses of epidemiological studies</article-title>.
<source>Environ Health Glob Access Sci Source</source>.
<year>2013</year>;<volume>12</volume>:<fpage>6</fpage>.
doi:<pub-id pub-id-type="doi">10.1186/1476-069X-12-6</pub-id></mixed-citation></ref><ref id="R8"><label>8.</label><mixed-citation publication-type="journal"><name><surname>Stieb</surname><given-names>DM</given-names></name>, <name><surname>Chen</surname><given-names>L</given-names></name>, <name><surname>Eshoul</surname><given-names>M</given-names></name>, <name><surname>Judek</surname><given-names>S</given-names></name>. <article-title>Ambient air pollution, birth weight and preterm birth: a
systematic review and meta-analysis</article-title>. <source>Environ
Res</source>.
<year>2012</year>;<volume>117</volume>:<fpage>100</fpage>&#x02013;<lpage>111</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.envres.2012.05.007</pub-id><pub-id pub-id-type="pmid">22726801</pub-id></mixed-citation></ref><ref id="R9"><label>9.</label><mixed-citation publication-type="journal"><name><surname>Kloog</surname><given-names>I</given-names></name>
<article-title>Air pollution, ambient temperature, green space and preterm
birth</article-title>. <source>Curr Opin Pediatr</source>.
<year>2019</year>;<volume>31</volume>(<issue>2</issue>):<fpage>237</fpage>&#x02013;<lpage>243</lpage>.
doi:<pub-id pub-id-type="doi">10.1097/MOP.0000000000000736</pub-id><pub-id pub-id-type="pmid">30640892</pub-id></mixed-citation></ref><ref id="R10"><label>10.</label><mixed-citation publication-type="journal"><name><surname>Klepac</surname><given-names>P</given-names></name>, <name><surname>Locatelli</surname><given-names>I</given-names></name>, <name><surname>Koro&#x00161;ec</surname><given-names>S</given-names></name>, <name><surname>K&#x000fc;nzli</surname><given-names>N</given-names></name>, <name><surname>Kukec</surname><given-names>A</given-names></name>. <article-title>Ambient air pollution and pregnancy outcomes: A
comprehensive review and identification of environmental public health
challenges</article-title>. <source>Environ Res</source>.
<year>2018</year>;<volume>167</volume>:<fpage>144</fpage>&#x02013;<lpage>159</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.envres.2018.07.008</pub-id><pub-id pub-id-type="pmid">30014896</pub-id></mixed-citation></ref><ref id="R11"><label>11.</label><mixed-citation publication-type="journal"><name><surname>Nieuwenhuijsen</surname><given-names>MJ</given-names></name>, <name><surname>Ristovska</surname><given-names>G</given-names></name>, <name><surname>Dadvand</surname><given-names>P</given-names></name>. <article-title>WHO Environmental Noise Guidelines for the European
Region: A Systematic Review on Environmental Noise and Adverse Birth
Outcomes</article-title>. <source>Int J Environ Res Public Health</source>.
<year>2017</year>;<volume>14</volume>(<issue>10</issue>).
doi:<pub-id pub-id-type="doi">10.3390/ijerph14101252</pub-id></mixed-citation></ref><ref id="R12"><label>12.</label><mixed-citation publication-type="journal"><name><surname>Stillerman</surname><given-names>KP</given-names></name>, <name><surname>Mattison</surname><given-names>DR</given-names></name>, <name><surname>Giudice</surname><given-names>LC</given-names></name>, <name><surname>Woodruff</surname><given-names>TJ</given-names></name>. <article-title>Environmental exposures and adverse pregnancy outcomes:
a review of the science</article-title>. <source>Reprod Sci Thousand Oaks
Calif</source>.
<year>2008</year>;<volume>15</volume>(<issue>7</issue>):<fpage>631</fpage>&#x02013;<lpage>650</lpage>.
doi:<pub-id pub-id-type="doi">10.1177/1933719108322436</pub-id></mixed-citation></ref><ref id="R13"><label>13.</label><mixed-citation publication-type="journal"><name><surname>Kuehn</surname><given-names>L</given-names></name>, <name><surname>McCormick</surname><given-names>S</given-names></name>. <article-title>Heat Exposure and Maternal Health in the Face of Climate
Change</article-title>. <source>Int J Environ Res Public Health</source>.
<year>2017</year>;<volume>14</volume>(<issue>8</issue>).
doi:<pub-id pub-id-type="doi">10.3390/ijerph14080853</pub-id></mixed-citation></ref><ref id="R14"><label>14.</label><mixed-citation publication-type="journal"><name><surname>Shirangi</surname><given-names>A</given-names></name>, <name><surname>Nieuwenhuijsen</surname><given-names>M</given-names></name>, <name><surname>Vienneau</surname><given-names>D</given-names></name>, <name><surname>Holman</surname><given-names>CDJ</given-names></name>. <article-title>Living near agricultural pesticide applications and the
risk of adverse reproductive outcomes: a review of the
literature</article-title>. <source>Paediatr Perinat Epidemiol</source>.
<year>2011</year>;<volume>25</volume>(<issue>2</issue>):<fpage>172</fpage>&#x02013;<lpage>191</lpage>.
doi:<pub-id pub-id-type="doi">10.1111/j.1365-3016.2010.01165.x</pub-id><pub-id pub-id-type="pmid">21281330</pub-id></mixed-citation></ref><ref id="R15"><label>15.</label><mixed-citation publication-type="journal"><name><surname>Patel</surname><given-names>CJ</given-names></name>. <article-title>Analytic Complexity and Challenges in Identifying
Mixtures of Exposures Associated with Phenotypes in the Exposome
Era</article-title>. <source>Curr Epidemiol Rep</source>.
<year>2017</year>;<volume>4</volume>(<issue>1</issue>):<fpage>22</fpage>&#x02013;<lpage>30</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s40471-017-0100-5</pub-id><pub-id pub-id-type="pmid">28251040</pub-id></mixed-citation></ref><ref id="R16"><label>16.</label><mixed-citation publication-type="journal"><name><surname>Vadillo-Ortega</surname><given-names>F</given-names></name>, <name><surname>Osornio-Vargas</surname><given-names>A</given-names></name>, <name><surname>Buxton</surname><given-names>MA</given-names></name>, <etal/>
<article-title>Air pollution, inflammation and preterm birth: a potential
mechanistic link</article-title>. <source>Med Hypotheses</source>.
<year>2014</year>;<volume>82</volume>(<issue>2</issue>):<fpage>219</fpage>&#x02013;<lpage>224</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.mehy.2013.11.042</pub-id><pub-id pub-id-type="pmid">24382337</pub-id></mixed-citation></ref><ref id="R17"><label>17.</label><mixed-citation publication-type="journal"><name><surname>Zhang</surname><given-names>Y</given-names></name>, <name><surname>Wang</surname><given-names>J</given-names></name>, <name><surname>Gong</surname><given-names>X</given-names></name>, <etal/>. <article-title>Ambient PM2.5 exposures and systemic biomarkers
of lipid peroxidation and total antioxidant capacity in early
pregnancy</article-title>. <source>Environ Pollut Barking Essex
1987</source>. <year>2020</year>;<volume>266</volume>(Pt
<issue>2</issue>):115301. doi:<pub-id pub-id-type="doi">10.1016/j.envpol.2020.115301</pub-id></mixed-citation></ref><ref id="R18"><label>18.</label><mixed-citation publication-type="journal"><name><surname>Jones</surname><given-names>KH</given-names></name>, <name><surname>Ford</surname><given-names>DV</given-names></name>. <article-title>Population data science: advancing the safe use of
population data for public benefit</article-title>. <source>Epidemiol
Health</source>. <year>2018</year>;<volume>40</volume>:e2018061.
doi:<pub-id pub-id-type="doi">10.4178/epih.e2018061</pub-id></mixed-citation></ref><ref id="R19"><label>19.</label><mixed-citation publication-type="journal"><name><surname>Choirat</surname><given-names>C</given-names></name>, <name><surname>Braun</surname><given-names>D</given-names></name>, <name><surname>Kioumourtzoglou</surname><given-names>M-A</given-names></name>. <article-title>Data Science in Environmental Health
Research</article-title>. <source>Curr Epidemiol Rep</source>.
<year>2019</year>;<volume>6</volume>(<issue>3</issue>):<fpage>291</fpage>&#x02013;<lpage>299</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s40471-019-00205-5</pub-id><pub-id pub-id-type="pmid">31723546</pub-id></mixed-citation></ref><ref id="R20"><label>20.</label><mixed-citation publication-type="journal"><name><surname>Stieb</surname><given-names>DM</given-names></name>, <name><surname>Boot</surname><given-names>CR</given-names></name>, <name><surname>Turner</surname><given-names>MC</given-names></name>. <article-title>Promise and pitfalls in the application of big data to
occupational and environmental health</article-title>. <source>BMC Public
Health</source>.
<year>2017</year>;<volume>17</volume>(<issue>1</issue>):<fpage>372</fpage>.
doi:<pub-id pub-id-type="doi">10.1186/s12889-017-4286-8</pub-id><pub-id pub-id-type="pmid">28482822</pub-id></mixed-citation></ref><ref id="R21"><label>21.</label><mixed-citation publication-type="journal"><name><surname>Wild</surname><given-names>CP</given-names></name>. <article-title>Complementing the genome with an
&#x0201c;exposome&#x0201d;: the outstanding challenge of environmental
exposure measurement in molecular epidemiology</article-title>.
<source>Cancer Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored
Am Soc Prev Oncol</source>.
<year>2005</year>;<volume>14</volume>(<issue>8</issue>):<fpage>1847</fpage>&#x02013;<lpage>1850</lpage>.
doi:<pub-id pub-id-type="doi">10.1158/1055-9965.EPI-05-0456</pub-id></mixed-citation></ref><ref id="R22"><label>22.</label><mixed-citation publication-type="journal"><name><surname>Wild</surname><given-names>CP</given-names></name>. <article-title>The exposome: from concept to utility</article-title>.
<source>Int J Epidemiol</source>.
<year>2012</year>;<volume>41</volume>(<issue>1</issue>):<fpage>24</fpage>&#x02013;<lpage>32</lpage>.
doi:<pub-id pub-id-type="doi">10.1093/ije/dyr236</pub-id><pub-id pub-id-type="pmid">22296988</pub-id></mixed-citation></ref><ref id="R23"><label>23.</label><mixed-citation publication-type="journal"><name><surname>Rappaport</surname><given-names>SM</given-names></name>, <name><surname>Smith</surname><given-names>MT</given-names></name>. <article-title>Epidemiology. Environment and disease
risks</article-title>. <source>Science</source>.
<year>2010</year>;<volume>330</volume>(<issue>6003</issue>):<fpage>460</fpage>&#x02013;<lpage>461</lpage>.
doi:<pub-id pub-id-type="doi">10.1126/science.1192603</pub-id><pub-id pub-id-type="pmid">20966241</pub-id></mixed-citation></ref><ref id="R24"><label>24.</label><mixed-citation publication-type="journal"><name><surname>Miller</surname><given-names>GW</given-names></name>, <name><surname>Jones</surname><given-names>DP</given-names></name>. <article-title>The nature of nurture: refining the definition of the
exposome</article-title>. <source>Toxicol Sci Off J Soc Toxicol</source>.
<year>2014</year>;<volume>137</volume>(<issue>1</issue>):<fpage>1</fpage>&#x02013;<lpage>2</lpage>.
doi:<pub-id pub-id-type="doi">10.1093/toxsci/kft251</pub-id></mixed-citation></ref><ref id="R25"><label>25.</label><mixed-citation publication-type="journal"><name><surname>Siroux</surname><given-names>V</given-names></name>, <name><surname>Agier</surname><given-names>L</given-names></name>, <name><surname>Slama</surname><given-names>R</given-names></name>. <article-title>The exposome concept: a challenge and a potential driver
for environmental health research</article-title>. <source>Eur Respir Rev
Off J Eur Respir Soc</source>.
<year>2016</year>;<volume>25</volume>(<issue>140</issue>):<fpage>124</fpage>&#x02013;<lpage>129</lpage>.
doi:<pub-id pub-id-type="doi">10.1183/16000617.0034-2016</pub-id></mixed-citation></ref><ref id="R26"><label>26.</label><mixed-citation publication-type="journal"><name><surname>Stingone</surname><given-names>JA</given-names></name>, <name><surname>Buck Louis</surname><given-names>GM</given-names></name>, <name><surname>Nakayama</surname><given-names>SF</given-names></name>, <etal/>
<article-title>Toward Greater Implementation of the Exposome Research Paradigm
within Environmental Epidemiology</article-title>. <source>Annu Rev Public
Health</source>.
<year>2017</year>;<volume>38</volume>:<fpage>315</fpage>&#x02013;<lpage>327</lpage>.
doi:<pub-id pub-id-type="doi">10.1146/annurev-publhealth-082516-012750</pub-id><pub-id pub-id-type="pmid">28125387</pub-id></mixed-citation></ref><ref id="R27"><label>27.</label><mixed-citation publication-type="journal"><name><surname>Oskar</surname><given-names>S</given-names></name>, <name><surname>Stingone</surname><given-names>JA</given-names></name>. <article-title>Machine Learning Within Studies of Early-Life
Environmental Exposures and Child Health: Review of the Current Literature
and Discussion of Next Steps</article-title>. <source>Curr Environ Health
Rep</source>.
<year>2020</year>;<volume>7</volume>(<issue>3</issue>):<fpage>170</fpage>&#x02013;<lpage>184</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s40572-020-00282-5</pub-id><pub-id pub-id-type="pmid">32578067</pub-id></mixed-citation></ref><ref id="R28"><label>28.</label><mixed-citation publication-type="journal"><name><surname>Maitre</surname><given-names>L</given-names></name>, <name><surname>de Bont</surname><given-names>J</given-names></name>, <name><surname>Casas</surname><given-names>M</given-names></name>, <etal/>
<article-title>Human Early Life Exposome (HELIX) study: a European
population-based exposome cohort</article-title>. <source>BMJ Open</source>.
<year>2018</year>;<volume>8</volume>(<issue>9</issue>):e021311.
doi:<pub-id pub-id-type="doi">10.1136/bmjopen-2017-021311</pub-id></mixed-citation></ref><ref id="R29"><label>29.</label><mixed-citation publication-type="journal"><name><surname>Robinson</surname><given-names>O</given-names></name>, <name><surname>Vrijheid</surname><given-names>M</given-names></name>. <article-title>The Pregnancy Exposome</article-title>. <source>Curr
Environ Health Rep</source>.
<year>2015</year>;<volume>2</volume>(<issue>2</issue>):<fpage>204</fpage>&#x02013;<lpage>213</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s40572-015-0043-2</pub-id><pub-id pub-id-type="pmid">26231368</pub-id></mixed-citation></ref><ref id="R30"><label>30.</label><mixed-citation publication-type="journal"><name><surname>Shaw</surname><given-names>GM</given-names></name>, <name><surname>Yang</surname><given-names>W</given-names></name>, <name><surname>Roberts</surname><given-names>EM</given-names></name>, <etal/>
<article-title>Residential Agricultural Pesticide Exposures and Risks of
Spontaneous Preterm Birth</article-title>. <source>Epidemiol Camb
Mass</source>.
<year>2018</year>;<volume>29</volume>(<issue>1</issue>):<fpage>8</fpage>&#x02013;<lpage>21</lpage>.
doi:<pub-id pub-id-type="doi">10.1097/EDE.0000000000000757</pub-id></mixed-citation></ref><ref id="R31"><label>31.</label><mixed-citation publication-type="journal"><name><surname>Darrow</surname><given-names>LA</given-names></name>, <name><surname>Klein</surname><given-names>M</given-names></name>, <name><surname>Strickland</surname><given-names>MJ</given-names></name>, <name><surname>Mulholland</surname><given-names>JA</given-names></name>, <name><surname>Tolbert</surname><given-names>PE</given-names></name>. <article-title>Ambient air pollution and birth weight in full-term
infants in Atlanta, 1994&#x02013;2004</article-title>. <source>Environ Health
Perspect</source>.
<year>2011</year>;<volume>119</volume>(<issue>5</issue>):<fpage>731</fpage>&#x02013;<lpage>737</lpage>.
doi:<pub-id pub-id-type="doi">10.1289/ehp.1002785</pub-id><pub-id pub-id-type="pmid">21156397</pub-id></mixed-citation></ref><ref id="R32"><label>32.</label><mixed-citation publication-type="journal"><name><surname>Liang</surname><given-names>Z</given-names></name>, <name><surname>Lin</surname><given-names>Y</given-names></name>, <name><surname>Ma</surname><given-names>Y</given-names></name>, <etal/>. <article-title>The association between ambient temperature and
preterm birth in Shenzhen, China: a distributed lag non-linear time series
analysis</article-title>. <source>Environ Health Glob Access Sci
Source</source>.
<year>2016</year>;<volume>15</volume>(<issue>1</issue>):<fpage>84</fpage>.
doi:<pub-id pub-id-type="doi">10.1186/s12940-016-0166-4</pub-id></mixed-citation></ref><ref id="R33"><label>33.</label><mixed-citation publication-type="journal"><name><surname>Sheridan</surname><given-names>P</given-names></name>, <name><surname>Ilango</surname><given-names>S</given-names></name>, <name><surname>Bruckner</surname><given-names>TA</given-names></name>, <name><surname>Wang</surname><given-names>Q</given-names></name>, <name><surname>Basu</surname><given-names>R</given-names></name>, <name><surname>Benmarhnia</surname><given-names>T</given-names></name>. <article-title>Ambient Fine Particulate Matter and Preterm Birth in
California: Identification of Critical Exposure Windows</article-title>.
<source>Am J Epidemiol</source>.
<year>2019</year>;<volume>188</volume>(<issue>9</issue>):<fpage>1608</fpage>&#x02013;<lpage>1615</lpage>.
doi:<pub-id pub-id-type="doi">10.1093/aje/kwz120</pub-id><pub-id pub-id-type="pmid">31107509</pub-id></mixed-citation></ref><ref id="R34"><label>34.</label><mixed-citation publication-type="journal"><name><surname>Ashley-Martin</surname><given-names>J</given-names></name>, <name><surname>Lavigne</surname><given-names>E</given-names></name>, <name><surname>Arbuckle</surname><given-names>TE</given-names></name>, <etal/>
<article-title>Air Pollution During Pregnancy and Cord Blood Immune System
Biomarkers</article-title>. <source>J Occup Environ Med</source>.
<year>2016</year>;<volume>58</volume>(<issue>10</issue>):<fpage>979</fpage>&#x02013;<lpage>986</lpage>.
doi:<pub-id pub-id-type="doi">10.1097/JOM.0000000000000841</pub-id><pub-id pub-id-type="pmid">27483336</pub-id></mixed-citation></ref><ref id="R35"><label>35.</label><mixed-citation publication-type="journal"><name><surname>Minatoya</surname><given-names>M</given-names></name>, <name><surname>Itoh</surname><given-names>S</given-names></name>, <name><surname>Miyashita</surname><given-names>C</given-names></name>, <etal/>
<article-title>Association of prenatal exposure to perfluoroalkyl substances
with cord blood adipokines and birth size: The Hokkaido Study on environment
and children&#x02019;s health</article-title>. <source>Environ Res</source>.
<year>2017</year>;<volume>156</volume>:<fpage>175</fpage>&#x02013;<lpage>182</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.envres.2017.03.033</pub-id><pub-id pub-id-type="pmid">28349882</pub-id></mixed-citation></ref><ref id="R36"><label>36.</label><mixed-citation publication-type="journal"><name><surname>Haraux</surname><given-names>E</given-names></name>, <name><surname>Tourneux</surname><given-names>P</given-names></name>, <name><surname>Kouakam</surname><given-names>C</given-names></name>, <etal/>
<article-title>Isolated hypospadias: The impact of prenatal exposure to
pesticides, as determined by meconium analysis</article-title>.
<source>Environ Int</source>.
<year>2018</year>;<volume>119</volume>:<fpage>20</fpage>&#x02013;<lpage>25</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.envint.2018.06.002</pub-id><pub-id pub-id-type="pmid">29929047</pub-id></mixed-citation></ref><ref id="R37"><label>37.</label><mixed-citation publication-type="journal"><name><surname>Zidek</surname><given-names>JV</given-names></name>,
<name><surname>Wong</surname><given-names>H</given-names></name>, <name><surname>Le</surname><given-names>ND</given-names></name>, <name><surname>Burnett</surname><given-names>R</given-names></name>. <source>Causality, measurement error and multicollinearity in
epidemiology</source>.
<year>1996</year>;<volume>7</volume>(<issue>4</issue>):<fpage>441</fpage>&#x02013;<lpage>451</lpage>.</mixed-citation></ref><ref id="R38"><label>38.</label><mixed-citation publication-type="journal"><name><surname>Lopes</surname><given-names>P</given-names></name>, <name><surname>Silva</surname><given-names>LB</given-names></name>, <name><surname>Oliveira</surname><given-names>JL</given-names></name>. <article-title>Challenges and Opportunities for Exploring Patient-Level
Data</article-title>. <source>BioMed Res Int</source>.
<year>2015</year>;<volume>2015</volume>:150435. doi:<pub-id pub-id-type="doi">10.1155/2015/150435</pub-id></mixed-citation></ref><ref id="R39"><label>39.</label><mixed-citation publication-type="journal"><name><surname>Alyass</surname><given-names>A</given-names></name>, <name><surname>Turcotte</surname><given-names>M</given-names></name>, <name><surname>Meyre</surname><given-names>D</given-names></name>. <article-title>From big data analysis to personalized medicine for all:
challenges and opportunities</article-title>. <source>BMC Med
Genomics</source>. <year>2015</year>;<volume>8</volume>:<fpage>33</fpage>.
doi:<pub-id pub-id-type="doi">10.1186/s12920-015-0108-y</pub-id><pub-id pub-id-type="pmid">26112054</pub-id></mixed-citation></ref><ref id="R40"><label>40.</label><mixed-citation publication-type="journal"><name><surname>Gamache</surname><given-names>R</given-names></name>, <name><surname>Kharrazi</surname><given-names>H</given-names></name>, <name><surname>Weiner</surname><given-names>JP</given-names></name>. <article-title>Public and Population Health Informatics: The Bridging
of Big Data to Benefit Communities</article-title>. <source>Yearb Med
Inform</source>.
<year>2018</year>;<volume>27</volume>(<issue>1</issue>):<fpage>199</fpage>&#x02013;<lpage>206</lpage>.
doi:<pub-id pub-id-type="doi">10.1055/s-0038-1667081</pub-id><pub-id pub-id-type="pmid">30157524</pub-id></mixed-citation></ref><ref id="R41"><label>41.</label><mixed-citation publication-type="book"><collab>NIEHS</collab>.
<source>2018&#x02013;2023 Strategic Plan. Advancing Environmental Health
Sciences. Improving Health</source>. <publisher-name>National Institutes of
Health US Department of Health and Human
Services</publisher-name></mixed-citation></ref><ref id="R42"><label>42.</label><mixed-citation publication-type="journal"><name><surname>Jutte</surname><given-names>DP</given-names></name>, <name><surname>Roos</surname><given-names>LL</given-names></name>, <name><surname>Brownell</surname><given-names>MD</given-names></name>. <article-title>Administrative record linkage as a tool for public
health research</article-title>. <source>Annu Rev Public Health</source>.
<year>2011</year>;<volume>32</volume>:<fpage>91</fpage>&#x02013;<lpage>108</lpage>.
doi:<pub-id pub-id-type="doi">10.1146/annurev-publhealth-031210-100700</pub-id><pub-id pub-id-type="pmid">21219160</pub-id></mixed-citation></ref><ref id="R43"><label>43.</label><mixed-citation publication-type="journal"><name><surname>Li</surname><given-names>X</given-names></name>, <name><surname>Sundquist</surname><given-names>J</given-names></name>, <name><surname>Sundquist</surname><given-names>K</given-names></name>. <article-title>Parental occupation and risk of
small-for-gestational-age births: a nationwide epidemiological study in
Sweden</article-title>. <source>Hum Reprod Oxf Engl</source>.
<year>2010</year>;<volume>25</volume>(<issue>4</issue>):<fpage>1044</fpage>&#x02013;<lpage>1050</lpage>.
doi:<pub-id pub-id-type="doi">10.1093/humrep/deq004</pub-id></mixed-citation></ref><ref id="R44"><label>44.</label><mixed-citation publication-type="journal"><name><surname>Lunde</surname><given-names>A</given-names></name>, <name><surname>Melve</surname><given-names>KK</given-names></name>, <name><surname>Gjessing</surname><given-names>HK</given-names></name>, <name><surname>Skjaerven</surname><given-names>R</given-names></name>, <name><surname>Irgens</surname><given-names>LM</given-names></name>.
<article-title>Genetic and environmental influences on birth weight, birth
length, head circumference, and gestational age by use of population-based
parent-offspring data</article-title>. <source>Am J Epidemiol</source>.
<year>2007</year>;<volume>165</volume>(<issue>7</issue>):<fpage>734</fpage>&#x02013;<lpage>741</lpage>.
doi:<pub-id pub-id-type="doi">10.1093/aje/kwk107</pub-id><pub-id pub-id-type="pmid">17311798</pub-id></mixed-citation></ref><ref id="R45"><label>45.</label><mixed-citation publication-type="journal"><name><surname>Gong</surname><given-names>T</given-names></name>, <name><surname>Dalman</surname><given-names>C</given-names></name>, <name><surname>Wicks</surname><given-names>S</given-names></name>, <etal/>
<article-title>Perinatal Exposure to Traffic-Related Air Pollution and Autism
Spectrum Disorders</article-title>. <source>Environ Health
Perspect</source>.
<year>2017</year>;<volume>125</volume>(<issue>1</issue>):<fpage>119</fpage>&#x02013;<lpage>126</lpage>.
doi:<pub-id pub-id-type="doi">10.1289/EHP118</pub-id><pub-id pub-id-type="pmid">27494442</pub-id></mixed-citation></ref><ref id="R46"><label>46.</label><mixed-citation publication-type="journal"><name><surname>Kim</surname><given-names>SY</given-names></name>, <name><surname>Ahuja</surname><given-names>S</given-names></name>, <name><surname>Stampfel</surname><given-names>C</given-names></name>, <name><surname>Williamson</surname><given-names>D</given-names></name>. <article-title>Are Birth Certificate and Hospital Discharge Linkages
Performed in 52 Jurisdictions in the United States?</article-title>
<source>Matern Child Health J</source>.
<year>2015</year>;<volume>19</volume>(<issue>12</issue>):<fpage>2615</fpage>&#x02013;<lpage>2620</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s10995-015-1780-4</pub-id><pub-id pub-id-type="pmid">26140836</pub-id></mixed-citation></ref><ref id="R47"><label>47.</label><mixed-citation publication-type="journal"><name><surname>Turner</surname><given-names>MC</given-names></name>, <name><surname>Nieuwenhuijsen</surname><given-names>M</given-names></name>, <name><surname>Anderson</surname><given-names>K</given-names></name>, <etal/>
<article-title>Assessing the Exposome with External Measures: Commentary on the
State of the Science and Research Recommendations</article-title>.
<source>Annu Rev Public Health</source>.
<year>2017</year>;<volume>38</volume>:<fpage>215</fpage>&#x02013;<lpage>239</lpage>.
doi:<pub-id pub-id-type="doi">10.1146/annurev-publhealth-082516-012802</pub-id><pub-id pub-id-type="pmid">28384083</pub-id></mixed-citation></ref><ref id="R48"><label>48.</label><mixed-citation publication-type="journal"><name><surname>Krieger</surname><given-names>N</given-names></name>, <name><surname>Van Wye</surname><given-names>G</given-names></name>, <name><surname>Huynh</surname><given-names>M</given-names></name>, <etal/>
<article-title>Structural Racism, Historical Redlining, and Risk of Preterm
Birth in New York City, 2013&#x02013;2017</article-title>. <source>Am J
Public Health</source>.
<year>2020</year>;<volume>110</volume>(<issue>7</issue>):<fpage>1046</fpage>&#x02013;<lpage>1053</lpage>.
doi:<pub-id pub-id-type="doi">10.2105/AJPH.2020.305656</pub-id><pub-id pub-id-type="pmid">32437270</pub-id></mixed-citation></ref><ref id="R49"><label>49.</label><mixed-citation publication-type="journal"><name><surname>Mendez</surname><given-names>DD</given-names></name>, <name><surname>Hogan</surname><given-names>VK</given-names></name>, <name><surname>Culhane</surname><given-names>JF</given-names></name>. <article-title>Institutional racism, neighborhood factors, stress, and
preterm birth</article-title>. <source>Ethn Health</source>.
<year>2014</year>;<volume>19</volume>(<issue>5</issue>):<fpage>479</fpage>&#x02013;<lpage>499</lpage>.
doi:<pub-id pub-id-type="doi">10.1080/13557858.2013.846300</pub-id><pub-id pub-id-type="pmid">24134165</pub-id></mixed-citation></ref><ref id="R50"><label>50.</label><mixed-citation publication-type="journal"><name><surname>Leifheit</surname><given-names>KM</given-names></name>, <name><surname>Schwartz</surname><given-names>GL</given-names></name>, <name><surname>Pollack</surname><given-names>CE</given-names></name>, <etal/>
<article-title>Severe Housing Insecurity during Pregnancy: Association with
Adverse Birth and Infant Outcomes</article-title>. <source>Int J Environ Res
Public Health</source>.
<year>2020</year>;<volume>17</volume>(<issue>22</issue>).
doi:<pub-id pub-id-type="doi">10.3390/ijerph17228659</pub-id></mixed-citation></ref><ref id="R51"><label>51.</label><mixed-citation publication-type="journal"><name><surname>DePasquale</surname><given-names>JM</given-names></name>, <name><surname>Freeman</surname><given-names>K</given-names></name>, <name><surname>Amin</surname><given-names>MM</given-names></name>, <etal/>
<article-title>Efficient Linking of Birth Certificate and Newborn Screening
Databases for Laboratory Investigation of Congenital Cytomegalovirus
Infection and Preterm Birth: Florida, 2008</article-title>. <source>Matern
Child Health J</source>.
<year>2012</year>;<volume>16</volume>(<issue>2</issue>):<fpage>486</fpage>&#x02013;<lpage>494</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s10995-010-0740-2</pub-id><pub-id pub-id-type="pmid">21203810</pub-id></mixed-citation></ref><ref id="R52"><label>52.</label><mixed-citation publication-type="journal"><name><surname>Rothwell</surname><given-names>E</given-names></name>, <name><surname>Johnson</surname><given-names>E</given-names></name>, <name><surname>Riches</surname><given-names>N</given-names></name>, <name><surname>Botkin</surname><given-names>JR</given-names></name>. <article-title>Secondary research uses of residual newborn screening
dried bloodspots: a scoping review</article-title>. <source>Genet
Med</source>.
<year>2019</year>;<volume>21</volume>(<issue>7</issue>):<fpage>1469</fpage>&#x02013;<lpage>1475</lpage>.
doi:<pub-id pub-id-type="doi">10.1038/s41436-018-0387-8</pub-id><pub-id pub-id-type="pmid">30531811</pub-id></mixed-citation></ref><ref id="R53"><label>53.</label><mixed-citation publication-type="journal"><name><surname>Funk</surname><given-names>WE</given-names></name>, <name><surname>Waidyanatha</surname><given-names>S</given-names></name>, <name><surname>Chaing</surname><given-names>SH</given-names></name>, <name><surname>Rappaport</surname><given-names>SM</given-names></name>. <article-title>Hemoglobin adducts of benzene oxide in neonatal and
adult dried blood spots</article-title>. <source>Cancer Epidemiol Biomark
Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev Oncol</source>.
<year>2008</year>;<volume>17</volume>(<issue>8</issue>):<fpage>1896</fpage>&#x02013;<lpage>1901</lpage>.
doi:<pub-id pub-id-type="doi">10.1158/1055-9965.EPI-08-0356</pub-id></mixed-citation></ref><ref id="R54"><label>54.</label><mixed-citation publication-type="journal"><name><surname>Funk</surname><given-names>WE</given-names></name>, <name><surname>McGee</surname><given-names>JK</given-names></name>, <name><surname>Olshan</surname><given-names>AF</given-names></name>, <name><surname>Ghio</surname><given-names>AJ</given-names></name>. <article-title>Quantification of arsenic, lead, mercury and cadmium in
newborn dried blood spots</article-title>. <source>Biomark Biochem Indic
Expo Response Susceptibility Chem</source>.
<year>2013</year>;<volume>18</volume>(<issue>2</issue>):<fpage>174</fpage>&#x02013;<lpage>177</lpage>.
doi:<pub-id pub-id-type="doi">10.3109/1354750X.2012.750379</pub-id></mixed-citation></ref><ref id="R55"><label>55.</label><mixed-citation publication-type="journal"><name><surname>Funk</surname><given-names>WE</given-names></name>. <article-title>Use of Dried Blood Spots for Estimating Children&#x02019;s
Exposures to Heavy Metals in Epidemiological Research</article-title>.
<source>J Environ Anal Toxicol</source>.
<year>2015</year>;<volume>s7</volume>. doi:<pub-id pub-id-type="doi">10.4172/2161-0525.S7-002</pub-id></mixed-citation></ref><ref id="R56"><label>56.</label><mixed-citation publication-type="journal"><name><surname>Asrani</surname><given-names>K</given-names></name>, <name><surname>Shaw</surname><given-names>GM</given-names></name>, <name><surname>Rine</surname><given-names>J</given-names></name>, <name><surname>Marini</surname><given-names>NJ</given-names></name>. <article-title>DNA Methylome Profiling on the Infinium
HumanMethylation450 Array from Limiting Quantities of Genomic DNA from a
Single, Small Archived Bloodspot</article-title>. <source>Genet Test Mol
Biomark</source>.
<year>2017</year>;<volume>21</volume>(<issue>8</issue>):<fpage>516</fpage>&#x02013;<lpage>519</lpage>.
doi:<pub-id pub-id-type="doi">10.1089/gtmb.2017.0019</pub-id></mixed-citation></ref><ref id="R57"><label>57.</label><mixed-citation publication-type="journal"><name><surname>Petrick</surname><given-names>L</given-names></name>, <name><surname>Edmands</surname><given-names>W</given-names></name>, <name><surname>Schiffman</surname><given-names>C</given-names></name>, <etal/>
<article-title>An untargeted metabolomics method for archived newborn dried
blood spots in epidemiologic studies</article-title>. <source>Metabolomics
Off J Metabolomic Soc</source>.
<year>2017</year>;<volume>13</volume>(<issue>3</issue>).
doi:<pub-id pub-id-type="doi">10.1007/s11306-016-1153-z</pub-id></mixed-citation></ref><ref id="R58"><label>58.</label><mixed-citation publication-type="journal"><name><surname>Yano</surname><given-names>Y</given-names></name>, <name><surname>Grigoryan</surname><given-names>H</given-names></name>, <name><surname>Schiffman</surname><given-names>C</given-names></name>, <etal/>
<article-title>Untargeted adductomics of Cys34 modifications to human serum
albumin in newborn dried blood spots</article-title>. <source>Anal Bioanal
Chem</source>.
<year>2019</year>;<volume>411</volume>(<issue>11</issue>):<fpage>2351</fpage>&#x02013;<lpage>2362</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s00216-019-01675-8</pub-id><pub-id pub-id-type="pmid">30783713</pub-id></mixed-citation></ref><ref id="R59"><label>59.</label><mixed-citation publication-type="journal"><name><surname>Gonseth</surname><given-names>S</given-names></name>, <name><surname>Shaw</surname><given-names>GM</given-names></name>, <name><surname>Roy</surname><given-names>R</given-names></name>, <etal/>. <article-title>Epigenomic profiling of newborns with isolated
orofacial clefts reveals widespread DNA methylation changes and implicates
metastable epiallele regions in disease risk</article-title>.
<source>Epigenetics</source>.
<year>2019</year>;<volume>14</volume>(<issue>2</issue>):<fpage>198</fpage>&#x02013;<lpage>213</lpage>.
doi:<pub-id pub-id-type="doi">10.1080/15592294.2019.1581591</pub-id><pub-id pub-id-type="pmid">30870065</pub-id></mixed-citation></ref><ref id="R60"><label>60.</label><mixed-citation publication-type="journal"><name><surname>Ma</surname><given-names>W-L</given-names></name>, <name><surname>Gao</surname><given-names>C</given-names></name>, <name><surname>Bell</surname><given-names>EM</given-names></name>, <etal/>
<article-title>Analysis of polychlorinated biphenyls and organochlorine
pesticides in archived dried blood spots and its application to track
temporal trends of environmental chemicals in newborns</article-title>.
<source>Environ Res</source>.
<year>2014</year>;<volume>133</volume>:<fpage>204</fpage>&#x02013;<lpage>210</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.envres.2014.05.029</pub-id><pub-id pub-id-type="pmid">24968082</pub-id></mixed-citation></ref><ref id="R61"><label>61.</label><mixed-citation publication-type="journal"><name><surname>Buckley</surname><given-names>JP</given-names></name>, <name><surname>Barrett</surname><given-names>ES</given-names></name>, <name><surname>Beamer</surname><given-names>PI</given-names></name>, <etal/>
<article-title>Opportunities for evaluating chemical exposures and child health
in the United States: the Environmental influences on Child Health Outcomes
(ECHO) Program</article-title>. <source>J Expo Sci Environ
Epidemiol</source>.
<year>2020</year>;<volume>30</volume>(<issue>3</issue>):<fpage>397</fpage>&#x02013;<lpage>419</lpage>.
doi:<pub-id pub-id-type="doi">10.1038/s41370-020-0211-9</pub-id><pub-id pub-id-type="pmid">32066883</pub-id></mixed-citation></ref><ref id="R62"><label>62.</label><mixed-citation publication-type="journal"><name><surname>Taylor</surname><given-names>KW</given-names></name>, <name><surname>Joubert</surname><given-names>BR</given-names></name>, <name><surname>Braun</surname><given-names>JM</given-names></name>, <etal/>
<article-title>Statistical Approaches for Assessing Health Effects of
Environmental Chemical Mixtures in Epidemiology: Lessons from an Innovative
Workshop</article-title>. <source>Environ Health Perspect</source>.
<year>2016</year>;<volume>124</volume>(<issue>12</issue>):<fpage>A227</fpage>&#x02013;<lpage>A229</lpage>.
doi:<pub-id pub-id-type="doi">10.1289/EHP547</pub-id><pub-id pub-id-type="pmid">27905274</pub-id></mixed-citation></ref><ref id="R63"><label>63.</label><mixed-citation publication-type="journal"><name><surname>Gibson</surname><given-names>EA</given-names></name>, <name><surname>Goldsmith</surname><given-names>J</given-names></name>, <name><surname>Kioumourtzoglou</surname><given-names>M-A</given-names></name>. <article-title>Complex Mixtures, Complex Analyses: an Emphasis on
Interpretable Results</article-title>. <source>Curr Environ Health
Rep</source>.
<year>2019</year>;<volume>6</volume>(<issue>2</issue>):<fpage>53</fpage>&#x02013;<lpage>61</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s40572-019-00229-5</pub-id><pub-id pub-id-type="pmid">31069725</pub-id></mixed-citation></ref><ref id="R64"><label>64.</label><mixed-citation publication-type="journal"><name><surname>Patel</surname><given-names>CJ</given-names></name>, <name><surname>Kerr</surname><given-names>J</given-names></name>, <name><surname>Thomas</surname><given-names>DC</given-names></name>, <etal/>
<article-title>Opportunities and Challenges for Environmental Exposure
Assessment in Population-Based Studies</article-title>. <source>Cancer
Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev
Oncol</source>.
<year>2017</year>;<volume>26</volume>(<issue>9</issue>):<fpage>1370</fpage>&#x02013;<lpage>1380</lpage>.
doi:<pub-id pub-id-type="doi">10.1158/1055-9965.EPI-17-0459</pub-id></mixed-citation></ref><ref id="R65"><label>65.</label><mixed-citation publication-type="journal"><name><surname>Bobb</surname><given-names>JF</given-names></name>, <name><surname>Valeri</surname><given-names>L</given-names></name>, <name><surname>Claus Henn</surname><given-names>B</given-names></name>, <etal/>
<article-title>Bayesian kernel machine regression for estimating the health
effects of multi-pollutant mixtures</article-title>. <source>Biostat Oxf
Engl</source>.
<year>2015</year>;<volume>16</volume>(<issue>3</issue>):<fpage>493</fpage>&#x02013;<lpage>508</lpage>.
doi:<pub-id pub-id-type="doi">10.1093/biostatistics/kxu058</pub-id></mixed-citation></ref><ref id="R66"><label>66.</label><mixed-citation publication-type="journal"><name><surname>Carrico</surname><given-names>C</given-names></name>, <name><surname>Gennings</surname><given-names>C</given-names></name>, <name><surname>Wheeler</surname><given-names>DC</given-names></name>, <name><surname>Factor-Litvak</surname><given-names>P</given-names></name>. <article-title>Characterization of Weighted Quantile Sum Regression for
Highly Correlated Data in a Risk Analysis Setting</article-title>. <source>J
Agric Biol Environ Stat</source>.
<year>2015</year>;<volume>20</volume>(<issue>1</issue>):<fpage>100</fpage>&#x02013;<lpage>120</lpage>.
doi:<pub-id pub-id-type="doi">10.1007/s13253-014-0180-3</pub-id><pub-id pub-id-type="pmid">30505142</pub-id></mixed-citation></ref><ref id="R67"><label>67.</label><mixed-citation publication-type="journal"><name><surname>Dougherty</surname><given-names>ER</given-names></name>, <name><surname>Brun</surname><given-names>M</given-names></name>. <article-title>On the number of close-to-optimal feature
sets</article-title>. <source>Cancer Inform</source>.
<year>2007</year>;<volume>2</volume>:<fpage>189</fpage>&#x02013;<lpage>196</lpage>. <pub-id pub-id-type="pmid">19458767</pub-id></mixed-citation></ref><ref id="R68"><label>68.</label><mixed-citation publication-type="journal"><name><surname>Judson</surname><given-names>R</given-names></name>.
<article-title>Public databases supporting computational
toxicology</article-title>. <source>J Toxicol Environ Health B Crit
Rev</source>.
<year>2010</year>;<volume>13</volume>(<issue>2&#x02013;4</issue>):<fpage>218</fpage>&#x02013;<lpage>231</lpage>.
doi:<pub-id pub-id-type="doi">10.1080/10937404.2010.483937</pub-id><pub-id pub-id-type="pmid">20574898</pub-id></mixed-citation></ref><ref id="R69"><label>69.</label><mixed-citation publication-type="journal"><name><surname>Pawar</surname><given-names>G</given-names></name>, <name><surname>Madden</surname><given-names>JC</given-names></name>, <name><surname>Ebbrell</surname><given-names>D</given-names></name>, <name><surname>Firman</surname><given-names>JW</given-names></name>, <name><surname>Cronin</surname><given-names>MTD</given-names></name>.
<article-title>In Silico Toxicology Data Resources to Support Read-Across and
(Q)SAR</article-title>. <source>Front Pharmacol</source>.
<year>2019</year>;<volume>10</volume>:<fpage>561</fpage>.
doi:<pub-id pub-id-type="doi">10.3389/fphar.2019.00561</pub-id><pub-id pub-id-type="pmid">31244651</pub-id></mixed-citation></ref><ref id="R70"><label>70.</label><mixed-citation publication-type="journal"><name><surname>Yu</surname><given-names>M</given-names></name>, <name><surname>Dolios</surname><given-names>G</given-names></name>, <name><surname>Yong-Gonzalez</surname><given-names>V</given-names></name>, <etal/>
<article-title>Untargeted metabolomics profiling and hemoglobin normalization
for archived newborn dried blood spots from a refrigerated
biorepository</article-title>. <source>J Pharm Biomed Anal</source>.
<year>2020</year>;<volume>191</volume>:113574. doi:<pub-id pub-id-type="doi">10.1016/j.jpba.2020.113574</pub-id></mixed-citation></ref><ref id="R71"><label>71.</label><mixed-citation publication-type="journal"><name><surname>Bell</surname><given-names>EM</given-names></name>, <name><surname>Yeung</surname><given-names>EH</given-names></name>, <name><surname>Ma</surname><given-names>W</given-names></name>, <etal/>
<article-title>Concentrations of endocrine disrupting chemicals in newborn blood
spots and infant outcomes in the upstate KIDS study</article-title>.
<source>Environ Int</source>. <year>2018</year>;<volume>121</volume>(Pt
<issue>1</issue>):<fpage>232</fpage>&#x02013;<lpage>239</lpage>.
doi:<pub-id pub-id-type="doi">10.1016/j.envint.2018.09.005</pub-id><pub-id pub-id-type="pmid">30219610</pub-id></mixed-citation></ref><ref id="R72"><label>72.</label><mixed-citation publication-type="journal"><name><surname>Yano</surname><given-names>Y</given-names></name>, <name><surname>Schiffman</surname><given-names>C</given-names></name>, <name><surname>Grigoryan</surname><given-names>H</given-names></name>, <etal/>
<article-title>Untargeted adductomics of newborn dried blood spots identifies
modifications to human serum albumin associated with childhood
leukemia</article-title>. <source>Leuk Res</source>.
<year>2020</year>;<volume>88</volume>:106268. doi:<pub-id pub-id-type="doi">10.1016/j.leukres.2019.106268</pub-id></mixed-citation></ref><ref id="R73"><label>73.</label><mixed-citation publication-type="journal"><name><surname>Yeung</surname><given-names>EH</given-names></name>, <name><surname>Bell</surname><given-names>EM</given-names></name>, <name><surname>Sundaram</surname><given-names>R</given-names></name>, <etal/>
<article-title>Examining Endocrine Disruptors Measured in Newborn Dried Blood
Spots and Early Childhood Growth in a Prospective Cohort</article-title>.
<source>Obes Silver Spring Md</source>.
<year>2019</year>;<volume>27</volume>(<issue>1</issue>):<fpage>145</fpage>&#x02013;<lpage>151</lpage>.
doi:<pub-id pub-id-type="doi">10.1002/oby.22332</pub-id></mixed-citation></ref><ref id="R74"><label>74.</label><mixed-citation publication-type="journal"><name><surname>Ernst</surname><given-names>M</given-names></name>, <name><surname>Rogers</surname><given-names>S</given-names></name>, <name><surname>Lausten-Thomsen</surname><given-names>U</given-names></name>, <etal/>
<article-title>Gestational age-dependent development of the neonatal
metabolome</article-title>. <source>Pediatr Res</source>. Published online
September 17, 2020. doi:<pub-id pub-id-type="doi">10.1038/s41390-020-01149-z</pub-id></mixed-citation></ref><ref id="R75"><label>75.</label><mixed-citation publication-type="web"><name><surname>Van Horn</surname><given-names>JD</given-names></name>. <source>Biomedical data science innovation labs: an intensive research
project development program</source>. <date-in-citation>Accessed October 15,
2020</date-in-citation>. <comment><ext-link ext-link-type="uri" xlink:href="https://projectreporter.nih.gov/project_info_description.cfm?aid=10049064&#x00026;icde=52210839">https://projectreporter.nih.gov/project_info_description.cfm?aid=10049064&#x00026;icde=52210839</ext-link></comment></mixed-citation></ref></ref-list></back><floats-group><fig id="F1" orientation="portrait" position="float"><label>Figure 1.</label><caption><p id="P38">Example of the challenge of interchangeable co-exposures.</p><p id="P39">To illustrate the challenge of statistically interchangeable
co-exposures in environmental health, we constructed a data set including
preterm birth rates by county, along with the estimated ambient concentrations
of 175 air toxics from the EPA&#x02019;s 2014 National Air Toxics Assessment. To
show that equivalent co-exposures can be interchangeable in a predictive model,
we ran a 10-fold cross-validated LASSO regression on the data set and recorded
the selected co-exposures. We then removed each selected co-exposure in turn from the
predictor set and re-ran the algorithm to obtain a new model. We tested the
equivalence of the original and new models with a paired t-test of the two models&#x02019;
residuals. If the difference between the residuals was not statistically significant
(at the 0.1 level), the co-exposure sets were considered equivalent. Overall,
the initial signature included 9 variables, and 8 of them were found to be
replaceable.</p><p id="P40">For example, chloroform and ethylene oxide carry equivalent statistical
information about preterm birth: both are predictive of preterm birth and are
interchangeable in a model including 8 additional covariates. The corresponding
residuals of the two models, plotted above, are almost collinear
(Pearson&#x02019;s r: 0.923).</p></caption><graphic xlink:href="nihms-1685472-f0001"/></fig><fig id="F2" orientation="portrait" position="float"><label>Figure 2.</label><caption><p id="P41">Example implementation of the proposed framework to investigate
environmental contributors to preterm birth</p></caption><graphic xlink:href="nihms-1685472-f0002"/></fig><boxed-text id="BX1" position="float" orientation="portrait"><caption><title>Highlights</title></caption><list list-type="bullet" id="L1"><list-item><p id="P42">Rates of preterm birth and low birthweight continue to rise in the
United States.</p></list-item><list-item><p id="P43">Interdisciplinary data science can address challenges of previous
research.</p></list-item><list-item><p id="P44">Integrative analysis of complex environmental data can improve
translation efforts.</p></list-item></list></boxed-text></floats-group></article>