^{1}

^{2}

^{1}

^{2}

^{*}

Conceived and designed the experiments: KT MJ IB. Performed the experiments: KT. Analyzed the data: KT IB. Contributed reagents/materials/analysis tools: IB. Wrote the paper: KT MJ IB.

The mechanisms by which human embryonic stem cells (hESC) differentiate to endodermal lineage have not been extensively studied. Mathematical models can aid in the identification of mechanistic information. In this work we use a population-based modeling approach to understand the mechanism of endoderm induction in hESC, performed experimentally with exposure to Activin A and Activin A supplemented with growth factors (basic fibroblast growth factor (FGF2) and bone morphogenetic protein 4 (BMP4)). The differentiating cell population is analyzed daily for cellular growth, cell death, and expression of the endoderm proteins Sox17 and CXCR4. The stochastic model starts with a population of undifferentiated cells, wherefrom it evolves in time by assigning each cell a propensity to proliferate, die and differentiate using certain user defined rules. Twelve alternate mechanisms which might describe the observed dynamics were simulated, and an ensemble parameter estimation was performed on each mechanism. A comparison of the quality of agreement of experimental data with simulations for several competing mechanisms led to the identification of one which adequately describes the observed dynamics under both induction conditions. The results indicate that hESC commitment to endoderm occurs through an intermediate mesendoderm germ layer which further differentiates into mesoderm and endoderm, and that during induction proliferation of the endoderm germ layer is promoted. Furthermore, our model suggests that CXCR4 is expressed in mesendoderm and endoderm, but is not expressed in mesoderm. Comparison between the two induction conditions indicates that supplementing FGF2 and BMP4 to Activin A enhances the kinetics of differentiation than Activin A alone. This mechanistic information can aid in the derivation of functional, mature cells from their progenitors. While applied to initial endoderm commitment of hESC, the model is general enough to be applicable either to a system of adult stem cells or later stages of ESC differentiation.

Embryonic stem cells (ESC) have gained much attention in recent years for their ability to differentiate into cells of any of the three primary germ layers (endoderm, mesoderm, and ectoderm) as well as to remain in a pluripotent state under appropriate conditions

The current study aims to gain a better understanding of endoderm differentiation of human embryonic stem cells (hESC) through the development of a population based model. We base our model on earlier reports

Human embryonic stem cells (H1 cell line) were cultured under feeder-free condition. 6-well culture dishes were incubated with Matrigel™ coating (hESC-qualified Matrix, BD Biosciences, San Jose, CA, USA) for 30 minutes. hESC colonies (p93) were plated onto the Matrigel layer with 1 mL mTeSR®1 hESC media and supplement (Stem Cell Technologies, Vancouver, BC, Canada). The cells were incubated at 37°C in 5% CO_{2}, and the mTeSR®1 media was replaced daily. For the differentiation study, the current work chose to compare two conditions to induce endoderm: human Activin A (henceforth referred to as ‘Condition A’) and human Activin A supplemented with the growth factors FGF2 and BMP4 (‘Condition B’). To commence endoderm induction, DMEM/F12 (Invitrogen, Carlsbad, CA, USA), 1×B27® Supplement (Invitrogen), and 0.2% Bovine Serum Albumin (BSA, Sigma-Aldrich,St. Louis, MO, USA) supplemented with 100 ng/mL human Activin A (Condition A; R&D Systems, Minneapolis, MN, USA) or 100 ng/mL human Activin A, 100 ng/mL FGF2 (Invitrogen), and 100 ng/mL BMP4 (Condition B; R&D Systems) was used as differentiation media, which was replaced daily for a total of five days. Upon induction of differentiation cells were harvested on a daily basis for subsequent analysis. For each well, the supernatant was collected, the plated cells were dissociated with Trypsin+EDTA (Invitrogen), Trypan Blue (Sigma-Aldrich) was added to distinguish live from dead , and the cells were counted using a standard hemacytometer,

For flow cytometry, harvested cells were first fixed for 15 minutes in 4% methanol-free formaldehyde (Thermo Scientific, Rockford, IL, USA) in phosphate-buffered saline (PBS). Cells were washed twice and permeabilized in 0.1% Saponin (Sigma-Aldrich)+0.5% BSA in PBS for 30 minutes. To block non-specific binding, the cells were incubated in 3% BSA+0.25% dimethyl sulfoxide (DMSO, Fisher Scientific, Hampton, NH, USA)+0.1% Saponin in PBS for 30 minutes. A portion of cells were then set aside as the negative control (secondary antibody only without primary). The cells to be used as the positive samples were then incubated in blocking buffer with goat anti-human sox17 (R&D Systems) and rabbit anti-human cxcr4 (Abcam, Cambridge, MA, USA) primary antibodies, 1∶200 dilution, for 30 minutes. The cells were washed twice with blocking buffer, resuspended in the buffer, and incubated with donkey anti-goat APC (1∶350 dilution; Santa Cruz Biotechnology, Santa Cruz, CA, USA) and donkey anti-rabbit FITC (1∶200 dilution; Santa Cruz Biotechnology) for 30 minutes (both the samples and negative control). Two washings were followed by 10 minutes of 0.2% tween-20 (Bio-Rad, Hercules, CA, USA) to further eliminate non-specific staining. Cells were washed and transferred to flow cytometry tubes. Accuri C6 © Flow Cytometer was used to quantify sox17-APC and cxcr4-FITC expression. Cells stained with the secondary antibody only (without primary antibody) were first analyzed; this population was taken as the negative, and the gate was set beyond these cells to eliminate false positives due to auto-fluorescence and non-specific secondary antibody binding. The completely stained samples (primary and secondary antibody stained) were then analyzed, and the percentage of the population falling within the set gate was recorded as the positive sample for the respective antibody.

To quantify mRNA levels, harvested cells were lysed and mRNA was extracted and purified using a Nucleospin II RNA kit (Macherey-Nagel, Bethlehem, PA, USA). The RNA quantity and quality was measured using a SmartSpec™ Plus spectrophotometer (Bio-Rad), after which reverse transcription was performed with the ImProm II Reverse Transcriptase System (Promega, Madison, WI, USA). cDNA levels of Gapdh (left primer:

The system of stem cell differentiation to endoderm is modeled using a stochastic population-based model. The basic formulation of the model is based on earlier reports for hematopoietic stem cells

The model is initiated by a population of cells, the properties of which are evolved by specific pre-assigned rules. The cells are primarily categorized into two signaling regimes: Ω and A; Ω can be considered as an active regime supporting cellular proliferation and differentiation, while A is a more dormant regime where cells are quiescent and prone to dedifferentiation. The cells can transfer in between these two regimes, an event decided upon primarily by a cell-specific parameter, termed as ‘a’ value. This parameter ‘a’ is randomly assigned to each cell at the beginning of simulation, and is updated at each time step. The probability of transfer to/from a regime is dependent on this parameter along with the number of cells in the destination regime. This ‘a’ value is unaltered in the A regime while it progressively reduces in the Ω regime, and when it falls below a threshold the cells lose their ability to transfer to the A regime.

Each cell is randomly assigned a maximum life span, exceeding which it will die. While the cells age in the Ω regime, they neither proliferate nor age in the A regime. Proliferation is allowed in the Ω regime for an amount of time which is cell dependent, after which the cell enters a senescent stage and will not proliferate. Furthermore, an individual cell is allowed to proliferate only after it loses the capability to pass into A regime having crossed the ‘a’ threshold value.

Cellular differentiation is governed by the ‘lineage propensity’ parameter, representing a cell's likelihood to differentiate into a particular lineage. Only the Ω regime allows an increase in lineage propensity. While updating the propensity of differentiation to a particular lineage, all the possible lineages are competing and any can be updated, with the one with higher propensity having a higher probability of being selected. Once a cell's propensity exceeds a threshold level, the cell is considered committed to that particular lineage and will retain its differentiated phenotype. If this threshold has not been exceeded and if the cell is chosen to be transferred to the A regime, the propensity values will converge to an average value. The model is therefore able to track specific germ layer populations, and through this the percent of the population positive for Sox17 (visceral and definitive endoderm marker) and CXCR4 (definitive endoderm and mesendoderm marker)

The current work is focused on the mechanistic investigation of the dynamics of hESC induction into endoderm. Using the platform of the stochastic population based model we investigated several alternate mechanisms and analyzed them for agreement with experimental data. Three characteristics of the differentiation process were chosen to be investigated: the presence/absence of an intermediate germ layer, mesendoderm, which subsequently gives rise to mesoderm and endoderm

The incorporation of mesendoderm involved a two stage differentiation scheme. In the first stage, hESC are able to differentiate into either mesendoderm or visceral endoderm. Once cells are committed to the mesendoderm lineage, several of their attributes are re-initialized, such as their ‘a’ value and lineage propensities. The mesendodermal cells can then further differentiate into endoderm or mesoderm. Differences in the proliferation potential of different phenotypes were incorporated by considering 3 scenarios: proliferation of hESC and mesendoderm; proliferation of hESC, mesendoderm and endoderm; and proliferation of all phenotypes.

As with any stochastic model, the number of model runs necessary to obtain a converged solution needs to be determined. An additional parameter of the current model is the initial cell population, which affects the solution over a certain range. A two-dimensional convergence study was undertaken, wherein the effects of both stochastic run number and initial cell population on model output were determined. The convergence test allows determination of the minimum number of stochastic runs and initial cell population beyond which the model output does not significantly change. All results reported here are using the converged parameter values.

Sensitivity analysis was performed to determine the relative importance of parameters in affecting the outputs of cellular growth, death, and lineage commitment. Because this model is probabilistic in nature, traditional ways of determining local sensitivity, e.g. partial derivative of output with respect to an input, cannot be employed. A stochastic analysis was therefore chosen, using differences in output histograms of nominal and perturbed parameters, S, to estimate parameter sensitivity _{j}_{j}_{i}_{j/}y_{j}_{i}

For sensitivity analysis, each parameter was perturbed by 10% while keeping the rest at the nominal value. For each bin in the nominal output histogram, the difference between the percentage of total nominal histogram elements residing in that bin and the percentage of the total perturbed histogram elements residing in that bin is calculated. The sum of the absolute value of this difference over all of the bins is the histogram distance.

Having determined the most sensitive model parameters, the next step is to determine an appropriate value of the parameters which best estimates the experimental data. We realize that a single parameter combination may not be adequate in describing the experimental data; instead, there exists a parameter hyper-space adequately satisfying the data. Hence an ensemble parameter estimation was performed by randomly generating initial guesses from the hyper-space of the sensitive parameters. The model was simulated with 10000 random parameter samples; for each of these simulations the least square estimate is determined between experimental data and model output. These simulations were run for each mechanism and condition under investigation, and parameter ensembles were generated by considering only those parameter sets which meet certain error constraints. Following the above detailed methodology we evaluated the model predictions obtained from the different mechanisms and compared them with experimental data to determine the most plausible mechanism.

The hESC culture was analyzed for cellular growth and death dynamics during endoderm induction by both Activin A (Condition A) and Activin A/FGF2/BMP4 (Condition B) conditions as illustrated in

Cellular growth (A) and death (B) dynamics for Conditions A and B. Temporal behavior of cellular population positive for Sox17 (C) and CXCR4 (D). Inset: Representative output of flow cytometry data. Red histogram: secondary antibody only control. Black histogram: sample. Red gate denotes sample taken to be positive.

The differentiated cell population was analyzed by flow cytometry for percent of cells positive for Sox17 and CXCR4 for each day of differentiation.

In the next step the developed stochastic model was used to test the proposed mechanisms for agreement with experimental data. The mathematical model involves multiple parameters which require detailed analysis before the model can be used for prediction. The parameters can be grouped into two categories: (a) simulation parameters which affect the convergence behavior of the simulation and (b) model parameters which affect specific model output for the converged simulation. Two parameters were identified to be simulation parameters: initial cell population and number of stochastic runs. These parameter values were optimized by performing a two-dimensional convergence study, as illustrated in

Output is percent of the simulated population positive for CXCR4, averaged over all stochastic runs at Day 5.

A detailed sensitivity analysis was performed for the model parameters in order to determine the relative importance of the parameters in determining model output. As detailed in Equation 1, the measure of histogram distance is used to represent the parameter sensitivity associated with a specific model output.

(A) Cellular growth sensitivity to each of the parameters, perturbed by 10%. Parameter definitions listed in

While

a_{min}: ‘a’ value threshold beyond which a cell enters the proliferation phase

a_{0max}: The initial cell population is randomly assigned an ‘a’ value, with an upper limit of a_{0max}

x_{com}: lineage propensity threshold beyond which a cell is committed to a particular lineage

d: factor with which ‘a’ decreases

t_{g1}: time cell stays in G1 phase of cell cycle. Only in this phase can a cell differentiate and transfer to the A regime from the Ω regime

l_{max}: upper value of range of cell population's life span

nprog_{i}: factor in determining magnitude of propensity updates for each lineage i

aa: parameter in determining the probability of a cell transferring from the Ω regime to the A regime

Having determined the sensitive parameters for each of the mechanisms, the next step is to determine the optimum parameter values which will result in best agreement with experimental data. In literature, biological samples are typically defined as being ‘sloppy’

(A) Parameter values for Mechanism B ensemble yielding errors of less than 0.025. Each parameter is compared to the most sensitive parameter, ‘d’. Color bar denotes the ensemble error for that particular parameter value. (B) Minimum ensemble error generated for each mechanistic model. Proliferation induced: A, all phenotypes; E&U, endoderm and uncommitted (hESC and mesendoderm); U, uncommitted only. Blue, Condition A; Red, Condition B.

As shown in

Grey band denotes the ensemble of simulations having an error less than the threshold, with the single solid black curve showing the best fit. Black circles represent the experimental data points. (A–D): Growth kinetics, cell death, fraction of population positive for Sox17 and CXCR4 dynamics, respectively, of Mechanism A; error threshold of 0.05. (E–F) Growth kinetics, cell death, fraction of population positive for Sox17 and CXCR4 dynamics, respectively, of Mechanism B; error threshold of 0.025. Cellular growth and death normalized to Day 4.

Condition B (Activin A supplemented with FGF2 and BMP4) proved more difficult to describe via the investigated mechanisms, mainly because CXCR4 dynamics exhibits a faster and more prominent drop as compared to Activin A only condition. As shown in

Grey band denotes the ensemble of simulations having an error less than the threshold, with the single solid black curve showing the best fit. Black circles represent the experimental data points. (A–D): Growth kinetics, cell death, fraction of population positive for Sox17 and CXCR4 dynamics, respectively, of Mechanism C; error threshold of 0.1. (E–F) Growth kinetics, cell death, fraction of population positive for Sox17 and CXCR4 dynamics, respectively, of Mechanism B; error threshold of 0.025. Cellular growth and death normalized to Day 4.

It is therefore reasonable to conclude that during endoderm induction with the conditions described above, undifferentiated stem cells first differentiate into a mesendoderm germ layer with subsequent differentiation to endoderm and mesoderm, the latter not expressing CXCR4. Furthermore, the induction condition seems to promote proliferation of both pluripotent and endoderm-like cells. The optimized parameters of this mechanism are shown in

The power of mathematical models lies in their predictive capacity. The predictive capacity of our proposed model was thus tested by simulating the population dynamics of cell types for which no

Simulated dynamics of the undifferentiated (A (Condition A), B (Condition B)) and mesendoderm (D (Condition A), E (Condition B)) phenotypes were compared to experimental data of their respective genes, measured by qPCR (markers: experimental measurements; lines: linear connections between data points): Oct4 (Undifferentiated; C) and Brachyury (Mesendoderm; F). The simulated dynamics bands represent 4000 stochastic simulations using the optimized parameters of Mechanism B. mRNA levels were measured with time using qPCR. Data was first normalized to the housekeeping gene Gapdh then to undifferentiated cells. Fold change levels, determined by the 2^{−ΔΔCt} method, were then normalized to the maximum level for each respective gene (data reported as percent of maximum fold change).

In the next step the validity of such prediction was verified by conducting further experiments to analyze the dynamics of undifferentiated cells by Oct4 gene expression and that of mesendoderm cells by Brachyury expression. While the comparison of population dynamics with mRNA levels is not exact, under the assumption of efficient translation they become comparable.

The objective of the current work is to investigate the mechanism of differentiation of hESC during endoderm induction through an integrated experimental and mathematical approach. We experimentally determine the dynamics of differentiation upon endoderm induction of hESC and use these data along with a population-based stochastic model to determine the mechanisms of differentiation. The model can track growth kinetics, the dynamics of cell death, and the dynamics of differentiation into the germ layers. Thorough comparison of these simulated outputs with experimental data enables determination of the dominant mechanism of differentiation. Furthermore, the model and predicted mechanism is validated against additional experimental observations of the temporal behavior of specific cellular populations. Even though these data were not used to build the model, the model performed extremely well in capturing their dynamics.

Definitive endoderm was induced in hESC through two different pathways: the addition of Activin A and Activin A supplemented with FGF2 and BMP4. The population-based model, revised from the model originally developed for hematopoietic stem cells

It is important to note that the nonlinearity observed in the differentiation dynamics contributed significantly towards identification of a robust mechanism. Sloppiness of biological parameters is well reported

Shown is the presence of mesendoderm, the lack of CXCR4 in mesoderm, and selective phenotype proliferation. ME: mesendoderm, VE: visceral endoderm.

The endoderm induction of hESC was conducted under two different conditions with the objective of investigating mechanistic differences between these two pathways. Quite interestingly, both conditions could be explained by the same, single mechanism, while the rejected mechanisms failed to describe the dynamics even after a thorough search of the parameter space. However, there were significant differences in optimum parameter values. One prominent difference between the two conditions was their differentiation potential after being committed to the mesendoderm germ layer. ‘a0max2’ is lower for Condition B, indicating that mesendodermal cells will more quickly reach the pro-differentiation and –proliferation regimes. This is also evident from the higher level of ‘d’, although this is for both stages of differentiation. Also, cell commitment for Condition B can be considered expedited when considering the lower value of ‘xcom2’, which is the propensity threshold beyond which a mesendodermal cell is considered committed to either endoderm or mesoderm. Therefore, Activin A supplemented with FGF2 and BMP4 drives differentiation towards endoderm/mesoderm to a higher degree than Activin A alone.

As detailed earlier, the developed model for the optimum mechanism could accurately capture the experimentally observed dynamics of differentiation. Quite interestingly, only a single mechanism could adequately describe the experimental data, while the rejected mechanisms failed to describe the dynamics even after a thorough search of the parameter space.

One of the purposes of the present study was to investigate several aspects of differentiation which have faced conflicting reports in the past and to offer further insight using a mathematical analysis. One of these features is the presence of surface receptor, CXCR4. McGrath

(TIF)

Click here for additional data file.

(TIF)

Click here for additional data file.

(DOC)

Click here for additional data file.

(DOC)

Click here for additional data file.

We would like to thank Dr. Ira Fox from the University of Pittsburgh for his generous gift of H1 hESC.