### Model definition

A host of experimental data has been collected which links PR-A, PR-B, and inflammatory drivers in the pregnancy uterus [7, 8, 20–22, 27, 32]. We translated the principles of the PR-A:PR-B hypothesis into equations which could be used to mathematically explore the dynamics and consistency of the biological hypothesis of how the progesterone receptors interact with inflammation during pregnancy. In essence, the PR-A:PR-B hypothesis describes a standard competitive interaction between the pro-pregnancy actions of PR-B and the pro-labor actions of PR-A where the activity of PR-A is related to the level of inflammation in the myometrium. As such, we chose to consider only PR-B and inflammation and incorporated the effects of PR-A into the inflammatory terms of the model (Fig. 1
a).

Two coupled differential equations were used to model the change in transcriptionally active PR-B over time, \(\frac {d\hat {B}}{dt}\), and the change of inflammation over time, \(\frac {d\hat {I}}{dt}\), as a function of the growth and depletion of each parameter and interactions between parameters. The equations we propose to model the transcriptional activity of PR-B and inflammation status are

$$\begin{array}{@{}rcl@{}} \frac{d\hat{B}}{dt} &=& \hat{b}\hat{B}\left(1-\frac{\hat{B}}{B_{c}}\right)-k_{1}\hat{B}\hat{I},\\ \frac{d\hat{I}}{dt} &=& \hat{i}\hat{I}\left(1-\frac{\hat{I}}{I_{c}}\right)-k_{2}\hat{B}\hat{I}. \end{array} $$

(1)

The growth of PR-B and inflammation is modeled by the terms \(\hat {b}\hat {B}(1-\frac {\hat {B}}{B_{c}})\) and \(\hat {i}\hat {I}(1-\frac {\hat {I}}{I_{c}})\) respectively. These terms imply that the levels of PR-B and inflammation increase in the presence of PR-B, \(\hat {B}\), and inflammation, \(\hat {I}\), at rates \(\hat {B}\) and \(\hat {I}\) respectively. The terms \((1-\frac {\hat {B}}{B_{c}})\) and \((1-\frac {\hat {I}}{I_{c}})\), impose a maximum or *critical* value on the level of PR-B and inflammation. At any given time, there is a finite level of PR-B induced activation of the transcriptional machinery. This critical level of PR-B is represented by the parameter *B*
_{
c
}. Analogously, there is some saturable level of inflammatory drivers active in a myometrial cell, represented by the parameter *I*
_{
c
}. The growth terms \(\hat {b}\hat {B}\) and \(\hat {i}\hat {I}\) in the equations for PR-B and inflammation will themselves increase in size as the amount of PR-B and inflammation increase, but in a way that is limited by the critical values for PR-B and inflammation. If \(\hat {B} =B_{c}\) or \(\hat {I} = I_{c}\), the limiting terms in parenthesis equal zero which causes the growth term to equal zero.

The depletion of PR-B and inflammation is modeled by the terms after the negative sign, namely \(k_{1}\hat {B}\hat {I}\) and \(k_{2}\hat {B}\hat {I}\). Qualitatively, this means that the rate of depletion of PR-B is the product of \(\hat {B}\), \(\hat {I}\), and a rate constant *k*
_{1}. The depletion of inflammation follows the same behavior with a different rate constant *k*
_{2}. The value of *k*
_{1} accounts for the relative amount of PR-B to repress inflammation and *k*
_{2} accounts for the relative impact of inflammation on PR-B activity. While we know that the phenomenon of PR-B repression of inflammation occurs, the exact mechanism for this repressive activity is not well understood. By allowing for *k*
_{1} and *k*
_{2} to take on different values relative to one another, we are able to explore multiple possible models for PR-B repression of inflammation.

### Model nondimensionalization and simplification

Nondimentionalization is a tool for simplifying our model whereby the six parameters in our model, \(k_{1}, k_{2}, B_{c}, I_{c}, \hat {i},\) and \(\hat {B}\), are replaced with three dimensionless constants (For full derivation of the dimensionless model see Appendix of derrivations). While this can make it difficult to pinpoint the influence of individual parameters on the system’s behavior, since we only have six parameters in our model, unpacking the influence of particular dimensionless constants is straightforward. The units for \(\hat {I}\), \(\hat {B}\), *B*
_{
c
}, and *I*
_{
c
} are the amount of PR-B or inflammation present, similar to a concentration. Time *t* is given in weeks. The rate constants \(\hat {I}\) and \(\hat {B}\) are in units \(\frac {1}{\text {weeks}}\) while the rate constants *k*
_{1} and *k*
_{2} are in units \(\frac {1}{\text {concentration} * \text {weeks}}\). We define three dimensionless variables for our model, \(B = \frac {\hat {B}}{B_{c}}\), \(I = \frac {\hat {I}}{I_{c}}\), and *τ*=*t*
*k*
_{1}
*I*
_{
c
}. Substituting these for \(\hat {B}\), \(\hat {I}\), and *t* yields the model,

$$\begin{array}{@{}rcl@{}} \frac{dB}{d\tau} = bB(1-B)-BI, \quad \quad \frac{dI}{d\tau} = iI(1-I)-kBI, \end{array} $$

(2)

where \(b = \frac {\hat {b}}{I_{c}k_{1}}, i = \frac {\hat {i}}{I_{c}k_{1}},\) and \(k = \frac {B_{c}k_{2}}{I_{c}k_{1}}.\)

### Determining parameter values from transcriptome data

We obtained data from two published studies which examined transcriptional changes in the myometrium (obtained at the time of cesarean section delivery) of women who were not in labor (NIL: closed and rigid cervix and no indication of uterine contractions) and in labor (IL: cervix dilated >4 cm and rhythmic contractions). One study [28] in which transcriptome analysis was performed by microarray technology (henceforth referred to as the microarray dataset or microarray data) comprised 3 myometrial samples from NIL women at term (>37 weeks gestation), 3 samples from IL women at term, and 3 samples from IL women undergoing preterm (<37 weeks gestation) cesarean section delivery. Some confusion has emerged since publication about whether one of the term IL samples was truly in labor. We excluded this sample for our analysis of this dataset and combined the preterm and term IL samples into one IL group. The other study [29] used RNA sequencing (henceforth referred to as the RNAseq dataset or RNAseq data) of myometrium from 5 NIL women and 5 IL women at term. Both datasets collected independent, non-paired samples, comprising a total cohort of 18 unique samples (10 IL, 8 NIL).

We infer the activity of PR-B and pro-inflammatory drivers with two PR-B responsive genes to serve as surrogates for PR-B, FOXO1A and FKBP5 [20, 30], and three pro-inflammatory genes to serve as surrogates for inflammation, IL- 1*β*, IL-6 and IL-8 [31]. Although there may be more genes associated with PR-B and inflammatory response in the myometrium, a fully curated gene set does not exist for inferring the full set of transcriptome changes induced by PR-B and inflammation. Therefore, we focussed on a targeted set of well studied myometrium specific genes for inferring the activity of PR-B and inflammation in the model. The normalized values of the data for these genes *N*
_{
j
}, are calculated using the equation, \(N_{j} = \frac {G_{j}- m}{M-m},\) where *G*
_{
j
} is the value of the gene for patient *j*, *M* is the maximum expression value for that gene across patients in the dataset, and *m* is the minimum value of that gene across patients in the dataset. One consequence of this normalization procedure is that the transcriptome data has been nondimensionalized enabling integration of the dimensionless trascriptome data with the dimensionless constants. In this formulation the values of the surrogate genes parameterize the dimensionless model for each patient.

The normalization equation bounds the values for the PR-B and inflammation surrogates from 0 to 1 and makes the natural choice of values for critical levels of PR-B and inflammation *B*
_{
c
}=*I*
_{
c
}=1. We then assign the values of *b* and *i* using the normalized dimensionless values for the PR-B and inflammatory surrogate genes respectively and the value of *k* corresponds to the strength of PR-B’s anti-inflammatory actions. The values of *b* and *i* are determined by the normalized intensity of one of the PR-B and one of the inflammatory surrogate genes from the mRNA expression studies [28, 29] and there are six possible combinations of values for *b* and *i* depending upon which inflammatory and PR-B surrogate genes are chosen. Thus, for a particular patient, the values of *b* and *i* are determined by the normalized expression intensity from the RNA-seq or microarray study for one pair PR-B and inflammatory surrogate genes.

### Calculating the probability of labor for each patient

Next, we quantify the behavior of *k* in order to apply our model to patient data. To do this, we have to derive the steady state solutions for our model. These solutions are the values of *B* and *I* which cause \(\frac {dB}{d\tau }=\frac {dI}{d\tau }=0\) and correspond to a state where the system undergoes no change. There are four steady states, also known as *equilibrium points*. These occur when the ordered pair for PR-B and inflammation, (*B*
^{∗},*I*
^{∗}), is equal to, (0, 0), (1, 0), (0, 1), and \(\left (\frac {i(1-b)}{k-ib},\frac {b(k-i)}{k-ib}\right)=\left (\frac {i^{2}b(1-k)-ik(1-b)}{k(bi-k)}, \frac {kb(i-1)}{bi-k}\right)\). Of these, (0,0) is the trivial equilibrium point where neither PR-B nor inflammation is present, (1,0) is the *quiescent equilibrium* where PR-B is maximal and there is no inflammation, and (0,1) is the *laboring equilibrium* where there is no PR-B and inflammation is maximal. The quiescent equilibrium correspond to a PR-B dominant state and laboring equilibrium corresponds to an inflammatory dominant state. The fourth equilibrium point, the *intermediate equilibrium*, exists only for certain values of *b* and *i* between the quiescent and laboring equilibrium (For full derivation of equilibrium points see Appendix of derrivations).

Since the values of *b* and *i* are determined by PR-B and inflammatory surrogate genes scaled from 0 to 1, these terms are bounded to that interval. Furthermore, since *B* and *I* are bounded by *B*
_{
c
}=1 and *I*
_{
c
}=1, *B*
^{∗} and *I*
^{∗} are bounded to the square domain with vertices (0,0), (0,1), (1,1), and (1,0) and area 1. So, the intermediate equilibrium point, in both forms, should satisfy the constraints 0≤*B*
^{∗}≤1 and 0≤*I*
^{∗}≤1. By considering how this constraint impacts both forms of the intermediate equilibrium we can derive a set of constraints for the values of *k*. In order to allow for the full range of values of *i*, we find that *i* and *k* satisfy 0≤*i*<*k*≤1.

In the limit in the case where *i*=*k*, the intermediate equilibrium point equals the quiescent equilibrium point (1,0). If we visualize this state in *phase space* (Fig. 1
b) we see that all the vectors point away from the quiescent equilibrium point toward the laboring equilibrium. In phase space, these vectors define *trajectories* that indicate how the system would evolve in time given a certain starting point. The set of vectors pointing toward the laboring equilibrium point is known as the *basin of attraction* for the laboring equilibrium point.

We compute a *probability of labor* equal to the area of the laboring equilibrium point’s basin of attraction divided by the area of the domain, which in our case is 1. The area of the basin changes as *k*, *i*, and *b* change. This probabilistic interpretation of a phase space is reasonable under the assumptions that all possible pairs of values of *B* and *I* in the domain occur with equal likelihood. While it is clear that in a physiological context there are values of *B* and *I* that are more likely at different time points in pregnancy, since this information is not readily available our assumption of uniformity enables us to compute the probabilities in the most agnostic way possible. Given more information about the distribution of *B* and *I* over the course of pregnancy, it could be possible to get a more precise estimate of the probability of labor, but such an extension is not possible at this t.

For example, in the case where *k*=*i* and *b* is fixed at 0.5, the probability of labor is equal to 1, the entire domain is the basin of attraction for the laboring equilibrium, which means that quiescence is impossible (Fig. 1
c). In order to ensure that quiescence is a possibility, we set *k*=1 so that only one value of *i*, *i*=1, results in a probability of labor equal to 1, enabling us to explore the full range of values for *b* and *i*. With *k* fixed, we can plot the dependence of the probability of labor upon the values of *b* and *i* in a surface upon which all patients must fall given a value for PR-B and inflammatory activity. Thus, the model we apply to patient data is

$$\begin{array}{@{}rcl@{}} \frac{dB}{d\tau} = bB(1-B)-BI, \quad \quad \frac{dI}{d\tau} = iI(1-I)-BI. \end{array} $$

(3)

Each patient in the microarray and RNA-seq datasets has an expression value for each of the surrogate genes, FOXO1A, FKBP5, IL- 1*β*, IL-6 and IL-8. In the absence of proteomic data precisely quantifying the protein level activity of these genes in vivo the mRNA expression levels can be combined with the framework of our mathematical model to approximate the functional activity of these genes at the time of labor, i.e. using FOXO1A or FKBP5 for PR-B and IL- 1*β*, IL-6, or IL-8 for inflammation. We calculated a probability of labor for each patient in each dataset using all six possible combinations of surrogate genes (FOXO1A, IL- 1*β*), (FKBP5, IL- 1*β*), (FOXO1A, IL-6), (FKBP5, IL-6), (FOXO1A, IL-8), and (FKBP5, IL-8) where each surrogate was used to set the parameters *b* and *i* in our model. We will hereafter refer to a pair of surrogate genes as a *predictor*. A probability of labor was computed for each patient which corresponds to the size of the basin of attraction for the laboring equilibrium point given a predictor pair of surrogate genes for *b* and *i*.

### Classifier construction and assessment

After normalization within platform cohort, half of the IL and NIL samples from the total cohort of 18 were randomly selected as a training dataset to construct a classifier from five IL and four NIL samples. Probabilities of labor were calculated for each of the samples using a pair of predictor genes, one PR-B responsive and one inflammatory responsive, to value the parameters *b* and *i* in the model. Two nonparametric 95 % confidence intervals were computed for the the probabilities of labor for the NIL and IL samples. These intervals about the medians of the NIL and IL samples constituted the NIL and IL classifiers. If the intervals did not separate, then we discarded that classifier.

We assessed performance of successful classifiers by computing the probabilities of labor for the remaining nine samples using the same predictor genes. The nine samples in this test set of data were classified as IL, NIL, or a no-call depending on whether a sample’s probability fell into the training set confidence interval for IL, NIL, or somewhere in between. Precision and recall metrics were used to assess the classifier defined as

$$\begin{array}{@{}rcl@{}} \text{precision} &=& \frac{\text{correctly classified samples}}{\text{total classified samples}}\\ \text{recall} &=& \frac{\text{classified samples}}{\text{total samples}}. \end{array} $$

(4)

This procedure of creating classifiers with training samples and predicting phenotypes for test samples was repeated for all 17,640 possible combinations of samples from the microarray and RNA-seq datasets. For each combination, classifier performance was assessed for probabilities of labor calculated from each of all six combinations of PR-B (FOXO1A, FKBP5) and inflammatory (IL- 1*β*, IL-6, and IL-8) responsive genes. Precision and recall metrics for each classifier were aggregated into an average F-score and the proportion of successful classifiers out of 17,640 possible classifiers were computed. These metrics assessed how the selection of samples for the training set influenced i) classifier performance sensitivity and ii) classifier construction sensitivity. The equations for F-score and classifier success rate (CSR) are

$$\begin{array}{@{}rcl@{}} \text{F-score} &=& \frac{2*\text{precision}*\text{recall}}{\text{precision}+\text{recall}}\\ \text{CSR} &=& \frac{\text{constructed classifiers}}{17,640}. \end{array} $$

(5)

These 105,840 model classifiers (17,640 training set combinations × 6 combinations of model predictor genes) were compared to two types of null classifiers. We constructed single and two-gene null classifiers for each of the 17,640 combinations of samples using the normalized expression values of the raw datasets. The single gene null classifiers were built by constructing 95 % nonparametric confidence intervals for the IL and NIL training samples on the normalized expression values for the individual genes. The two-gene null classifiers were constructed by defining a two dimensional confidence region for the two predictor genes for the NIL and IL training samples. Precision, recall, F-score, and CSR metrics were calculated for all 264,600 null classifiers (17,640 training set combinations x 15 one or two-gene null classifiers) and compared to the model classifiers to assess i) the relative performance of the single and two-gene null classifiers and ii) the performance of our model’s two-gene classifier to the null two-gene classifiers. A full workflow of this approach can be found in Fig. 2.