Modeling miRNA-mRNA interactions: fitting chemical kinetics equations to microarray data

Background The miRNAs are small non-coding RNAs of roughly 22 nucleotides in length, which can bind with and inhibit protein coding mRNAs through complementary base pairing. By degrading mRNAs and repressing proteins, miRNAs regulate the cell signaling and cell functions. This paper focuses on innovative mathematical techniques to model gene interactions by algorithmic analysis of microarray data. Our goal was to elucidate which mRNAs were actually degraded or had their translation inhibited by miRNAs belonging to a very large pool of potential miRNAs. Results We proposed two chemical kinetics equations (CKEs) to model the interactions between miRNAs, mRNAs and the associated proteins. In order to reduce computational cost, we used a non linear profile clustering method named minimal net clustering and efficiently condensed the large set of expression profiles observed in our microarray data sets. We determined unknown parameters of the CKE models by minimizing the discrepancy between model prediction and data, using our own fast non linear optimization algorithm. We then retained only the CKE models for which the optimized fit to microarray data is of high quality and validated multiple miRNA-mRNA pairs. Conclusion The implementation of CKE modeling and minimal net clustering reduces drastically the potential set of miRNA-mRNA pairs, with a high gain for further experimental validations. The minimal net clustering also provides good miRNA candidates that have similar regulatory roles.


Background
Transcriptional and translational processes are fundamental cell mechanisms, involving three main molecular species: messenger RNA (mRNA) and their associated proteins, as well as microRNAs (miRNAs).
The miRNAs are small non-coding RNAs of roughly 22 nucleotides in length, which can bind with and inhibit protein coding mRNAs through complementary base pairing. A given miRNA can potentially bind and silence hundreds of mRNAs across a number of signaling pathways. These repressive miRNA-mRNA interactions occur in multiple cellular processes [1][2][3], and involve two distinct modalities: they may directly degrade their target mRNAs, or more often inhibit their translation [4][5][6][7][8][9].
The best characterized features determining the targets of a specific miRNA are the conserved Watson-Crick pairing to the 5' region (positions 2-7) of the miRNA, which are the so-called "seed pairing rules" [3,[10][11][12][13]. Since seed pairing rules are neither sufficient nor necessary for miRNA-target functions [4,14], they have usually been combined with microarray or proteomic analysis to find potential miRNA-target pairs [15][16][17]. Classical microarray data analysis relies mostly on massive application of correlation analysis and linear statistical techniques to simultaneously acquired gene expression profiles. Combined with profile thresholding and heat map displays, these techniques provide commonly used clues for qualitative inference.
In [18], Principal Component Analysis and linear correlation had been linked with comparative sequence analysis to study microarray data recorded during mouse stem cells differentiation, and to broadly predict potential miRNA-mRNA interactions.
Traditional chemical kinetics equations had been proposed to model the transcriptional and translational processes without involving the interaction of miRNAs [19,20].
We have derived Chemical Kinetics Equations (CKEs) to model the dynamics of TD and TI motifs, in the spirit of [20][21][22][23][24][25][26][27][28][29][30]. The equations are algebraically invariant under affine transformation and allow data condensation to reduce computational cost. We have implemented "minimal net" clustering method, which can control the maximum diameter of the clusters, to condense large data sets of gene expression.
Modeling by nonlinear CKEs involves complicated parameter estimation problems to fit the very large set of expression profiles recorded by microarrays. We have developed innovative fast algorithms dedicated to CKE parameter estimation, by optimization of the quality of fit between model and data.
We validate only the parameterized motifs having a high quality of fit to data. To reach robust conclusions we apply a "parameter parsimony" principle, favoring the models having the smallest number of parameters. And we have also evaluated the robustness of our parameter estimation algorithm by algorithm by direct testing on simulated data. These tests did validate our parameter estimation method is quite robust. This nonlinear approach goes further than well-established analysis based on correlation techniques combined with heat map displays.

Basic interaction architectures
To validate if an miRNA M does indeed repress a given gene G, we model the chemical interactions of M and G within a small network containing the pair (M, G). We now sketch two basic interaction architectures and their CKE models.

Transcription Degradation Motifs (TD-motifs)
We call "TD-motif" any interaction architecture, as sketched in Figure 1, involving a single miRNA-mRNA pair (M,G) where G is in Target(M), and M degrades the transcription of G. The TD motif includes also two sets of proteins rep(G) and act(G), namely the transcriptional "repressors" and "activators" of G, denoted by rep(G) = {R 1 , . . . R k } and act(G) = {A 1 , A 2 , . . . , A q } Let g(t), p(t), m(t), r i (t), a j (t), be the expression levels at time t for the chemical species G, P, M, R i , A j . We model the transcription process by a CKE similar to CKEs proposed in [20,22,25,29,30], but with a complementary term encoding the repressive impact of miRNA M on its target mRNA G: where β > 0 is the degradation rate of G, v > 0 is the reaction rate between G and M, κ > 0 is the product of the transcription rate by the concentration c of DNA templates, concentration which we assume to be some constant not depending on time (see [20]).
is the fraction of existing DNA templates which are committed at time t to transcription of the mRNA gene G. Here the percentages REP(t) and ACT(t) are modeled by the following products, where The parameters SR i > 0, u i > 0 and SA j > 0, w j > 0 are the number of binding sites and the affinity constant for the transcriptional factors R i and A j .
Note that the transcription repressors R i combine multiplicatively their individual impacts REP i in REP(t), and that the REP i are analogous to Hill function (see [19,25]); similar remarks apply to the transcription activators. The multiplicative expressions of REP(t) and ACT(t) are typical of a so-called "cis-regulatory" function and have been derived by J. Goutsias [20].
The term κREP(t) ] dt is the concentration of new G molecules synthesized by transcription during the small time interval [t, t + dt], while the repressive interactions of M and G eliminates vg(t)m(t)dt molecules of G, and natural decay destructs βg(t)dt molecules of G.

Translation-Inhibition Motifs (TI-motifs)
We call "TI-motif" any interaction architecture, as sketched in Figure 1, involving a set M 1 , . . . , M r of miR-NAs inhibiting the translation of mRNA gene G, by repressing the expression of the protein P generated by G. Let (p(t), m i (t), g(t)) be the concentrations at time t of protein P, miRNA M i , and mRNA G. In the spirit of [20,21], we model the translation inhibition dynamics by the CKE dp dt where γ > 0 and λ > 0 are the degradation rate and the translation rate for protein P and where H(t) is the percentage of G molecules committed at time t to the translation of G. Thus H(t) encodes the inhibiting impact of the miRNAs M 1 , . . . , M k on the translation of G, and is modeled as a product of terms similar to REP(t): The parameters SM i > 0 and u i > 0 are resp. the number of binding sites and the affinity constant controlling the inhibiting impact of miRNA M i on the translation of gene G. Note that H(t) decreases when the miRNA concentrations m i (t) increase.
Our model for TI motif has been inspired by J. Goutsias [20], and we present below the hypotheses and arguments justifying the expression of H(t). There is a key difference between the CKEs modeling TD and TI motifs. For TI motifs, the concentration of G-molecules committed to translation is g(t)H(t), where g(t) is the concentration of G-molecules and H(t) < 100%.
For TD motifs, the concentration of G-molecules synthesized by transcription is κF(t) = αcF(t) with F(t) < 100%, where α and c are respectively the transcription rate and the concentration of DNA templates, assumed to be constant in time.

Derivation of chemical kinetics equations
Our derivation of the regulation equation for TI motif is quite similar to presentation given for TD motifs in [20], but has several changes in assumptions and formulations. To derive the CKEs (1) and (4), we propose a few hypotheses.
• Hyp. 1: (TD-motifs) The molecules of the miRNA repressor of gene G can strongly bind only at one unique specific site of G -molecules, and once a single such strong bind occurs, the corresponding G molecule degrades extremely fast. • Hyp. 2: (TI-motifs) Each miRNA M j in the set rep (G) of translation inhibitors of G can weakly bind with G but only at specific binding sites constituting a set BIND j of size S j . The sets BIND 1 , BIND 2 , . . . are pairwise disjoint. Once a G molecule thus binds with one inhibitor M j , then this G molecule will fail to translate. • Hyp. 3: For any given G -molecule, call X j the random number of sites in BIND j which actually bind with one "M j "-molecule. We assume that the random variables X 1 , X 2 , . . . . are independent.
Hyp.1 is based on the fact that only a small fraction of all messenger RNAs have more than a single miRNA binding site and miRNA bound with an mRNA gene G has a limited effect on the mRNA G, but affects more substantially the protein P generated by G ( [4,31]. Consider first any TI motif involving an mRNA gene G generating the protein P , as well as a set rep(G) = {M 1 , M 2 , . . . M k } of k translation inhibitors. We now derive the CKE (4) for this TI motif. Call p(t), g(t), m j (t) the concentrations of P, G, M j and let H(t) be at time t the percentage of existing G molecules committed to the translation of G into P. The basic CKE driving translation of G into P is then, as seen in [20], where γ , λ, are the degradation and translation rates of P.
The main point is to compute H(t). By molecular collision theory [23], the concentration ρ(t) of free M 1 molecules which bind at time t on Q[S]molecules to produce Q[S + 1]-molecules by forward reaction 7, is proportional to the product of m 1 (t)q(S, t) by the number (S 1 − S) of vacant binding sites. Hence, for some constant c, Similarly the concentration τ (t) of M 1 molecules freed by backward reaction 7 is given by for some constantc. At chemical equilibrium, we have τ (t) = ρ(t), and hence where u is a new constant. By recurrence on S, this implies This entails The set of G-molecules committed to translation into P at time t is identical to q(0, t). Thus H(t) = q(0, t)/g(t), and (11) yields Using hypothesis Hyp 3, a recurrence on k extends the argument just given for k = 1, to prove that for k ≥ 1, the percentage H(t) of G molecules committed to translation into P is given by: which proves CKE (4) for TI motifs. For TD motifs, the cis-regulation function F(t) in CKE (1) has been derived in [20]. By using hypothesis 1 and molecular collision theory, we directly deduce that the concentration the concentration of mRNA G degraded by miRNA M is proportional to the product of both concentrations of G and M. Thus we justify the miRNA degradation term −vg(t)m(t).

Invariance by affine profile transformations Affine profile transformations
Call ⊂ R q the set of all possible expression level profiles r(t) indexed by time t 1 , . . . , t q . To any pair T = (a, b) of real numbers, we associate an affine profile transformation for all time dates t. Let T be the set of all such affine profile transformations.
In microarray data sets, expression levels of genes are recorded via optical analysis of fluorescence intensities, and hence depend strongly on experimental acquisition modalities. So relative expression levels between pairs of recorded chemical species are more meaningful quantities, and graphic displays of microarray data by "heat maps" often involve logarithms of raw data.
Our microarray data record expression levels which as a first approximation can be viewed as unknown affine transformations of concentrations. Since our CKEs (1), (4) were derived for concentrations, we need to check how these CKEs and their parameters change under generic affine profile transformations.

Affine invariance of CKE models
Consider first a TD-motif involving an mRNA gene G, a protein P, an miRNA M, transcription repressive proteins R 1 , R 2 , . . ., and transcription activating proteins Letĝ,p,m,r i ,â j be the corresponding recorded expression profiles. Assume that each such recorded profilesr is linked to the concentration profile 4 by some unknown affine profile transformation T r .
The new parameters are easily expressed in terms of the original ones and of the coefficients of the affine transformations, but this is irrelevant practically since we will use microarray data to directly compute the new parameters for a CKE of type (1) linking recorded expression levels.
Similar computations for TI-motifs show that this affine invariance property also holds for CKE (4). Hence to model a TD or a TI motif, we can fit a CKE model of type (1) or (4) to recorded expression profiles, even though the theoretical model justification involved true concentrations, which are not directly measured by microarrays.
The key assumption is that, for each chemical species C, the expression levelsĉ(t) of C recorded by microarray are approximately linked to the concentration c(t) of C by some affine relationĉ(t) = a(C)c(t) + b(C), where the unknown coefficients a(C) and b(C) may depend on the species C. The preceding algebraic model invariance under multiple affine profile transformations strongly suggests that adequate distances between dynamic profiles of recorded expression levels should be invariant under affine profiles transformations, as developed in the next section.

Condensation of expression levels profiles
Microarray data typically record several tens of thousands gene expression profiles. So computational costs to fit microarray data to all potential TD-motifs and TI-motifs would of course be prohibitive. A natural option to reduce combinatorial explosion is to cluster the observed profiles.
Since our goal is to model expression profiles by ODEs of type (1) and (4), we need to control the diameters of all http://www.biomedcentral.com/1752-0509/8/19 clusters of expression profiles. This led us to reject hierarchical clustering as well as K-means clustering, and to implement in the space of expression profiles a "minimalnet" clustering technique, inspired by an innovative technique for automatic generation of prototypes in shape spaces (see [32]). In view of the preceding section, the diameters of clusters in the space of profiles should be measured by a distance invariant under affine profiles transformations.

Affine invariant distance between profiles
Recall that ⊂ R q is the set of all profiles r(t) indexed by time dates t 1 , . . . , t q . The mean and variance of a profile r are denoted by Define normalized profiles nor(r) and profiles correlations corr(r 1 , r 2 ) by where < ·, · > is the usual scalar product in R p . We then define a distance D(r 1 , r 2 ) between profiles r 1 and r 2 by For any affine profile transformation T in T , one has nor(r) = nor(Tr), and hence D(r 1 , r 2 ) = D(nor(r 1 ), nor(r 2 ). Thus for any affine profile transformations T 1 , T 2 , and any profiles r 1 , r 2 one has

Minimal net clustering
The set of all profiles is now endowed with a distance D invariant by affine profile transformations. Call mPR the large set of miRNA profiles recorded at time t 1 , . . . , t q by our microarrays. We fix a maximum radius ε for profiles clusters. We seek to partition mPR into disjoint clusters CL 1 , . . . , CL r such that each CL j has diameter inferior to 2ε, and we also want the number NN of clusters to be as small as possible. We have implemented an iterative algorithm to generate this type of minimal net clustering, in the spirit of [32].
Define a distance function D(x, y), where x, y represent the observations of the data. Denote by D(x, Y ) the distance between an observation x and a set Y, where D(x, Y ) = min y∈Y D(x, y). Let Let [x] denote the cluster containing single point x, and be the set of all observations, the minimal net algorithm is as follows: Then let x 1 and y 1 become the representatives of two initial clusters. Let representing the remaining points in excluding the 2 clusters. 2. After step n − 1, we obtain where CL j is a cluster that has single point, j = 1, . . . , n + 1.
In step n, let HD = max x∈R n−1 D(x, C n−1 ), representing the maximum distance between observation x in R n−1 and set C n−1 . If HD > ε, find then assign x to the cluster CL k .
We apply this minimal net clustering algorithm to the set of all miRNA profiles recorded by our microarrays. We define the cluster diameter diam(CL) of cluster CL as the maximum distance of any two observations in the cluster, i.e. diam(CL) = max x,y∈CL D(x, y), where. Compared with the commonly used clustering method such as K-means and Hierarchical clustering, the minimal net algorithm allows us to control the diameter of all clusters by setting the threshold ε representing the supremum radius of the cluster, while the K-means and hierarchical clustering is often used to determine the number of of clusters but not their diameters. If we select a very small threshold ε, the expression levels of genes in the same cluster can be considered almost identical.

Parameter estimation for the CKE models Parameters estimation strategy
A generic strategy for parameter estimation in CKEs systems is to minimize a cost function evaluating the discrepancy between model predictions and experimental data. Each one of our CKE models has a single output variable, namely the expression level g(t) of mRNA gene G for a TD motif, and the expression level p(t) of protein P for a TI motif. The output variable g(t) or p(t) of a CKE model can be estimated by a functionĝ(t) orp(t) once one knows the profiles of all other molecular species involved in the model. The estimatorsĝ(t) orp(t) are respectively determined by CKE (1) or CKE (4). http://www.biomedcentral.com/1752-0509/8/19 Each CKE models is parameterized by a parameter vector w of dimension n par . The quality of fit of this model with recorded profiles data is quantified by the size ERR(w) of the estimation error defined as follows where the output variable f (t) is equal to g(t) for TD motifs and to p(t) for TI motifs. The concrete goal is to find the best parameter vector w by minimization of the lack of fit ERR(w) over all possible values of w.
There are no magic solutions for such non linear minimization problems. Moreover fast computing was essential here, since we usually have to solve a very large number of similar "quality of fit maximization" problems when dealing with large microarray datasets.
We have tested several generic cost minimization approaches (see [33][34][35]) such as "genetic algorithms", as well as "gradient descent" to minimize a sum of squared modeling residuals. These two techniques turned out to require far too much computing time and were often unreliable due to their high dependence on initialization values.
We hence developed our own fast CKE parametrization algorithms to optimize the quality of fit between CKE models and microarray profiles data. This optimized quality of fit, adequately balanced by a systematic emphasis on parsimoniously parameterized models, becomes an essential clue to decide which potential interactions one should validate between miRNAs, mRNAs, and associated proteins.

Parameters parsimony requirement
Robustness of CKE parametrization is the main motivation for our parameter parsimony requirements. Consider CKE (1) modeling a TD-motif TDM involving n act activators and n rep repressors for the transcription of mRNA gene G, and one miRNA M degrading the transcription of G. Then the number n par of unknown parameters is n par = 3 + 2(n act + n rep ). Each profile g(t), m(t), a i (t), r j (t) is recorded at t = t 1 , . . . , t p . The CKE outputs for TDM are g(t 1 ), . . . , g(t p ). Hence each microarray data set provides (p − 1) equations linking the n par unknown parameters, since the recorded g(t) should be very close to the predicted valuesĝ(t) obtained by solving the ODE (1) with initial value g(t 1 ).
Vapnik's results on model fitting (see [36]) show that robust accuracy of parameter estimates requires fairly high values of the ratio of (p − 1)/n par . So n par << (p − 1) is a necessary constraint, and we will impose the parameter parsimony requirement n par ≤ (p − 1)/4. Indeed when n par > (t p −1), CKE models are overfitted and parameters are poorly estimated.

CKE Parameter estimation TI-motifs: plausible ranges for parameters
Consider a generic TI motif TIM involving an mRNA gene G, its associated protein P, and k translation inhibiting miRNAs [M 1 , . . . , M k ]. Let p(t), g(t), m i (t) be the expression levels of P, G, M i . We want to model TIM by CKE (4).
The degradation and translation rates γ and λ of protein P were unknown for most proteins.
According to results from [31], only a small percentage 2% of our 30,000 mRNAs have more than 2 potential binding sites for miRNAs, and only 0.02% mRNAs have as many as 7 such binding sites. So in our parameter estimations it is reasonable to restrict the number of binding sites S j for miRNA M j to be at most 5.

TI-motifs: optimizing the quality of fit
The inputs of CKE (4) are the initial value p(t 1 ) and the expression levels g(t), m 1 (t), . . . , m k (t) recorded at time dates t 1 , . . . , t q .
Discretizing the ODE (4) at time t 1 , . . . , t q , we get where the percentage H(t) is as recalled above . By summation this implies, for j = 2, 3, . . . . , q − 1 , the relation where In view of equation (14), when all expression profiles involved have been recorded until time t j , one can predict the still unknown value of p(t j+1 ) by the following natural estimatorp(t j+1 ), The quality of fit of this CKE model with recorded profiles data will be quantified by the size ERR(w) of the estimation error numerically computed as follows which in view of (15) can be reformulated as where for j = 1, . . . , q we have set We seek a parameter vector w minimizing the cost function ERR(w) in the parametric domain defined by

TI-motif: parameter estimation algorithm
For each i = 1 . . . k, fix an arbitrary integer 1 ≤ S i ≤ 5. Callm i andH i the respective medians over time t of the functions m i (t) and H i (t). Since the function H i (t) The assumption 0 < H i (t) < 1 yields 0 <H i (t) < 1. We will discretize the possible values ofH i (t) by constraining them to belong to a grid h 1 , . . . , h s of 1 ≤ s ≤ 99 percentage values equally spaced in [0, 1]. Sincem i is known, we invert equation (19) to compute a corresponding grid GRD i of s potential values for Note that the set U of all possible such choices for U is of cardinal inferior to N = (5s) k . Since we do explore each one of these possibilities separately, we need to keep the number N at a reasonable level, so we selected s = 99 for k = 1 , s = 60 for k = 2, s = 20 for k = 3 etc.
Fix any U in U. Since all recorded profiles involved in the TD motif are available at time t 1 , . . . , t q , we can use the U value to directly compute all the values H(t j ), and then all the numbers π j , μ j , ν j defined by (18). We want to find values of the two last parameters γ > 0 and λ > 0 which will minimize This problem is equivalent to minimizing the linear objective function: This is a classical constrained linear programming problem, which can be solved by well known fast linear programming algorithms [37], to provide the optimal values γ * , λ * , z * . Then z * = ERR(γ * , λ * ) is the minimal value of ERR(γ , λ).
These optimal values are functions γ * (U), λ * (U), z * (U) of the partial vector of parameters U in U. We then select the optimal U * in U as the value of U which minimizes z * (U) over U. The optimal parametrization w * of our TD motif is then given by w * = [γ * (U * ), λ * (U * ), U * ]. This new parametrization algorithm is fairly fast and has good accuracy. On a current laptop PC, our non-optimized MATLAB code implementation required less than 5 minutes of CPU time for the parametrization of a typical TI-model with 38 time points and 9 parameters [38]. After code optimization and a re-implementation in C, we expect this CPU time to be reduced to 2 minutes. The algorithm does not require any knowledge of the parameters ranges except for the number of binding sites S, which is an advantage for the range of reaction rates of of many molecules of interest are usually unknown.

TD-motifs: parameter estimation algorithm
The parametrization algorithms just presented also apply to TD models, as we now sketch. The output variable is now the expression level g(t) of mRNA gene G.
Discretize the ODE (1) at time dates t 1 , . . . , t q , to get where F(t) is defined in (1). By summation this implies the relations where K (t j ) = sum j n=1 F(t n ) When all expression profiles involved have been recorded until time t j , one can predict the unknown value of g(t j+1 ) by the estimatorĝ(t j+1 ) , The quality of fit of this CKE model with the recorded profiles data is quantified by the size of the prediction error which is a function ERR(w) of the parameter vector w where we have set The TD model parameters SR i > 0, u i > 0 and SA j > 0, w j > 0 are the number of binding sites and the affinity constants for the transcriptional factors R i and A j . They http://www.biomedcentral.com/1752-0509/8/19 constitute a partial parameter vector U which we restrict as above by first imposing a moderate upper bound S max on all the integers SR i , SA j . and by selecting, also as done above, adequate finite grids for the values of the u i and the w j . This constrains U to belong to a finite set U. The cardinal N of U is forced to remain at most of order 10 5 , by adequate constraints on S max and on the coarseness of the u i grids and the v j grids.
To minimize ERR(w) , we fix as above an arbitrary U in U. We can then compute all the numbers τ j , ρ j , η j , θ j . Since U is fixed, the error size ERR(w) becomes a function E(β, v, κ) of the last 3 positive parameters (β, v, κ), still given by equation (23). As above the minimization of E(β, v, κ) is equivalent to a constraint linear programming problem where we want to minimize the linear function (β, v, κ, z) = z over the following set of (2q + 4) linear inequalities Solving this constraint linear programming problem generates optimal parameters (β * , v * , κ * ) and a minimal error z * , which are all functions of U. One concludes as above by selecting an optimal U * minimizing z * (U) over all U in U.

Quality of fit for CKE models
Consider any TD motif or TI motif A. We have seen how to compute a parameter vector w * optimizing the quality of fit between microarray data and our CKE model for A. This was done by minimizing is the estimation of f (t) based on the CKE parametrized by w.
For the optimal CKE parametrization w * , the lack of fit to data can then be evaluated by ERR(w * ). However we have seen in section 'Invariance by affine profile transformations' that when comparing two expression profiles recorded by microarray, natural distances between profiles should be roughly invariant by changes of scale for these profiles. It would then be tempting to replace the absolute error of estimation | f (t) −f (t) | at time t by the relative error of estimation . But relative errors become quite large whenever the output profile f ((t) is close to zero. To avoid such spuriously large error values, while still preserving scale invariance whenever f (t) is not close to zero, we define the Smoothed Relative Error of estimation SRE(t) at time t by the following formula, wheref denotes the mean value of the profile f, (25) We finally quantify the Modeling Error MODER for the optimally parametrized CKE model of motif A by

Examples of application
We implemented our model of TI on microarray data of mouse stem cells undergoing RA-induced differentiation, as provided by LC Science Inc, and previously analyzed by classical techniques in [18]. We took the recorded expression profiles for proteins/mRNAs GCNF, Oct4, Nanog and Sox2 at time points (0, 1.5, 3, 6)/(0, 3, 6) and expression levels for 266 miRNAs on days 0, 1, 3, 6 from [38] during ES cell differentiation. These profile data were interpolated at 19 intermediary time points, by Piecewise Cubic Hermite Interpolation (PCHIP) and the number of parameters were limited to be 4, i.e. only 1 upstream miRNA was selected for the model, to satisfy the parameter parsimony requirement.
For miRNA M i , the following linear transformation, which could be viewed as normalization, was done: is positive for t = 0, 1/3, . . . , 6. Taking ε = 0.15, we applied Minimal Net clustering (the MATLAB code can be downloaded through Additional file 1) to the transformed data of miR-NAs,m i (t), i = 1, 2, . . . , 266, and obtained 107 clusters, in which the maximum cluster contains 14 miRNAs. We consider that the miRNAs belonging to the same cluster share the same normalized expression level within a negligibly small error. For each cluster CL j , j = 1, . . . , 107, we determine the miRNA MC j which is the representative of the cluster CL j for the distance D , and we let mc j (t) be the expression level of MC j at time t. We call mc j (t) is the expression level of cluster CL j .
We successively implemented the TI model with only one repressor miRNA (the MATLAB code can be downloaded through Additional file 2). After parametric modeling of our pre-selected 107 TI motifs, and evaluation of their quality of fit, we have validated only 3 clusters of miRNAs as translational inhibitors repressing protein Oct4. If an miRNA in CL j is validated by modeling as a potential repressor, all other miRNAs belonging to CL j are also potential repressors and can be validated numerically as well by the form invariability of the model under http://www.biomedcentral.com/1752-0509/8/19 affine transformation. Here we present one of these three validated miRNAs of Oct4 (see Figure 2), where the centroid of the cluster is miRNA mmu-miR-10a, while the other 6 miRNAs in the cluster are mmu-miR-203, 330, 342, 470 and 99b. We used TarBase [39] to search all the experimentally validated miRNAs targeting Oct4 (Pou5f1) in Mus musculus. By TarBase, only mmu-miR-470 has been experimentally validated and it is also numerically validated by our TI model.
For protein GCNF, only two miRNAs, mmu-let-7b and mmu-let-181a, have been experimentally validated by Tar-Base and both of them belong to the list of the 20 miRNAs numerically validated by our TI modeling (we presented the validated cluster containing mmu-let-7b in Figure 3). Our modeling approach did not validate any miRNA repressing both proteins Nanog and Sox2, while there are 3 miRNAs and 1 miRNA experimentally validated as separate repressors of Nanog and Sox2 respectively.
We here also present one example of TD motif for downstream factor mRNA Sox2 (Figure 4). With the assumption that the transcription factors are proteins Oct4 and Nanog [38], we validated cluster mmu-miR-134, 30a-3p, 30b, 335, 431, 433-3p, 434-3p, and 487b as degraders. Although (mmu-miR-134, Sox2) is also an experimentally tested pair, we will not discuss deep in detail the validation results of the TD motifs in this paper. The main reason is that the validation results of TD motifs depends much on our knowledge of the transcription factors. More discussion is in the subsection below.

Discussion
In [38], we pre-selected the potential miRNAs for each gene/protein by TargetScan 5.0 and miRanda before applying the two models. The pre-selection of miRNA candidates were not necessary though it greatly reduced the computational cost. However, the application of the CKE modeling was dependant on the target prediction algorithm, such as TargetScan or miRanda. Therefore, we introduced Minimal Net Clustering in this paper so that the data was condensed and the computational cost could be reduced by a purely numerical method without biological bias.
Since the results of TD model depend on information of transcription factors, the modeling validates not only miRNAs acting as mRNA degraders but also upstream transcription factors simultaneously. In [38] we numerically validated proteins GCNF, Oct4 and Nanog as transcription factors for mRNAs Oct4 and Nanog, while Marson et al. and Boyer et al. [40,41] claimed that Oct4 and Sox2 are bounded and act together with Nanog as transcription factors. Considering the impact of the transcription factors, the TD modeling may be less convincing unless the transcription factors are fixed as experimentally validated. In this paper we presented more details on the implementation and validation results of the TI model in     order to focus on the validation of miRNAs and avoid the influence of assumptions of transcription factors.
TarBase shows that, for protein GCNF the experimentally validated miRNAs are mmu-let-7b and mmu-miR-181a; for protein Oct4 the experimentally validated miRNA is mmu-miR-470; for protein Nanog the experimentally validated miRNAs are mmu-miR-134, mmu-miR-470, and mmu-miR-296; for protein Sox2 the experimentally validated miRNA is mmu-miR-134. In this paper, we have validated by TI modeling and minimal net clustering all the experimentally tested miRNA repressors of GCNF and Oct4. In our previous work [38], actually none of these miRNA repressors had yet been studied for modeling for the four proteins because of the restriction of pre-selection. And only the pair (mmu-miR-181a, GCNF) was validated by the classical correlation analysis done in Gu et al. [18] for the 4 proteins GCNF, Oct4, Nanog, Sox2. Therefore, the TI modeling combined with data condensation not only reduced computational cost but also clearly extended the set of miRNA inhibitors validated by model fitting to microarray data.
Since each numerically validated miRNA cluster may contain two or more miRNAs, the miRNAs in the same cluster could also be considered as potential candidate inhibitors for further experiments to validate. For instance, as Figure 2 shows, mmu-miR-203, 330, 342, 10a and 99b are potential candidates for protein Oct4 for they are in the same cluster as mmu-miR-470, which is validated by both the numerical modeling and experiments.
After the pair (mmu-miR-181a , GCNF) was well validated by our CKE model, we found that the cluster (see Figure 5, top left) containing mmu-miR-181a also includes mmu-miR-103 and mmu-miR-107, which are two known miRNAs that have the same roles in regulating insulin sensitivity and promoting metastasis of colorectal cancer [42,43]. We also checked that miR-103 and miR-107 have almost the same mature sequences: mmu-miR-103: AGCAGCAUUGUACAGGGCUAUGA mmu-miR-107: AGCAGCAUUGUACAGGGCUAUCA After the pair (mmu-let-7b, GCNF) was well validated by our CKE model, we observed mmu-let-7g, and mmulet-7i are in the same cluster as mmu-let-7b (see Figure 5, top right). It was claimed that let-7b and 7g reduce tumor growth in mouse models of lung cancer [44]. We then checked that indeed these three miRNAs have very similar mature sequences, namely the Euclidean distance between the expression level profiles (expression distance in short) and the Needleman-Wunsch alignment score (NWA) between the mature sequences of each miRNA pair. We then divided the 35,245 miRNA pairs into two groups: GroupHigh includes the pairs with high alignment score (NWA > 10), i.e. GroupHigh includes miRNA pairs with similar mature sequences, and GroupLow includes the pairs with low alignment score (NWA ≤ 10). We then compared the distribution of expression distances for all miRNA pairs in GroupHigh with the distribution of these distances for miRNA pairs in GroupLow. As seen in Figure 5 (bottom left), the quantiles of these distances in GroupLow are consistently larger than the corresponding quantiles in GroupHigh. This is fully confirmed by Kolmogorov-Smirnov test which yielded the very significant p-value 7 × 10 −32 . The result still holds when we change the alignment score threshold NWA = 10 used to define GroupHigh and GroupLow (see Figure 5, bottom right, where the NWA threshold is now NWA = 15). We conclude that miRNAs having similar mature sequences tend, with high probability, to have similar expression levels. The above analysis and examples indicate that miRNAs belonging to the same cluster are good candidates to have similar mature sequence. Since the match between miRNA mature sequence and target sites is the main determinant for miRNA targets, miRNAs belonging to same cluster may hence also have similar regulatory roles.

Robustness of parameter estimation
Estimation algorithms for nonlinear models may yield parameter estimates that are dependent on the particular set of data or on initial estimates of parameters. Since our parameter estimation algorithm is independent from initial estimates of parameters, we now focus on on the measurements errors affecting microarray data and on their impact for parameter estimation. Considering that  Figure 6 Histogram of re-estimated parameters for a TD model. a) the histogram of re-estimated parameter β, degradation rate of Sox2, the originally estimated β value of the model is below the graph. b) the histogram of re-estimated parameter κ, transcription rate of Sox2, the originally estimated κ value of the model is below the graph. c) the histogram of re-estimated parameter v, reaction rate of Sox2 and mmu-mir-21, the originally estimated v value of the model is below the graph. d) the histogram of re-estimated parameter μ 1 , reaction constant of Sox2 and Oct4, the originally estimated μ 1 value of the model is below the graph. e) the histogram of re-estimated parameter μ 2 , reaction constant of Sox2 and Nanog, the originally estimated μ 2 value of the model is below the graph. f) the histogram of re-estimated parameter S 1 , S 2 , number of binding sites of Oct4 and Nanog on Sox2 respectively, the originally estimated S 1 and S 2 value of the model is below the graph.  Figure 6) and one TI motif ( Figure 7).

Conclusion
We have separately modeled by chemical kinetics equations the 2 distinct modalities of the repressive actions of miRNAs on post-transcriptional processes of mRNA genes and the associated proteins. This was achieved by first defining the formal structure of two types of interaction architectures (Transcription Degradation motifs and Translation Inhibition motifs ) linking miRNAs to subgroups of mRNA genes. The plausibility of each one of these potential TD motifs or TI-motifs was then evaluated by computerized parametric modeling, based on microarray data, of adequate formal chemical kinetics equations (CKEs). We have sketched the formal derivation of 2 specific CKEs modeling by dynamic ODEs the interactions between concentrations of different species of molecules involved in each architecture. This led to a motif validation strategy based on the quantified quality of fit between our optimally parametrized models and the corresponding microarray data.
Our computerized parameter estimation is implemented by an innovative fast algorithm that does not require knowledge of range of molecular reaction rates. On a current standard laptop PC, our implementation of parameter estimation for a typical 9-parameters CKE model requires about 5 minutes of computing time. Histogram of re-estimated parameters for a TI model. a) the histogram of re-estimated parameter γ , degradation rate of GCNF, the originally estimated γ value of the model is below the graph. b) the histogram of re-estimated parameter λ, translation rate of GCNF, the originally estimated λ value of the model is below the graph. c) the histogram of re-estimated parameter μ, reaction constant of GCNF and mmu-let-7b, the originally estimated μ value of the model is below the graph. d) the histogram of re-estimated parameter S, number of binding sites of mmu-let-7b on GCNF respectively, the originally estimated S value of the model is below the graph. http://www.biomedcentral.com/1752-0509/8/19 Our parameter estimation algorithm also provides relatively high-quality optimization for the fit between model and microarray data, by integrating both global and local cost minimization techniques, in contexts where plausible ranges of values for most of the unknown parameters are not available in the literature. By perturbing the expression levels of miRNAs and re-estimating the parameters, we showed that our parameter algorithm has a satisfactory level of robustness. We believe that our parameter estimation technique with associated evaluation of quality of fit would be quite applicable as a generic algorithm to similar problems in chemical kinetics modeling of molecular interactions.
Modeling very large microarray data is computationally quite expensive. We have hence sketched clustering methods to condense large microarray data. This approach has of course been attempted before our work, but the main point is that we have carefully studied the mathematical compatibility of our CKE models with condensation of the profiles data. Since we have proved that the abstract form of our CKE models is invariant by arbitrary multiple affine transformations of profiles data, we have made sure to constrain the distance of two expression levels profiles to be invariant by these types of affine transformations.
We have implemented a Minimal Net Clustering algorithm based on this distance, which allows us to control the radius of the clusters. The number of CKEs to parameterize can be strongly reduced after condensation of the large data sets, and the affine invariance of our CKEs show that the condensed genes network can then still be modeled by similar CKEs.
By applying our TI modeling to multiple proteins such as GCNF, Oct4, Nanog, Sox2, we showed that 3 miRNAtarget pairs experimentally validated can be also validated by the TI model.

Additional files
Additional file 1: Codes-minimal net clustering.