Reduction of linear hierarchical models
Introductory notes
In this section we present a general algorithm for finding dominant subsystems and critical parameters for linear systems with completely separated time scales. Linear systems represent a special situation when all the interactions in the reaction network are monomolecular, i.e., have the form A → B.
Although systems biology models are nonlinear and also contain multimolecular reactions, it is nevertheless useful to have an efficient algorithm for solving linear problems. First, as we shall see in the next section, nonlinear systems can include linear subsystems, containing reactions that are pseudo-monomolecular with respect to species internal to the subsystem (at most one internal species is reactant and at most one is product). Second, for reactions A + B → ..., if the concentrations c_{A} and c_{B} are well separated, say c_{A} ≫ c_{B}, then we can consider this reaction as B → ... with rate constant proportional to c_{A}, which is practically constant or changes only slowly. We can assume that this condition is satisfied for all but a small fraction of genuinely nonlinear reactions (the set of nonlinear reactions changes in time but remains small). Thus, linear models can serve as very effective approximations of the behavior of nonlinear models in certain windows of time: in this way, nonlinear behavior can be approximated as a sequence of linear dynamics, following one another through a sequence of "phase transitions". Third, linear networks represent the case in which very large reaction network models can be approached analytically, and some intuition and design principles can be learned and partially generalized to the nonlinear case. As an example, see the robustness study made in [34]. The linear case offers nice simple illustrations of the concepts of dominant subsystem, critical monomials and critical parameters.
The algorithm is presented here in a "recipe" form, ready for computational implementation; it is developed in detail elsewhere [34], with many examples and rigorous justifications.
The structure of linear (monomolecular) reaction networks can be completely defined by a simple digraph, in which vertices correspond to chemical species A_{i} and edges correspond to reactions A_{i} → A_{j} with kinetic constants k_{ji} > 0. For each vertex A_{i}, a positive real variable c_{i} (concentration) is defined.
A "pseudospecies" (labeled ∅) can be defined to collect all degraded products, and degradation reactions can be written as A_{i} → ∅ with constants k_{0i}. Production reactions can be represented as ∅ → A_{i} with rates k_{i0}.
The kinetic equation is
\frac{d{c}_{i}}{dt}={k}_{i0}+\sum _{j\ge 1}{k}_{ij}{c}_{j}-\left(\sum _{j\ge 0}{k}_{ji}\right){c}_{i},
(1)
or in vector form: ċ = K_{0} + K c.
The advantage of linear dynamics is that it is completely specified by the eigenvectors and the eigenvalues of the kinetic matrix K.
The system has a unique bounded steady state c^{s} = −K^{−1} K_{0} if and only if the matrix K is nonsingular.
In this case, it is easy to write down the general solution of Eq.(1):
c(t)={c}^{s}+\sum _{k=1}^{n}{r}^{k}\,({l}^{k},c(0)-{c}^{s})\,\mathrm{exp}({\lambda}_{k}t)
(2)
where λ_{k}, l^{k}, r^{k}, k = 1, ..., n are the eigenvalues, the left eigenvectors (vector-rows) and the right eigenvectors (vector-columns) of the matrix K, respectively, i.e.
K{r}^{k}={\lambda}_{k}{r}^{k},\quad {l}^{k}K={\lambda}_{k}{l}^{k},
(3)
with the normalization (l^{i}, r^{j}) = δ_{ij}, where δ_{ij} is Kronecker's delta.
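The spectral formula (2) is easy to check numerically. Below is a minimal Python sketch (using NumPy) for a hypothetical three-species chain with production and degradation; the constants and species are illustrative, not taken from the text.

```python
import numpy as np

# Minimal numerical check of formula (2) on a hypothetical 3-species chain
# A1 -> A2 -> A3 -> 0 with production 0 -> A1; the constants are illustrative.
k21, k32, k03, k10 = 10.0, 1.0, 0.1, 0.5

K = np.array([
    [-k21, 0.0, 0.0],
    [k21, -k32, 0.0],
    [0.0, k32, -k03],
])
K0 = np.array([k10, 0.0, 0.0])

lam, R = np.linalg.eig(K)       # columns of R are right eigenvectors r^k
L = np.linalg.inv(R)            # rows of L are left eigenvectors, (l^i, r^j) = delta_ij

cs = -np.linalg.solve(K, K0)    # steady state c^s = -K^{-1} K0
c0 = np.array([1.0, 0.0, 0.0])

def c(t):
    # c(t) = c^s + sum_k r^k (l^k, c(0) - c^s) exp(lambda_k t)
    return (cs + R @ ((L @ (c0 - cs)) * np.exp(lam * t))).real
```

For well-separated constants, the entries of L and R approach the 0, ±1 values discussed below for hierarchical systems.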
Closed systems are characterized by K_{0} = 0 (no production reactions, although degradation is permitted). Closed systems are conservative if the matrix K is singular (a particular case is when there is no degradation at all). Then the left kernel of K provides a set of conservation laws (if lK = 0, then the quantities (l, c) are conserved). The solutions of the homogeneous linear equation ċ = Kc are simply:
c(t)={\displaystyle \sum _{k=1}^{n}{r}^{k}({l}^{k},c(0))\mathrm{exp}({\lambda}_{k}t)}
(4)
If all reaction constants k_{ij} were known with precision, then the eigenvalues and the eigenvectors of the kinetic matrix could easily be calculated by standard numerical techniques; furthermore, singular value decomposition could be used for model reduction. But in systems biology models one often has only approximate or relative values of the constants (information on which constant is bigger or smaller than another one). In what follows we consider the simplest case: all kinetic constants are very different (well separated), i.e., for any two different pairs of indices I = (i, j), J = (i', j') we have either k_{I} ≫ k_{J} or k_{J} ≫ k_{I}. In this case we say that the system is hierarchical, with totally separated timescales (the inverses of the constants k_{ij}, j ≠ 0).
A hierarchical linear network can be represented as a digraph together with an order (integer number) associated to each arc (reaction). The lower the order, the more rapid the reaction (see Fig. 1). It happens that in this case the special structure of the matrix K (originating from a reaction graph) allows us to exploit the strong relation between the dynamics (1) and the topological properties of the digraph. A big advantage of fully separated networks is that the possible values of {l}_{i}^{k} are 0, 1 and the possible values of {r}_{i}^{k} are −1, 0, 1, with high precision [34]. Thus, if we can provide an algorithm for finding the nonzero components of {l}_{i}^{k}, {r}_{i}^{k} based on the network topology and the ordering of the constants, then this gives a good approximation to the solution (2).
Some basic notions
Two vertices of a graph are called adjacent if they share a common edge. A path is a sequence of adjacent vertices. A graph is connected if any two of its vertices are linked by a path. A maximal connected subgraph of graph G is called a connected component of G. Every graph can be decomposed into connected components.
A directed path is a sequence of adjacent edges in which each step follows the direction of an edge. A vertex A is reachable from a vertex B if there exists an oriented path from B to A.
A nonempty set V of graph vertices forms a sink if there are no oriented edges from A_{i} ∈ V to any A_{j} ∉ V. For example, in the reaction graph A_{1} ← A_{2} → A_{3} the one-vertex sets {A_{1}} and {A_{3}} are sinks. A sink is minimal if it does not contain a strictly smaller sink. In the previous example, {A_{1}} and {A_{3}} are minimal sinks. Minimal sinks are also called ergodic components.
A digraph is strongly connected, if every vertex A is reachable from any other vertex B. Ergodic components are maximal strongly connected subgraphs of the graph, but the reverse is not true: there may exist maximal strongly connected subgraphs that have outgoing edges and, therefore, are not sinks. If the digraph has no branching (each vertex has only one successor), then we can define a deterministic flow (discrete dynamical system) on the set of its vertices. Every vertex is the origin of an unique directed path.
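These definitions translate directly into code. The following Python sketch tests the sink property on the example A_{1} ← A_{2} → A_{3} above; the (reactant, product) edge encoding is our own illustrative choice.

```python
from itertools import chain, combinations

# Encode the example A1 <- A2 -> A3 as directed (reactant, product) edges;
# this encoding is an illustrative choice, not a notation from the text.
edges = [(2, 1), (2, 3)]

def is_sink(V, edges):
    # V is a sink iff no oriented edge leaves V
    return all(not (a in V and b not in V) for a, b in edges)

def is_minimal_sink(V, edges):
    # minimal: no strictly smaller nonempty subset of V is itself a sink
    proper = chain.from_iterable(combinations(sorted(V), r)
                                 for r in range(1, len(V)))
    return is_sink(V, edges) and not any(is_sink(set(s), edges) for s in proper)
```

On this example, {A_{1}} and {A_{3}} pass both tests, while {A_{2}} is not a sink and {A_{1}, A_{3}} is a sink but not a minimal one.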
Basic procedure for approximating eigenvectors
The algorithm we provide is based on the solution of the two simplest cases: 1) a network without cycles and without branching (i.e., no vertex has more than one outgoing edge) (for example, Fig. 1a) and 2) a network without branching with a unique sink which is a cycle (for example, Fig. 1b).
For networks without branching we can simplify the notation for the kinetic constants, introducing κ_{i} = k_{ji} for the unique reaction A_{i} → A_{j}. It is also useful to introduce a map ϕ (see Fig. 1):
\varphi (i)=\{\begin{array}{ll}j,\hfill & \text{if there exists }{A}_{i}\to {A}_{j}\hfill \\ i,\hfill & \text{else}\hfill \end{array}
(5)
Acyclic nonbranching network
In this case, for any vertex A_{i} there exists an eigenvector. If A_{i} is a sink vertex (i.e. ϕ(i) = i) then this eigenvalue is zero. If A_{i} is not a sink (i.e. ϕ(i) ≠ i and the reaction A_{i} → A_{ϕ(i)} has nonzero rate constant κ_{i}) then this eigenvector corresponds to the eigenvalue −κ_{i}. For the left and right eigenvectors of K that correspond to A_{i} we use the notations l^{i} (vector-row) and r^{i} (vector-column), respectively.
Let us suppose that A_{f} is a sink vertex of the network. Its associated right and left eigenvectors corresponding to the zero eigenvalue are given by:
\begin{array}{r}\hfill {r}_{j}^{f}={\delta}_{jf}\\ \hfill {l}_{j}^{f}=\{\begin{array}{ll}1,\hfill & \text{if }{\varphi}^{q}(j)=f\text{ for some }q\ge 0\hfill \\ 0,\hfill & \text{else}\hfill \end{array}\end{array}
(6)
Generally, right eigenvectors can be constructed by recurrence, starting from the vertex A_{i} and moving in the direction of the flow. The construction goes in the opposite direction for left eigenvectors.
For the right eigenvector r^{i} only the coordinates {r}_{{\varphi}^{k}(i)}^{i} (k = 0, 1, ...) can have nonzero values, and
{r}_{{\varphi}^{k+1}(i)}^{i}=\frac{{\kappa}_{{\varphi}^{k}(i)}}{{\kappa}_{{\varphi}^{k+1}(i)}-{\kappa}_{i}}{r}_{{\varphi}^{k}(i)}^{i}=\prod _{j=0}^{k}\frac{{\kappa}_{{\varphi}^{j}(i)}}{{\kappa}_{{\varphi}^{j+1}(i)}-{\kappa}_{i}}=\frac{{\kappa}_{i}}{{\kappa}_{{\varphi}^{k+1}(i)}-{\kappa}_{i}}\prod _{j=1}^{k}\frac{{\kappa}_{{\varphi}^{j}(i)}}{{\kappa}_{{\varphi}^{j}(i)}-{\kappa}_{i}}.
(7)
For the left eigenvector l^{i} the coordinate {l}_{j}^{i} can have a nonzero value only if there exists q ≥ 0 such that ϕ^{q}(j) = i (this q is unique because the system of reactions has no cycles), and
{l}_{j}^{i}=\frac{{\kappa}_{j}}{{\kappa}_{j}-{\kappa}_{i}}{l}_{\varphi (j)}^{i}=\prod _{k=0}^{q-1}\frac{{\kappa}_{{\varphi}^{k}(j)}}{{\kappa}_{{\varphi}^{k}(j)}-{\kappa}_{i}}.
(8)
These formulas (7, 8) hold for all nonbranching acyclic linear systems, even without separation of timescales. In the case of fully separated systems they are significantly simplified and do not require knowledge of the exact values of the κ_{i}. Thus, for the left eigenvectors, {l}_{i}^{i} = 1 and, for i ≠ j,
{l}_{j}^{i}=\{\begin{array}{ll}1,\hfill & \text{if }{\varphi}^{q}(j)=i\text{ for some }q>0\text{ and }{\kappa}_{{\varphi}^{d}(j)}>{\kappa}_{i}\text{ for all }d=0,\mathrm{...},q-1\hfill \\ 0,\hfill & \text{else}\hfill \end{array}
(9)
For the right eigenvectors we suppose that κ_{f} = 0 for a sink vertex A_{f}. Then {r}_{i}^{i} = 1 and
{r}_{{\varphi}^{k}(i)}^{i}=\{\begin{array}{ll}-1,\hfill & \text{if }{\kappa}_{{\varphi}^{k}(i)}<{\kappa}_{i}\text{ and }{\kappa}_{{\varphi}^{m}(i)}>{\kappa}_{i}\text{ for all }m=1,\mathrm{...},k-1\hfill \\ 0,\hfill & \text{else}\hfill \end{array}
(10)
The vector r^{i} has at most two nonzero coordinates. Formula (10) means that to find the −1 component of r^{i}, one should find the first vertex j downstream of i with κ_{j} < κ_{i} (the "bottleneck" vertex): there {r}_{j}^{i} = −1. Following (9, 10) we find, for the example of Fig. 1a,
\begin{array}{l}{l}^{1}\approx (1,0,0,0,0,0,0,0),{l}^{2}\approx (0,1,0,0,0,0,0,0),\hfill \\ {l}^{3}\approx (0,1,1,0,0,0,0,0),{l}^{4}\approx (0,0,0,1,0,0,0,0),\hfill \\ {l}^{5}\approx (0,0,0,1,1,1,1,0),{l}^{6}\approx (0,0,0,0,0,1,0,0).\hfill \\ {l}^{7}\approx (0,0,0,0,0,1,1,0)\hfill \end{array}
(11)
\begin{array}{l}{r}^{1}\approx (1,0,0,0,0,0,0,-1),{r}^{2}\approx (0,1,-1,0,0,0,0,0),\hfill \\ {r}^{3}\approx (0,0,1,0,0,0,0,-1),{r}^{4}\approx (0,0,0,1,-1,0,0,0),\hfill \\ {r}^{5}\approx (0,0,0,0,1,0,0,-1),{r}^{6}\approx (0,0,0,0,0,1,-1,0),\hfill \\ {r}^{7}\approx (0,0,0,0,-1,0,1,0)\hfill \end{array}
(12)
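The 0–1 recipes (9, 10) can be sketched directly from the successor map ϕ and the ordering of the constants. In the Python sketch below, rank[i] encodes the order of κ_{i} (1 = fastest) and sinks are given the largest rank (consistent with κ_{f} = 0); the four-vertex chain used to exercise it is hypothetical, not the network of Fig. 1a.

```python
# Sketch of the 0-1 asymptotics (9)-(10) for a fully separated acyclic
# non-branching network. phi[i] is the successor of vertex i (phi[i] == i
# for sinks); rank[i] orders the constants kappa_i (smaller = faster),
# with sinks given the largest rank (kappa_f = 0).

def left_eigvec(i, phi, rank):
    n = len(phi)
    l = [0] * n
    l[i] = 1
    for j in range(n):
        if j == i:
            continue
        x, fast = j, True
        while x != i and phi[x] != x:
            if rank[x] > rank[i]:      # kappa_{phi^d(j)} < kappa_i: path too slow
                fast = False
                break
            x = phi[x]
        if x == i and fast:
            l[j] = 1
    return l

def right_eigvec(i, phi, rank):
    r = [0] * len(phi)
    r[i] = 1
    if phi[i] == i:                    # sink: r^i = delta, see (6)
        return r
    x = phi[i]
    while True:
        if phi[x] == x or rank[x] > rank[i]:   # first slower ("bottleneck") vertex
            r[x] = -1
            break
        x = phi[x]
    return r
```

For a hypothetical chain A_{1} → A_{2} → A_{3} → A_{4} (sink) with ranks (2, 1, 3) for the three reactions, the bottleneck of A_{1} is A_{3}, so r^{1} ≈ (1, 0, −1, 0), while l^{3} ≈ (1, 1, 1, 0).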
Nonbranching network with a unique simple cyclic sink
In this case we have a reaction network with components A_{1}, ..., A_{n} in which the last τ vertices (after some renumbering) form a reaction cycle: A_{n−τ+1} → A_{n−τ+2} → ... → A_{n} → A_{n−τ+1}. We assume that the limiting step of this cycle (the reaction with minimal constant) is A_{n} → A_{n−τ+1}.
In this case the right eigenvector corresponding to the zero eigenvalue has nonzero components only on the vertices belonging to the cycle:
{r}_{j}^{0}=\{\begin{array}{ll}\frac{1/{\kappa}_{j}}{{\sum}_{i=n-\tau +1}^{n}1/{\kappa}_{i}},\hfill & \text{if }j\in [n-\tau +1,n]\hfill \\ 0,\hfill & \text{else}\hfill \end{array}
(13)
Similarly, the stationary distribution has nonzero values only at the vertices belonging to the cycle. If b = ∑_{i} c_{i} is the total (conserved) mass, then the steady state is:
{c}_{j}=\frac{b/{\kappa}_{j}}{\frac{1}{{\kappa}_{n-\tau +1}}+\frac{1}{{\kappa}_{n-\tau +2}}+\dots +\frac{1}{{\kappa}_{n}}}
(14)
for j ∈ [n − τ + 1, n] and zero elsewhere.
If we have a system with well-separated constants (which means that κ_{n} ≪ κ_{i}, i ≠ n), then to first order this expression simplifies to
\begin{array}{cc}{c}_{n}=b\left(1-{\displaystyle \sum _{i=n-\tau +1}^{n-1}\frac{{\kappa}_{n}}{{\kappa}_{i}}}\right),& {c}_{i}=b\frac{{\kappa}_{n}}{{\kappa}_{i}}\end{array},
(15)
which means that most of the substance is concentrated just before the "bottleneck" A_{n} → A_{n−τ+1} (c_{n} ≫ c_{i}, i ≠ n).
To approximate the dynamics of the reaction network for κ_{n} ≪ κ_{i}, i ≠ n, it is sufficient to remove the slowest step of the cycle, A_{n} → A_{n−τ+1}. After this removal we have an acyclic nonbranching system of reactions whose eigenvalues and eigenvectors can be computed from the formulas of the previous section. These formulas give n − 1 eigenvector sets corresponding to the n − 1 nonzero eigenvalues λ_{i} = −κ_{i}, i = 1, ..., n − 1. For example, removing the A_{8} → A_{6} step in Fig. 1b converts the reaction network into that of Fig. 1a, whose dynamics approximates the dynamics of the simple cyclic network.
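This cycle-to-chain approximation can be verified numerically. The sketch below (with hypothetical constants, not those of Fig. 1b) builds the kinetic matrix of a three-species cycle with a well-separated limiting step and checks that its nonzero eigenvalues are close to the constants of the chain obtained by removing that step.

```python
import numpy as np

# Hypothetical 3-cycle A1 -> A2 -> A3 -> A1 with constants k1, k2, k3;
# k3 is the limiting (slowest) step, well separated from the others.
k1, k2, k3 = 100.0, 1.0, 0.001

K_cycle = np.array([
    [-k1, 0.0, k3],
    [k1, -k2, 0.0],
    [0.0, k2, -k3],
])

lam = sorted(abs(np.linalg.eigvals(K_cycle)))
# lam[0] is the zero eigenvalue (mass conservation); the nonzero eigenvalues
# are close to k1 and k2, the constants of the chain A1 -> A2 -> A3
```

The relative corrections to k1 and k2 are of order k3/k2, i.e. negligible when the constants are fully separated.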
Auxiliary reaction network and auxiliary dynamical system
Now let us consider an arbitrary linear reaction network with well-separated constants. For each A_{i}, let us define κ_{i} as the maximal kinetic constant of the reactions A_{i} → A_{j}: κ_{i} = max_{j}{k_{ji}}. For the corresponding j we use the notation ϕ(i): ϕ(i) = arg max_{j}{k_{ji}}. The function ϕ(i) is defined under the condition that outgoing reactions A_{i} → A_{j} exist for A_{i}. If there are no such outgoing reactions, we set ϕ(i) = i.
An auxiliary reaction network is the set of reactions A_{i} → A_{ϕ(i)} with kinetic constants κ_{i}. The corresponding kinetic equation is
{\dot{c}}_{i}=-{\kappa}_{i}{c}_{i}+{\displaystyle \sum _{\varphi (j)=i}{\kappa}_{j}{c}_{j}},
(16)
The auxiliary network also defines an auxiliary discrete dynamical system i → ϕ(i) that is used to compute the eigenvectors of the kinetic matrix. The auxiliary network can have several connected components. In each connected component the minimal sink is an attractor of the auxiliary dynamical system; hence it is either a node or a cycle.
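The construction of the auxiliary network can be sketched in a few lines of Python; the encoding rates[(i, j)] = k_{ji} for the reaction A_{i} → A_{j} and the example constants below are our own illustrative choices.

```python
# Sketch: build the auxiliary reaction network from a rate dictionary
# rates[(i, j)] = k_{ji} for the reaction Ai -> Aj.

def auxiliary_network(rates, vertices):
    phi, kappa = {}, {}
    for i in vertices:
        out = {j: k for (a, j), k in rates.items() if a == i}
        if out:
            phi[i] = max(out, key=out.get)   # phi(i) = argmax_j k_{ji}
            kappa[i] = out[phi[i]]           # kappa_i = max_j k_{ji}
        else:
            phi[i] = i                       # no outgoing reactions: sink
            kappa[i] = 0.0
    return phi, kappa
```

The pair (phi, kappa) then fully specifies the auxiliary kinetics (16) and the auxiliary dynamical system i → ϕ(i).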
General algorithm for calculating the dominant behavior of the linear dynamics
Preprocessing reaction network

1) Let us consider a reaction network \mathcal{W} with a given structure and a fixed ordering of constants that are well separated. Using this ordering, let us construct the auxiliary reaction network \mathcal{V}={\mathcal{V}}_{\mathcal{W}}.

2) If the auxiliary network does not contain cycles, then the auxiliary network kinetics (16) approximates the relaxation of the initial network \mathcal{W}. To obtain the solution, we use formulas (7, 8) directly to calculate the eigenvectors (if all κ_{i} are known), or (9, 10) to obtain the 0–1 asymptotics (if only the ordering of the κ_{i} is known).

3) In the general case, let the system \mathcal{V} have several cycles C_{1}, C_{2}, ... with periods τ_{1}, τ_{2}, ... > 1.
By "gluing" cycles into points, we transform the reaction network \mathcal{W} into {\mathcal{W}}^{1} as follows. For each cycle C_{i}, we introduce a new vertex A^{i}. The new set of vertices is {\mathcal{A}}^{1}=\mathcal{A}\cup \{{A}^{1},{A}^{2},\mathrm{...}\}\backslash ({\cup}_{i}{C}_{i}) (we delete the cycles C_{i} and add the vertices A^{i}).
Let us consider all the reactions from \mathcal{W} of the form A → B (A, B ∈ \mathcal{A}). They can be separated into 5 groups:

1. both A, B ∉ ∪_{i}C_{i};
2. A ∉ ∪_{i}C_{i}, but B ∈ C_{i};
3. A ∈ C_{i}, but B ∉ ∪_{i}C_{i};
4. A ∈ C_{i}, B ∈ C_{j}, i ≠ j;
5. A, B ∈ C_{i}.

1. Reactions of the first group ("transitive" reactions) do not change.
2. Reactions of the second group ("entering cycles") transform into A → A^{i} (to the whole glued cycle) with the same constant.
3. Reactions of the third type ("exiting cycles") change into A^{i} → B with a rate constant renormalization: let the cycle C^{i} be the sequence of reactions A_{1} → A_{2} → ... → {A}_{{\tau}_{i}} → A_{1}, with rate constant k_{i} for A_{i} → A_{i+1} ({k}_{{\tau}_{i}} for {A}_{{\tau}_{i}} → A_{1}). For the limiting reaction of the cycle C_{i} we use the notation k_{lim i}. If A = A_{j} and k is the rate constant of A → B, then the new reaction A^{i} → B has the rate constant kk_{lim i}/k_{j}. This corresponds to a quasistationary distribution on the cycle (15). The new rate constant is obviously smaller than the initial one, kk_{lim i}/k_{j} < k, because k_{lim i} < k_{j} by definition of the limiting constant. If, after gluing, several reactions A^{i} → B appear, then only the one with the maximal constant is kept.
4. The same constant renormalization is necessary for reactions of the fourth type ("between cycles"). These reactions transform into A^{i} → A^{j}.
5. Reactions of the fifth type ("inside cycles") are discarded.

4) After the new network {\mathcal{W}}^{1} is constructed, we assign \mathcal{W}:={\mathcal{W}}^{1}, \mathcal{A}:={\mathcal{A}}^{1} and iterate the algorithm from step 1) until we obtain an acyclic network and exit at step 2).
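One pass of steps 1)–4) can be sketched as follows in Python. The vertex names, the "C0" label for a glued cycle, and the example constants are hypothetical; parallel arrows into a glued cycle are collapsed by keeping the maximal constant, as for exiting arrows.

```python
# Sketch of one gluing pass: detect the cycles of the auxiliary dynamical
# system phi, glue each into a new vertex, and renormalize exiting rates
# by k * k_lim / k_j as in rule 3.

def find_cycles(phi):
    # attractors of i -> phi(i): follow each trajectory until it loops
    cycles, seen = [], set()
    for start in phi:
        x, trail = start, []
        while x not in trail:
            trail.append(x)
            x = phi[x]
        cyc = frozenset(trail[trail.index(x):])
        if len(cyc) > 1 and cyc not in seen:
            seen.add(cyc)
            cycles.append(cyc)
    return cycles

def glue_once(rates, kappa, cycles):
    # rates[(i, j)]: constant of Ai -> Aj; returns the rates of the glued network
    glued = {v: f"C{n}" for n, cyc in enumerate(cycles) for v in cyc}
    k_lim = {f"C{n}": min(kappa[v] for v in cyc) for n, cyc in enumerate(cycles)}
    new_rates = {}
    for (i, j), k in rates.items():
        gi, gj = glued.get(i, i), glued.get(j, j)
        if gi == gj:
            continue                      # rule 5: inside-cycle reactions dropped
        if i in glued:
            k = k * k_lim[gi] / kappa[i]  # rules 3-4: exit-rate renormalization
        # rules 1-2; parallel arrows keep the maximal constant
        new_rates[(gi, gj)] = max(k, new_rates.get((gi, gj), 0.0))
    return new_rates
```

Iterating these two functions until find_cycles returns nothing reproduces the hierarchy of glued cycles described above.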
The algorithm produces a hierarchy of cycles. Notice that the algorithm is based on an asymmetry between reactions entering and exiting the cycles of the hierarchy. Indeed, some fluxes of \mathcal{W} entering cycles C_{i} can be neglected when they are dominated by a stronger flux of \mathcal{V} bifurcating from the same node (this occurs at the first step of the algorithm, when constructing \mathcal{V}). The cycles C_{i} are minimal sinks in \mathcal{V} (they are attractors of the auxiliary dynamical system). There are no reactions A → B in \mathcal{V} with A ∈ C_{i}, B ∉ C_{i}. Nevertheless, such reactions may exist in the initial network \mathcal{W}. These fluxes cannot be neglected, because there are no exiting fluxes of \mathcal{V} to dominate them. The rule of thumb is: neglect any dominated flux except the fluxes exiting some cycle of the hierarchy. This explains our algorithm; it was rigorously justified in [34].
Constructing the dominant kinetic system
Now we show how to find an approximation of the dynamics of the reaction network \mathcal{W}. To construct this approximation, we produce a new acyclic reaction network on the initial set of vertices A_{i} ∈ \mathcal{W}, i = 1, ..., n, called the dominant kinetic system. The dynamics of this acyclic system can be computed from (7, 8, 9, 10). To construct the dominant kinetic system, the following algorithm is applied:
Let {\mathcal{V}}^{m} be the result of the network preprocessing algorithm described in the previous section.

1. For {\mathcal{V}}^{m}, select the vertices {A}_{1}^{m}, {A}_{2}^{m}, ... that are glued cycles from {\mathcal{V}}^{m-1}.
2. For each glued cycle node {A}_{i}^{m}:
a) Recall its nodes {A}_{i1}^{m-1}\to {A}_{i2}^{m-1}\to \dots {A}_{i{\tau}_{i}}^{m-1}\to {A}_{i1}^{m-1}; they form a cycle of length τ_{i},
b) Let us assume that the limiting step of {A}_{i}^{m} is {A}_{i{\tau}_{i}}^{m-1}\to {A}_{i1}^{m-1},
c) Remove {A}_{i}^{m} from {\mathcal{V}}^{m},
d) Add the τ_{i} vertices {A}_{i1}^{m-1},{A}_{i2}^{m-1},\dots {A}_{i{\tau}_{i}}^{m-1} to {\mathcal{V}}^{m},
e) Add to {\mathcal{V}}^{m} the reactions {A}_{i1}^{m-1}\to {A}_{i2}^{m-1}\to \dots {A}_{i{\tau}_{i}}^{m-1} (the cycle reactions without the limiting step) with the corresponding constants from {\mathcal{V}}^{m-1},
f) If there exists an outgoing reaction {A}_{i}^{m} → B in {\mathcal{V}}^{m}, substitute it by the reaction {A}_{i{\tau}_{i}}^{m-1} → B with the same constant, i.e. outgoing reactions {A}_{i}^{m} → ... are reattached to the heads of the limiting steps,
g) If there exists an incoming reaction B → {A}_{i}^{m}, find its prototype in {\mathcal{V}}^{m-1} and restore it in {\mathcal{V}}^{m}.
3. If the initial {\mathcal{V}}^{m} contained a "between-cycles" reaction {A}_{i}^{m}\to {A}_{j}^{m}, then we find its prototype A → B in {\mathcal{V}}^{m-1} and substitute the reaction by {A}_{i{\tau}_{i}}^{m-1} → B with the same constant as {A}_{i}^{m}\to {A}_{j}^{m} (again, the tail of the arrow is reattached to the head of the limiting step in {A}_{i}^{m}).
4. Let m ← m − 1, and repeat steps 1–4 until no glued cycles are left.
One has to notice that in the process of network preprocessing, some reaction rates are substituted by monomials of the initial reaction constants, i.e. expressions of the form {k}_{new}=\frac{{k}_{{i}_{1}{i}_{2}}\cdots {k}_{{j}_{1}{j}_{2}}}{{k}_{{l}_{1}{l}_{2}}\cdots {k}_{{m}_{1}{m}_{2}}}. In the totally separated case, the values of these monomials are also well separated from the other constants with probability close to 1; however, the initial ordering of the constants does not prescribe the position of these monomials in the ordering of the rates. In this case the algorithm produces several dominant systems, one for each possible position of the new monomial rate constants in the ordering. An example is given later in this section. Such a situation can arise during network preprocessing, when the maximal reaction constant must be chosen, or during the construction of the dominant system, when determining the limiting step of a cycle.
Finding stationary distributions
The dominant kinetic system fully describes the relaxation modes of the network. The construction of this system depends only on the matrix K and not on the production reactions K_{0}. In particular, relaxation times do not depend on whether the system is closed or open. However, the stationary distribution c^{s} and the sequence of relaxation events depend on the production reactions (see Eq. (2)).
For closed systems, steady states are solutions of the linear homogeneous equation Kc = 0; therefore they are determined up to multiplication by positive constants. They form a k-dimensional cone, where k is the multiplicity of the zero eigenvalue of the matrix K, which also equals the number of minimal sinks of the network.
Let {A}_{f1}^{m},{A}_{f2}^{m},\mathrm{...},{A}_{fk}^{m} be the k sink vertices of the auxiliary network {\mathcal{V}}^{m}, and let A_{i}, i = 1, ..., n be the vertices of the initial network \mathcal{W}. Below we describe a procedure for finding a basis of the stationary distributions of a closed network:

1. Take the i-th sink vertex {A}_{fi}^{m}.
2. Define x = {A}_{fi}^{m}, b = 1, and a null vector b^{i} ∈ R^{n}.
3. If x is not a glued cycle, then it corresponds to a vertex A_{j} ∈ \mathcal{W}, and the basis vector b^{i} has components {b}_{j}^{i} = δ_{ij}; stop.
4. If x is a glued cycle, then recall all its vertices x^{1}, ..., x^{τ}.
5. Determine the limiting (minimal) rate constant κ_{lim} = min_{s=1..τ}{{\kappa}_{{x}^{s}}} and s_{min} = arg min_{s=1..τ}{{\kappa}_{{x}^{s}}}.
6. For each vertex x^{j} of the cycle repeat:
a) Let {c}_{j}=b\frac{{\kappa}_{lim}}{{\kappa}_{{x}^{j}}} if j ≠ s_{min} and {c}_{j}=b\left(1-{\displaystyle {\sum}_{s\ne {s}_{min}}\frac{{\kappa}_{lim}}{{\kappa}_{{x}^{s}}}}\right) otherwise,
b) if x^{j} corresponds to a simple vertex A_{k}, then {b}_{k}^{i} = c_{j},
c) if x^{j} corresponds to a glued cycle, then repeat steps 4–6 recursively with x = x^{j} and b = c_{j}.
Any possible stationary distribution has the form {c}^{s}={\displaystyle {\sum}_{i=\mathrm{1..}k}{c}_{fi}^{m}{b}^{i}}, with {c}_{fi}^{m}\ge 0. The coefficients {c}_{fi}^{m} are computed from the initial data: they are equal to the total initial mass carried by the vertices of {\mathcal{V}}^{m} (when these are glued cycles we consider the total initial mass of the cycle) that are attracted by {A}_{fi}^{m}.
In brief, the distribution of the concentrations on any cycle is approximated by the first order expression (15), and this procedure is applied recursively for the vertices that represent glued cycles. The state thus obtained is equally a good approximation of the steady state of the dominant kinetic system.
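The first-order formula (15) used at each level of the recursion can be illustrated on a single cycle. The Python sketch below compares it with the exact quasistationary distribution (14); the constants are hypothetical but well separated.

```python
# Hypothetical cycle constants kappa_{n-tau+1}, ..., kappa_n;
# the smallest (last) one is the limiting step.
kappas = [100.0, 10.0, 0.01]
b = 1.0                                  # total conserved mass

# exact quasistationary distribution (14): c_j proportional to 1/kappa_j
norm = sum(1.0 / k for k in kappas)
exact = [b / (k * norm) for k in kappas]

# first-order approximation (15)
k_lim = min(kappas)
approx = [b * (1.0 - sum(k_lim / k for k in kappas if k != k_lim))
          if k == k_lim else b * k_lim / k
          for k in kappas]
```

As expected from (15), almost all the mass sits on the vertex just before the bottleneck, and the two distributions agree to first order in the small ratios κ_{lim}/κ_{i}.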
Open systems can be reduced to closed ones by considering that all production reactions originate from the node ∅, which has concentration c_{0} = 1. The corresponding reactions are ∅ → A_{i}, with constants the production rates k_{i0}. The normalization c_{0} = 1 is possible for all bounded steady states, because these are determined up to multiplication by a constant. Furthermore, all steady states are bounded provided that the following topological condition holds: if there exists an oriented path from ∅ to A_{i}, then there exists an oriented path from A_{i} to ∅. We suppose this condition is always fulfilled. Applying the algorithm to the closed system, we obtain a steady state {c}^{s} that is then normalized to {c}^{s}:={c}^{s}/({\displaystyle {\sum}_{i=\mathrm{1..}k}{c}_{fi}^{m}{b}_{0}^{i}}) in order to have {c}_{0}^{s} = 1.
Example. Let us consider the example network \mathcal{W} shown in Fig. 2 (1). Below we briefly detail every step of the algorithm.

2) An auxiliary reaction network {\mathcal{V}}_{\mathcal{W}} is constructed (this gives a nonbranching network);
3) The cycle A_{3} → A_{4} → A_{5} → A_{3} in the auxiliary reaction network is glued into one vertex {A}_{3}^{1} (shown as a circle node). In the initial network we find an "exiting from cycle" reaction A_{4} → A_{2}, renormalize its rate to {k}_{32}^{1}=\frac{{k}_{24}{k}_{35}}{{k}_{54}} and insert it in the new network {\mathcal{W}}^{1};
4) The cycle A_{3} → A_{2} → A_{3} in the auxiliary reaction network {\mathcal{V}}_{{\mathcal{W}}^{1}} (which now coincides with {\mathcal{W}}^{1} itself) is glued into one vertex {A}_{2}^{2}; now the network {\mathcal{W}}^{2} is acyclic and we stop the network preprocessing.
Now, if we restore the cycle A_{3} → A_{2} → A_{3} and try to determine its limiting step, we have two possibilities: \frac{{k}_{24}{k}_{35}}{{k}_{54}} < k_{32} and \frac{{k}_{24}{k}_{35}}{{k}_{54}} > k_{32}. Let us consider them separately:
Case \frac{{k}_{24}{k}_{35}}{{k}_{54}} <k_{32}
3.1.1) Since \frac{{k}_{24}{k}_{35}}{{k}_{54}} < k_{32}, we remove the limiting step {A}_{3}^{1} → A_{2} and obtain 3.1.2).
3.1.3) We restore the glued cycle corresponding to {A}_{3}^{1}, recalling that the reaction A_{2} → {A}_{3}^{1} in {\mathcal{W}}^{1} corresponds to A_{2} → A_{3} in \mathcal{W}.
3.1.4) We remove the limiting step reaction of the cycle A_{3} → A_{4} → A_{5} → A_{3} (it is A_{5} → A_{3}) and as a result we obtain the acyclic dominant kinetic system shown in Fig. 2 (3.1.4).
Case \frac{{k}_{24}{k}_{35}}{{k}_{54}} > k_{32}
3.2.1) Since \frac{{k}_{24}{k}_{35}}{{k}_{54}} > k_{32}, we remove the limiting step A_{2} → {A}_{3}^{1} and obtain 3.2.2).
3.2.3) We restore the glued cycle corresponding to {A}_{3}^{1}; this time we should reattach the reaction {A}_{3}^{1} → A_{2} to the head of the limiting step of the cycle (the vertex A_{5}); the rate of A_{5} → A_{2} is \frac{{k}_{24}{k}_{35}}{{k}_{54}}.
3.2.4) We remove the limiting step reaction of the cycle A_{3} → A_{4} → A_{5} → A_{3} (it is A_{5} → A_{3}) and as a result we obtain the acyclic dominant kinetic system shown in Fig. 2 (3.2.4).
Discussion and perspectives
Dominant approximations of hierarchical linear reaction networks allow us to introduce some new concepts important for the dynamics of multiscale systems.
Hybrid and qualitative dynamics
Piecewise affine dynamics has been widely used to approximate dynamics of gene regulatory networks [35–37] as a sequence of discrete transitions between attractors of affine systems. This picture is based on threshold response of genes in models with steep regulation functions (Hill functions and other representations of sigmoidal response) and is not directly related to time scales. Here, we emphasize another possible way to obtain hybrid, or qualitative representations of dynamics, based on time separation.
Indeed, the zero-one approximation of eigenvectors in hierarchical linear systems justifies a discrete coding of the dynamics. Suppose that the initial state is concentrated at j_{0}, c_{i}(0) ~ {\delta}_{i,{j}_{0}}. At times just larger than 1/|λ_{k}| an exponential vanishes in Eq. (2) and the state has a "jump" r^{k}(l^{k}, c(0) − c^{s}). Let us consider that the eigenvalues are ordered |λ_{1}| ≫ |λ_{2}| ≫ ... ≫ |λ_{n−1}|. Then the sequence of right eigenvectors r^{k} such that (l^{k}, c(0) − c^{s}) ≠ 0 codes the dynamics starting in c(0). In other words, there is a sequence of well-separated times τ_{1} = 1/|λ_{1}| ≪ τ_{2} = 1/|λ_{2}| ≪ ... ≪ τ_{n−1} = 1/|λ_{n−1}| such that something happens (a state transition) between each pair of consecutive times. The left eigenvectors provide the lumping (several species cumulated into pseudospecies) and the right eigenvectors provide the sequence of state transitions. On the timescale τ_{k} < t < τ_{k+1} one can observe a jump r^{k} in state space, provided that (l^{k}, c(0) − c^{s}) ≠ 0. On this timescale the dynamics is equivalent to the degradation of the pseudospecies (l^{k}, c): \frac{d({l}^{k},c)}{dt} = λ_{k} (l^{k}, c).
Critical parameters and design principles
Our approach to dominant subsystems emphasizes some simple but important principles. First of all, the dynamics of a hierarchical linear network can be specified if a) the topology of the network is given, b) to each reaction we associate a positive integer representing its order (1 for the most rapid reaction, 2 for the second most rapid, and so on), and c) for cyclic topologies, some monomials grouping constants of several reactions are also ordered in the same manner (which monomials depends both on the topology and on the initial ordering).
In the process of simplification some reaction pathways are dominated and do not appear in the dominant subsystem. Therefore, the corresponding constants are not critical for the system: although their ordering matters for establishing the simplification, their precise values have little importance. Because the parameters of the dominant subsystem are generally monomials of the parameters of the whole system, the critical parameters are those that occur in critical monomials. Our findings reveal rather counterintuitive properties of critical and noncritical parameters, which can be useful as design principles. Thus, in cycles, the limiting step (slowest reaction) has little influence on the dynamics (though it is important for the steady state). Dynamically, a cycle with separated constants behaves like the chain obtained from the cycle by eliminating the limiting step. In particular, the slowest relaxation time of a cycle is the inverse of its second slowest constant [1, 34].
We should add some words about the relation between linear and nonlinear models. Mathematical models of biochemical reaction networks in molecular biology necessarily contain nonlinear, non-monomolecular reactions (complex binding, catalysis, etc.). However, the developed model reduction algorithms for linear networks can be useful in systems biology in several situations:

1) When some submechanisms of a complex, nonlinear network are linear, given fixed (or slowly changing) values of external inputs (boundaries);
2) For approximating nonlinear dynamics. For multiscale nonlinear reaction networks, the expected dynamical behaviour can be approximated by a system of dominant networks. These networks may change in time but remain small. To give an example, Fig. 3S1–3S2 in Additional File 1 demonstrates that in a model of the complex reaction network of the NF-κB pathway, containing 17 multimolecular reactions, only two reactions show genuinely nonlinear behavior in some windows of time, two more show borderline behavior, and all the others have well-separated reactant concentrations at any moment of time. The rigorous justification of these hybrid approximations for mass action reaction networks will be discussed elsewhere.
Reduction of nonlinear multiscale systems
Complex formation is a source of nonlinearity in biochemical networks. For instance, in signalling, ligand molecules form complexes with receptors. Transcription factors are often dimers or multimers or are sequestered by forming complexes with their inhibitors. In these examples, the reaction rates are nonlinear functions of the concentrations of two or more molecules.
To construct a nonlinear reaction network we need the list of components, \mathcal{A} = {A_1, ..., A_n}, and the list of reactions (the reaction mechanism):
\sum_i \alpha_{ji} A_i \to \sum_k \beta_{jk} A_k,
(17)
where j ∈ [1, r] is the reaction number. Unless reactants and products belong to compartments of different volumes, α_{ji}, β_{jk} are nonnegative integers (stoichiometric coefficients). Reactions involving components from different compartments have noninteger stoichiometry. For instance, a reaction translocating a molecule from the nucleus to the cytosol has stoichiometry (..., −1, k_v, ...), where k_v is the volume ratio of cytosol to nucleus.
Dynamics of nonlinear networks is described by a system of differential equations:
\frac{dc}{dt}=F(c)=\sum_{j=1}^{r}\nu_j R_j(c)=SR(c)=\sum_{j=1}^{r}\nu_j\left(R_j^{+}(c)-R_j^{-}(c)\right)
(18)
ν_j = β_j − α_j is the global stoichiometric vector. S is the stoichiometric matrix whose columns are the vectors ν_j. The reaction rates R_j^{+/−}(c) are nonlinear functions of the concentrations. For instance, the mass action law reads R_j^{+}(c) = k_j^{+}\prod_i c_i^{\alpha_{ji}}, R_j^{-}(c) = k_j^{-}\prod_i c_i^{\beta_{ji}}.
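As a small illustration of Eq. (18) and of the mass action law, the sketch below (with a single assumed reversible reaction A + B ⇌ C and invented rate constants) assembles ν = β − α and evaluates dc/dt = SR(c):

```python
import numpy as np

# Sketch of Eq. (18) for a hypothetical reversible mechanism A + B <-> C.
# Species order: [A, B, C]; one reaction with alpha = (1,1,0), beta = (0,0,1).
alpha = np.array([1.0, 1.0, 0.0])
beta  = np.array([0.0, 0.0, 1.0])
nu = beta - alpha                      # global stoichiometric vector
S = nu.reshape(-1, 1)                  # stoichiometric matrix (one column)

k_plus, k_minus = 2.0, 0.5             # assumed rate constants

def rate(c):
    """Mass action law: R = R+ - R- = k+ prod(c_i^alpha_i) - k- prod(c_i^beta_i)."""
    return k_plus * np.prod(c ** alpha) - k_minus * np.prod(c ** beta)

def F(c):
    """Right-hand side dc/dt = S R(c)."""
    return S @ np.array([rate(c)])

c = np.array([1.0, 2.0, 0.1])
# R = 2*1*2 - 0.5*0.1 = 3.95, so dc/dt = (-3.95, -3.95, +3.95)
print(F(c))
```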
There are no simple rules to relate timescales to reaction constants of nonlinear models. The units of the inverse constants of bimolecular reactions are concentration multiplied by time, and one needs at least one concentration value in order to construct a timescale. Generally, timescales are functions of many reaction constants and concentration variables. These functions are not necessarily smooth. Near bifurcations (for instance, near Hopf or saddle-node bifurcations), at least one timescale of the system diverges for finite changes of the reaction constants. However, nonlinear biochemical networks have wide distributions of timescales, as can be shown by simple (Jacobian-based) analysis of models.
Various reduction methods for nonlinear models are based on projection of the dynamics onto a lower-dimensional invariant manifold [4–8]. The reduced models are systems of differential equations, but no longer networks of chemical reactions. Quasi-equilibrium and quasistationarity methods keep the network structure of the model and propose lumped reaction mechanisms as dominant subsystems. This approach has some advantages. Indeed, it leads to a more transparent analysis of the results and of design principles, produces hierarchies of models and facilitates model comparison. Graphical reduction methods using elementary modes were proposed by Clarke [26] for chemical systems and more recently in systems biology by Klamt [38]. Similar methods can be found in [39], from which we have borrowed the terminology. The choice of the species to be eliminated and of the reactions to be aggregated, as well as the calculation of the rates of elementary modes, have no theoretical justification in these methods, and their inappropriate use can alter the dynamics (for instance, as Clarke noticed, the stability of limit cycles is not guaranteed). Thus, in order to have a complete practical recipe that applies to multiscale biochemical networks we need to solve three more problems: detection of rapid species, resolution of quasistationarity equations, and calculation of the reaction rates of the dominant mechanisms.
A major improvement in calculating dominant subsystems can be obtained by combining quasistationarity and averaging. Averaging techniques are widely used in physics and chemistry to simplify models by eliminating fast, oscillating (microscopic) variables [22–24]. Our use of averaging is different, because we employ it to obtain averaged stationarity equations for slow, nonoscillating variables and to eliminate these species. After choosing a "middle" time scale (corresponding to the time resolution of the experiment), we want to reduce all scales that are faster but also all scales that are slower than this middle scale. In order to do that we provide a unified framework for species elimination and reaction aggregation, either by quasistationarity (fast species) or by averaged stationarity (slow species).
Let I be the set of indices of intermediate components, which will be eliminated. {\mathcal{R}}^{(I)} is the set of reactions that either produce or consume species from I. Rates of {\mathcal{R}}^{(I)} depend on the concentrations of intermediate species and also on the concentrations of other species, which in the terminology of Temkin [39] are called terminal. Let T be the set of indices of terminal species. Terminal species represent the frontier between the rest of the system and the subsystems made of intermediate species. Although the name boundary species might be more appropriate than terminal, the former term has already been employed in systems biology with a different meaning, namely species whose concentrations are fixed in a simulation.
Extracting from the matrix S the columns corresponding to the reactions {\mathcal{R}}_{I} and the rows corresponding to the species \mathcal{I} and \mathcal{T}, we obtain the intermediate stoichiometric matrix S_I and the terminal stoichiometric matrix S_T, respectively.
Eliminating fast species: quasistationarity
In multiscale biochemical systems, some components react much more rapidly to changes in the environment than others. The reasons for the existence of such fast species can be multiple: rapidly transformed or rapidly consumed molecules (for instance those taking part in metabolic chains or in rapid chemical transformations such as phosphorylation), or promoter sites subject to rapid binding/unbinding processes, are examples of fast species. Fast species are good candidates for intermediate species. Indeed, it is easy to prove that they can be eliminated by quasistationarity. When production rates are not weak, fast species are those whose concentrations are small and well separated from the concentrations of other species. Though straightforward, the precise condition connecting quasistationarity and smallness of concentrations cannot easily be found in the literature, hence we briefly discuss it below.
Let ϵ be a small parameter, representing concentrations. Suppose for simplicity that the reactions {\mathcal{R}}^{(I)} are pseudomonomolecular. This means that S_I R_I(c_I, c_T) = K_I(c_T) c_I + K_I^0(c_T), where R_I is the restriction of the vector R to the intermediate species. An important assumption is K_I^0(c_T) = \mathcal{O}(1), meaning that the production of intermediate species is not weak.
Suppose that among the reactions {\mathcal{R}}^{(I)} consuming intermediates, at least some have rates of order \mathcal{O}(1). This is typically the case, because these reactions produce terminal species, which have larger concentrations.
Because c_I = \mathcal{O}(ϵ), it means that K_I(c_T) = \mathcal{O}(1/ϵ). This leads to the following asymptotics:
\epsilon\frac{d\tilde{c}_I}{dt}=\tilde{K}_I(c_T)\tilde{c}_I+K_I^{0}(c_T)
(19)
\frac{dc_{I^c}}{dt}=g(\epsilon\tilde{c}_I,c_{I^c})
(20)
where \tilde{c}_I = c_I/ϵ, \tilde{K}_I = ϵK_I = \mathcal{O}(1), and I^c is the complement of I, designating species other than I. Intermediate species are fast, and the system (19) can be reduced using Tikhonov's [40, 41] and Fenichel's [42] results. According to these results, after a short lapse of time, the system evolves on an invariant manifold (an invariant manifold is defined by the property that any trajectory starting in the manifold stays inside it) which is at distance \mathcal{O}(ϵ) from the quasi-steady state (QSS) manifold defined by the quasistationarity equations:
\tilde{K}_I(c_T)\tilde{c}_I+K_I^{0}(c_T)=0
(21)
Quasistationarity equations can be used to express concentrations of the intermediate species as functions of the concentrations of terminal species. If the matrix K_I does not have full rank, conservation laws should be added to the quasistationarity equations in order to obtain a full rank system. Let μ_1, ..., μ_k be a basis of the left kernel of S_I (a complete set of conservation laws). We say that species with indices in I are quasistationary if they approximately fulfill the equations:
S_I R_I(c_I, c_T) = 0
(22)
and exactly fulfill the conservation laws:
μ_i c_I = C_i, i = 1, ..., k,
(23)
where C_i are real constants.
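A minimal numerical sketch of this augmentation, with an assumed pair of intermediates exchanging by X1 ⇌ X2 (so that the quasistationarity system alone is rank deficient and the conservation law must be appended):

```python
import numpy as np

# Hypothetical rank-deficient case: two intermediates exchanging by
# X1 <-> X2 with constants k1, k2. The quasistationarity row alone has
# rank 1, so the conservation law X1 + X2 = C (left kernel of S_I) is
# added to obtain a full rank system.
k1, k2, C = 5.0, 1.0, 2.0

M = np.array([[k1, -k2],    # quasistationarity: k1*X1 - k2*X2 = 0
              [1.0, 1.0]])  # conservation:      X1 + X2      = C
rhs = np.array([0.0, C])
X = np.linalg.solve(M, rhs)
print(X)   # X1 = k2*C/(k1+k2), X2 = k1*C/(k1+k2)
```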
Fast, quasistationary species are generally difficult to detect. For instance, the strong production condition K_I^0(c_T) = \mathcal{O}(1), although informative for understanding the dynamics, cannot be used in practice. Furthermore, small concentration is not a necessary condition for quasistationarity. Therefore, our practical method for the detection of fast, quasistationary species is based on directly checking Eqs. (22), (23) (see Fig 3a and the Results section for an example).
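A minimal sketch of this direct checking, for a hypothetical chain A → X → B with k2 ≫ k1 (constants invented for illustration): after the fast transient, the net rate of the candidate species X should be small relative to its gross production rate.

```python
# Toy network A --k1--> X --k2--> B with k2 >> k1: X is a fast,
# low-concentration intermediate, a candidate quasistationary species.
k1, k2 = 1.0, 1000.0
A, X = 1.0, 0.0

dt, t = 1e-5, 0.0
while t < 0.1:                 # explicit Euler, integrated past the fast transient
    dA = -k1 * A
    dX = k1 * A - k2 * X
    A, X = A + dA * dt, X + dX * dt
    t += dt

# Direct check of the quasistationarity condition: the net production
# rate of X must be small relative to its gross production rate.
residual = abs(k1 * A - k2 * X)
relative = residual / (k1 * A)
print(relative)   # small: X tracks its QSS value k1*A/k2
```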
Once quasistationary species are detected, the recipe proposed by Clarke [26] can be applied to simplify the reaction mechanism. Let us reformulate this recipe here:

1.
Eliminate the intermediate concentrations by solving equations (22), (23). Express c_I as a function of c_T.

2.
Replace the mechanism {\mathcal{R}}^{(I)} by "simple submechanisms".

3.
Compute the rates of the simple submechanisms as functions of c_T.
The simplicity criterion employed by Clarke does not follow from a physical principle. Nevertheless, in systems biology, biochemical reactions are simplified representations of complex physicochemical processes. In the absence of detailed information, simplicity arguments are often employed. Elementary modes analysis, widely used in metabolic control and gene network analysis [43–45], is based on exactly the same argument.
The same recipe applies also to model comparison, when we want to compare two models which differ in complexity (some species in one model are not present in the second). In this situation we declare the extra species intermediate and apply the three steps of the algorithm.
Simple submechanisms and rates
Let us introduce some more definitions. A reaction route is a combination of reactions in {\mathcal{R}}_{I} transforming terminal species into other terminal species and conserving the intermediate species. It is defined by an integer coefficient vector γ ∈ ℤ^{s} (the dimension s is the number of reactions in {\mathcal{R}}_{I}) satisfying the following three conditions:
S_I γ = 0,
(24)
γ_i ≥ 0, if the reaction i is irreversible,
(25)
S_T γ ≠ 0.
(26)
Reaction routes are usually defined [39] without the condition (26). By imposing this condition, we exclude internal cycles with zero terminal stoichiometry.
A submechanism M(γ) is the set of all the reactions in the reaction route γ, M(γ) = {i : γ_i ≠ 0}. A submechanism is simple if it is minimal with respect to inclusion, i.e. if M(γ′) ⊂ M(γ) implies γ = γ′. Simple submechanisms are pathways with a minimal number of reactions, connecting terminal species without producing accumulation or depletion of the intermediate species. Thus, they are candidates for reduced reaction mechanisms. Simple submechanisms are minimal dependent sets in oriented matroids [46], similar to elementary modes in flux balance analysis [43]. Algorithms for finding elementary modes can be applied to the search for simple submechanisms [43–45].
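For a small hypothetical intermediate mechanism, simple submechanisms can be enumerated by brute force over a grid of integer route vectors; real networks require elementary-mode algorithms, so this sketch (with an invented one-intermediate mechanism) only illustrates the route conditions and the minimality criterion.

```python
from itertools import product

# Hypothetical intermediate mechanism around one fast species X:
#   r1: A -> X,  r2: X -> B,  r3: X -> C   (all irreversible)
# Row of S_I: intermediate species X; rows of S_T: terminal species A, B, C.
S_I = [[1, -1, -1]]
S_T = [[-1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

routes = []
for gamma in product(range(3), repeat=3):   # nonnegative grid: irreversibility holds
    if gamma == (0, 0, 0):
        continue
    if any(matvec(S_I, gamma)):             # must conserve the intermediate X
        continue
    if not any(matvec(S_T, gamma)):         # exclude zero terminal stoichiometry
        continue
    routes.append(gamma)

# A submechanism is simple if its reaction support is minimal w.r.t. inclusion.
supports = [{i for i, x in enumerate(g) if x} for g in routes]
minimal = sorted({tuple(sorted(s)) for s in supports
                  if not any(t < s for t in supports)})
print(minimal)   # [(0, 1), (0, 2)]: pathways {r1, r2}: A->B and {r1, r3}: A->C
```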
In the reduced model, the reactions of the intermediate mechanism {\mathcal{R}}_{I} are replaced by the submechanisms γ_1, ..., γ_s.
Each terminal species is produced or consumed by one or several reactions of the intermediate mechanism. The reduction should preserve the flux of each terminal species, meaning that the following equation should be satisfied identically, for all c_T and c_I satisfying (22), (23):
\sum_{m\in\mathcal{R}_I}\nu_m^{j}R_m(c_I,c_T)=\sum_{i=1}^{s}(S_T\gamma_i)_j\,R'_i(c),\qquad j\in T
(27)
where R′_i(c) are the rates of the simple submechanisms.
Suppose that for any simple submechanism i there is a terminal species j such that S_T γ_i is the unique vector (among the s different ones) having a nonzero coordinate j, (S_T γ_i)_j ≠ 0. Then, there is a straightforward solution of (27):
R'_i(c)=\frac{1}{(S_T\gamma_i)_j}\sum_{m\in\mathcal{R}_I}\nu_m^{j}R_m(c_I,c_T)
(28)
The above uniqueness condition is not fulfilled if there are two submechanisms whose terminal stoichiometries are proportional. This situation can be avoided by quotienting with respect to the following equivalence relation: γ_i and γ_j are equivalent iff S_T γ_i = α S_T γ_j, for some α ≠ 0. After discarding some submechanisms and keeping only one representative per class, we obtain a reduced set of simple submechanisms whose rates can be calculated from (28).
Dominant solutions to the quasistationarity equations, multiscale ensembles
The most difficult part of the above algorithm is solving the quasistationarity equations (22), (23). Even in the monomolecular case, symbolic solutions of the linear system (21) can involve long expressions. Furthermore, the mass action law leads to polynomial equations in the bimolecular or multimolecular case. Symbolic methods for the solution of systems of polynomial equations are limited to a small number of variables.
In this subsection we show how the multiscale nature of the system can be used to obtain approximate, dominant solutions of the quasistationarity equations.
In linear hierarchical models, ensembles with well-separated constants appear (see also [1]). We could represent them by a log-uniform distribution in a sufficiently big interval log k ∈ [α, β], but most of the properties of this probability distribution will not be used here. The only property that we will use is the following: if k_i > k_j, then k_i/k_j ≫ 1 (with probability close to one). It means that we can assume that k_i/k_j ≫ a for any preassigned positive value of a that does not depend on the k values. One can interpret this property as an asymptotic one for α → −∞, β → ∞. This property allows us to simplify algebraic formulas. For example, k_i + k_j can be substituted by max{k_i, k_j} (with small relative error), or
\frac{ak_i+bk_j}{ck_i+dk_j}\approx\begin{cases}a/c,&\text{if }k_i\gg k_j;\\ b/d,&\text{if }k_i\ll k_j,\end{cases}
for nonzero a, b, c, d.
Of course, some ambiguity can be introduced; for example, what is (k_1 + k_2) − k_1, if k_1 ≫ k_2? If we first simplify the expression in brackets, it is zero, but if we open the brackets without simplification, it is k_2. This is a standard difficulty in the use of relative errors for round-off. If we estimate the error in the final answer, and then simplify, we avoid this difficulty. Use of the o and \mathcal{O} symbols also helps to control the error qualitatively: if k_1 ≫ k_2, then we can write k_1 + k_2 = k_1(1 + o(1)), and k_1(1 + o(1)) − k_1 = k_1 o(1). The last expression is neither zero, nor absolutely small – it is just relatively small with respect to k_1.
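A quick numerical check of this dominance rule, with arbitrary coefficients a, b, c, d and well-separated constants:

```python
# Dominance for well-separated constants: the ratio
# (a*k_i + b*k_j)/(c*k_i + d*k_j) collapses to a/c or b/d.
a, b, c, d = 3.0, 7.0, 2.0, 5.0

def exact(ki, kj):
    return (a * ki + b * kj) / (c * ki + d * kj)

# k_i >> k_j: the ratio is close to a/c = 1.5
print(exact(1e6, 1.0))
# k_i << k_j: the ratio is close to b/d = 1.4
print(exact(1.0, 1e6))
```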
It is slightly more difficult to solve equations. Some recipes have been proposed, such as Newton polyhedra for approximate solutions of polynomial systems of equations [47], but this type of method suffers from combinatorial complexity. Here we use a simpler, though less rigorous, approach. In the case of pseudomonomolecular subsystems, our algorithms for linear hierarchical systems suffice for this purpose. In general, we choose the dominant terms in the solutions as monomials of the parameters. This can be done either by educated guess, or by testing numerically the orders of various terms in the equations. The most frequent, truly nonlinear simplification that occurs in biochemical models is the "min-funnel", which we present below.
Let us consider the production of a complex C from two proteins A and B: \varnothing \overset{k_A}{\to} \text{A} (production of A), \varnothing \overset{k_B}{\to} \text{B} (production of B), \text{A} \overset{k_{deg,A}}{\to} \varnothing (degradation of A), \text{B} \overset{k_{deg,B}}{\to} \varnothing (degradation of B), \text{A}+\text{B} \overset{k_c}{\to} \text{C} (complex formation).
Supposing A, B quasistationary, we have to find the positive solutions of the equations k_A = \tilde{A} + \tilde{k}_c\tilde{A}\tilde{B}, k_B = \tilde{B} + \tilde{k}_c\tilde{A}\tilde{B}, where \tilde{A} = k_{deg,A}A, \tilde{B} = k_{deg,B}B, \tilde{k}_c = k_c/(k_{deg,A}k_{deg,B}). We will consider two cases: a) 1/\tilde{k}_c ≪ k_A ≪ k_B and b) 1/\tilde{k}_c ≪ k_B ≪ k_A. Both cases mean that degradation of A, B is weak and/or the propensity of complex formation is high. Case a) also means that B is in excess, the opposite being true in case b).
Let us consider case a). We guess that the order of \tilde{A} in the dominant solution is smaller than the order of \tilde{B}: \tilde{A} ≪ \tilde{B}. From the linear equation k_A − k_B = \tilde{A} − \tilde{B} we obtain \tilde{B} = k_B, and substituting into the nonlinear equation for k_A we obtain \tilde{A}=\frac{k_A}{1+\tilde{k}_c k_B}\approx\frac{k_A}{\tilde{k}_c k_B}. Finally, we have \tilde{A} ≪ \tilde{B}, consistently with the starting guess. The dominant solution in case b) is obtained by symmetry from the one in case a). The quantity of interest in this example, for which we want a reduced expression, is the production rate of the complex R_c = k_c AB. Actually, the two solutions can be summarized by:
R_c = min(k_A, k_B)
(29)
Using the exact solution of the system (after eliminating \tilde{A} from the linear equation we remain with a quadratic equation for \tilde{B}) we can show that the min-funnel approximation (29) is valid under less restrictive conditions. The only separation condition that we need is min(k_A, k_B) ≫ k_{deg,A}k_{deg,B}/k_c. We can easily identify the critical parameters k_A, k_B and the noncritical ones k_{deg,A}, k_{deg,B}, k_c. The validity of expression (29) depends on order relations involving monomials of critical and noncritical parameters.
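The comparison between the exact quasistationary solution and the min-funnel approximation can be sketched numerically; the parameter values below are invented, chosen so that the separation condition holds.

```python
import math

# Exact quasistationary solution of the complex-formation funnel versus
# the min-funnel approximation R_c = min(k_A, k_B). All values assumed.
k_A, k_B = 1.0, 100.0
k_deg_A, k_deg_B, k_c = 0.1, 0.1, 100.0
kc_t = k_c / (k_deg_A * k_deg_B)       # tilde{k}_c = 1e4, so 1/kc_t << min(kA,kB)

# Eliminating At = Bt + k_A - k_B (difference of the two stationarity
# equations) leaves a quadratic for Bt:
#   kc_t*Bt^2 + (1 + kc_t*(k_A - k_B))*Bt - k_B = 0
a2, b2, c2 = kc_t, 1.0 + kc_t * (k_A - k_B), -k_B
Bt = (-b2 + math.sqrt(b2 * b2 - 4 * a2 * c2)) / (2 * a2)   # positive root
At = Bt + k_A - k_B
R_c = kc_t * At * Bt                   # complex production rate k_c*A*B

print(R_c, min(k_A, k_B))              # close when min(kA,kB) >> 1/kc_t
```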
Eliminating slow species: averaging
Averaging is a useful model reduction technique for high-dimensional clocks or for other types of oscillating molecular systems (the activity of some transcription factors, among which NF-κB, presents oscillations under some conditions).
Averaging can be applied rather generally [22–24] to produce coarse-grained quantities and reduced models. The typical mathematical result applying here is due to Pontryagin and Rodygin [48, 49]. Supposing that the oscillating species are x, the nonoscillating species are y, and ϵ is a small parameter, we have:
\epsilon\frac{dx}{dt}=f(x,y)
(30)
\frac{dy}{dt}=g(x,y)
(31)
It is supposed that for any y, the fast dynamics (30) has an attractive hyperbolic limit cycle x = ψ (τ, y), of period T(y): ψ (τ + T(y), y) = ψ(τ, y) (τ = t/ϵ). Then, after a short transient, the slow variable satisfies the averaged equation:
\frac{dy}{dt}=\frac{1}{T(y)}\int_0^{T(y)}g(\psi(\tau,y),y)\,d\tau
(32)
The result can be extended to the case when x = ψ(τ, y) describes damped oscillations with damping time much larger than ϵ, i.e. when the fast dynamics (30) has a stable focus and the eigenvalues of the Jacobian ∂f/∂x calculated at the focus are of the form −λ ± iμ, 0 < λ ≪ μ = \mathcal{O}(1/ϵ).
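A toy numerical sketch of the averaged equation: assuming, for illustration only, that the fast subsystem settles on the limit cycle ψ(τ) = sin τ (taken independent of y) and that g(x, y) = x² − y, averaging over one period yields dy/dt = 1/2 − y, with averaged steady state y* = 1/2.

```python
import math

# Toy slow equation dy/dt = g(x, y) = x**2 - y, driven by an assumed
# fast limit cycle psi(tau) = sin(tau) of period 2*pi.
def g(x, y):
    return x * x - y

def averaged_rhs(y, n=100000):
    """Average g over one period of the fast oscillation (cf. Eq. 32)."""
    T = 2.0 * math.pi
    total = 0.0
    for i in range(n):
        tau = T * i / n
        total += g(math.sin(tau), y)
    return total / n

# The mean of sin^2 over a period is 1/2, so the averaged equation is
# dy/dt = 1/2 - y; its steady state y* = 1/2 solves the averaged
# stationarity condition.
print(averaged_rhs(0.0))   # ~ 0.5
print(averaged_rhs(0.5))   # ~ 0.0
```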
The following averaged steady state equation allows us to eliminate the slow species y:
\int_0^{T(y)}g(\psi(\tau,y),y)\,d\tau=0
(33)
If (32) has a stable steady state, this situation is always reached. In this case, the slow nonoscillating variables y are constant in time and can be considered to be conserved, which has two significant consequences.
First, Eq.(33) restores conservation. Slow variables are often the result of broken conservation laws. In fact, in biological open systems, nothing is conserved. Conservation laws result from balancing production and degradation either passively (slow processes) or actively (feedback). Thus, we can ignore production and degradation of molecules whose level is rigorously controlled. Eq.(33) describes such a case.
Second, Eq.(33) provides averaged steady state equations for the slow variables. If the slow variables y reach stationarity, the only variables that change in time are the oscillating variables x. Eq.(30) then describes the dynamics of x, with y satisfying (33). For oscillators, averaging provides a way to eliminate slow nonoscillating variables, which is formally equivalent to quasistationarity and represents a new case of applicability of Clarke's method. The difference between the two cases is that we eliminate fast variables by solving quasistationarity equations and slow variables by solving averaged stationarity equations. In both cases, intermediate nonoscillating variables are expressed in terms of nonoscillating terminal variables only. If there are no nonoscillating terminal variables, then nonoscillating intermediates become conserved quantities.