### Definitions

We consider the standard steady-state problem of a metabolic network with *m* internal metabolites and *n* reactions, i.e. \mathit{S}\xb7\widehat{\mathit{v}}=\mathbf{0}. Here, *S* denotes the *m*×*n* stoichiometric matrix of the network, and \widehat{\mathit{v}} the *n*-dimensional flux vector through the network.

Let *ê* be an EM flux vector [14, 15] fulfilling the steady state condition, and \mathit{e}=\mathit{e}\left(\mathit{\xea}\right) its binary representation,

{e}_{i}:=e\left({\xea}_{i}\right)=\left\{\begin{array}{c}1\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}\mathrm{\text{if}}\phantom{\rule{1em}{0ex}}{\xea}_{i}\ne 0\\ 0\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}\mathrm{\text{if}}\phantom{\rule{1em}{0ex}}{\xea}_{i}=0\end{array}\right.\phantom{\rule{3.5em}{0ex}},\phantom{\rule{1em}{0ex}}i=\{1,\dots ,n\}.

(1)

*e*_{
i
} indicates whether reaction *i* is part of the EM *ê*. That is, *e*_{
i
}=1 if and only if a reaction is carrying flux either in forward or backward direction. Similar to equation (1) let *v* denote the binary representation of any valid flux distribution \left(\right)close="">\n \n \n \n v\n \n \u0302\n \n \n. Then the product

{\mathit{e}}^{\mathrm{T}}\mathit{v}\le {\mathit{e}}^{\mathrm{T}}\mathit{e}=\sum _{i=1}^{n}{e}_{i}^{2}=\sum _{i=1}^{n}{e}_{i}=:\left|\right|\mathit{e}\left|\right|,

(2)

indicates if *e* is part of *v* as the equality only holds when all “active” reactions in *e* are also carrying flux in *v*.

Finally, we group all *q* binary EM of *S* into three matrices

\begin{array}{l}\mathit{G}:={({\mathit{e}}_{1},\dots ,{\mathit{e}}_{r})}^{\mathrm{T}},\end{array}

(3a)

\begin{array}{l}\mathit{H}:={({\mathit{e}}_{r+1},\dots ,{\mathit{e}}_{r+s})}^{\mathrm{T}},\end{array}

(3b)

\begin{array}{l}\mathit{K}:={({\mathit{e}}_{r+s+1},\dots ,{\mathit{e}}_{r+s+t})}^{\mathrm{T}}.\end{array}

(3c)

where *q*=*r* + *s* + *t*, as all EM are in one of the three matrices. The “goal matrix”, *G*, contains all desirable EM, which define the minimal properties of the NMF and must therefore be kept. The “kill matrix”, *K*, consists of the unwanted EM, which must not be part of the final flux space and have to be deleted from the network. Finally, the helper matrix, *H* holds all remaining EM. These modes do not affect the primary design criterion, and therefore may or may not be present in the final design.

In the notation of Hädicke and Klamt [16], our kill matrix *K* is their set of target modes **T**. Our *G* is a subset of their set of desired modes **D**. We collect all other modes in *H*, while they split these EM between the sets of desired modes, **D**, and the sets of neutral modes. In their formulation Hädicke and Klamt [16] aim to keep at least *n* desired EM out of all modes in **D**. These “surviving” EM build our *G*. If, however, |**D**|=*n* then **D**=*G* and hence, both definitions are identical.

### Minimum number of deletions, Δ_{min}

By setting up a BLP problem, equation (2) may be used to predict the minimal set of knockouts to stop any given set of EM, i.e. the *K*-matrix, contributing to the steady state flux distribution

\begin{array}{l}\text{max}\phantom{\rule{1em}{0ex}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\left|\right|\mathit{x}\left|\right|\end{array}

(4a)

\begin{array}{l}\text{s.t.}\phantom{\rule{2.5em}{0ex}}\mathit{G}\mathit{x}=\mathit{\left|}\mathit{g}\mathit{\right|}\end{array}

(4b)

\begin{array}{l}\phantom{\rule{4em}{0ex}}\mathit{H}\mathit{x}\le \left|\right|\end{array}

(4c)

\begin{array}{l}\phantom{\rule{4em}{0ex}}\mathit{K}\mathit{x}\le \left|\mathit{k}\right|-\mathit{1}\end{array}

(4d)

\begin{array}{l}\phantom{\rule{4em}{0ex}}\mathit{x}={({x}_{1},\dots ,{x}_{n})}^{\mathrm{T}},\phantom{\rule{1em}{0ex}}{x}_{i}\in \{0,1\}\forall i,\end{array}

(4e)

We used **|** **g** **|**=(||*e*_{1}||,…,||*e*_{
r
}||)^{T}, **|** **h** **|**=(||*e*_{r + 1}||,…,||*e*_{r + s}||)^{T}, and **|** **k** **|**=(||*e*_{r + s + 1}||,…,||*e*_{r + s + t}||)^{T} to denote the vector of norms of each row of the matrix **G**,**H**, and **K**, respectively. **1**=(1,…,1) represents a vector of ones. The solution vector **x**, is the binary representation of all reactions participating in the designed NMF. Equation (4) is indeed a BLP problem as **x** is binary and \left(\right)close="">\n \n |\n |\n x\n |\n |\n =\n \n \n \u2211\n \n \n i\n =\n 1\n \n \n n\n \n \n \n \n x\n \n \n i\n \n \n \n is linear.

In equation (4) we used a matrix formulation, which is shorthand for the optimization problem in terms of all *q*=*r* + *s* + *t* binary EM vectors *e*_{
i
},

\begin{array}{ll}\text{max}\phantom{\rule{1em}{0ex}}& \left|\right|\mathit{x}\left|\right|\phantom{\rule{2em}{0ex}}\\ \text{s.t.}\phantom{\rule{1em}{0ex}}& {\mathit{e}}_{g}^{\mathrm{T}}\mathit{x}=\left|\right|{\mathit{e}}_{g}\left|\right|,\phantom{\rule{1em}{0ex}}g\in \{1,\dots ,r\}\phantom{\rule{2em}{0ex}}\\ {\mathit{e}}_{h}^{\mathrm{T}}\mathit{x}\le \left|\right|{\mathit{e}}_{h}\left|\right|,\phantom{\rule{1em}{0ex}}h\in \{r+1,\dots ,r+s\}\phantom{\rule{2em}{0ex}}\\ {\mathit{e}}_{k}^{\mathrm{T}}\mathit{x}\le \left|\right|{\mathit{e}}_{k}\left|\right|-1,\phantom{\rule{1em}{0ex}}\phantom{\rule{0.3em}{0ex}}k\in \phantom{\rule{0.3em}{0ex}}\{r+s+1,\dots ,r+s+t\}.\phantom{\rule{2em}{0ex}}\end{array}

Here we used indices *g*,*h*,*k* as a reminder that these EM vectors are the rows of the matrices *G*,*H*, and *K*, respectively. Note that each EM acts as a constraint for the optimization problem.

To understand equation (4b) requires that any solution includes all desired EM as – according to equation (2) – only then the product \left(\right)close="">\n \n \n \n e\n \n \n i\n \n \n T\n \n \n x\n \n is limited by the norm of *e*_{
i
}. Similar, equation (4d) demands that its solutions are at least one active reaction short, i.e. has more zeros than any EM in *K*. As already one single knockout in an EM kills it, these modes will not contribute to the desired design. Finally, constraint (4c) states that the EM of *H* may be included in the solution. In fact, the inequality (4c) does not constrain the system in any way. Equation (4c) is merely included for the sake of accounting completely for all EM in the network.

The minimal number of deletions can then be determined easily by counting the number of zeros in the calculated solution *x*,

{\mathrm{\Delta}}_{\text{min}}=n-\left|\right|\mathit{x}\left|\right|.

(5)

### Predicting all optimal sets of deletions

Equation (4) may either have no or a finite number of solutions. In the first case, no knockout strategy accommodates all constraints. However, if the constraints are relaxed, i.e. EM are shifted from *G* to either *H* or *K* [the limit being *G*=(*e*_{1})^{T},*H*=*0*, and *K*=(*e*_{2},…,*e*_{
q
})^{T}], it is always possible to find at least one solution.

Alternate optimal solutions may be found by successively excluding already existing solutions *x*^{(j)} of equation (4) by adding [17, 18],

\begin{array}{l}\sum _{i\in B}{x}_{i}-\sum _{i\in N}{x}_{i}\le \left|B\right|-1,\end{array}

(6a)

\begin{array}{l}B=\left\{i\right|{x}_{i}^{\left(j\right)}=1\},\phantom{\rule{1em}{0ex}}N=\{i|{x}_{i}^{\left(j\right)}=0\}.\end{array}

(6b)

Note that repeatedly applying equation (6) will not only generate all sets of different minimal knockouts but also enumerate all other solutions sorted by the number of deletions.

The final sequence contains all possible solutions. It also contains “inefficient” or non-minimal solutions. Consider a series of two reactions, *A*→*B*,*B*→*C*. To suppress the production of *C*, the knocking out of either reaction suffices. Knocking out both is admissible, although inefficient. To avoid calculating non-minimal solutions we split equation (6) into two constraints,

\begin{array}{l}\sum _{i\in B}{x}_{i}\le \left|B\right|-1,\end{array}

(7a)

\begin{array}{l}\sum _{i\in N}{x}_{i}\ge 1.\end{array}

(7b)

In matrix notation these constraints read

\phantom{\rule{2.5em}{0ex}}\begin{array}{ll}{\left[{\mathit{x}}^{\left(j\right)}\right]}^{\mathrm{T}}\mathit{x}& \le \left|\right|{\mathit{x}}^{\left(j\right)}\left|\right|-1\phantom{\rule{2em}{0ex}}\end{array}

(8a)

\begin{array}{ll}{[\mathit{1}-{\mathit{x}}^{\left(j\right)}]}^{\mathrm{T}}\mathit{x}& \ge 1.\phantom{\rule{2em}{0ex}}\end{array}

(8b)

The first excludes already existing solutions, *x*^{(j)}, the second ensures that all solutions will be minimal. In other words, no supersets of already determined solutions will be calculated.

It is possible to influence the succession of solutions by adding weights *w*_{
i
} to the objective function. Rather than maximizing ||*x*|| in equation (4a) we may use

\text{max}\phantom{\rule{0.3em}{0ex}}\left|\right|{\mathit{w}}^{\mathrm{T}}\mathit{x}\left|\right|,

(9)

with *w*^{T}=(*w*_{1},…,*w*_{
n
}). This allows to easily distinguish chemical from genetic interventions. If uptake reactions are assigned a small and all other reactions a large weight, our algorithm will favor deletions in the uptake reactions as they contribute little to the objective function. Deleting uptake reactions can simply be achieved by removing the substrate from the culture medium. We give guidelines for the choice of reaction weights in the example below.