Comparison on extreme pathways reveals nature of different biological processes

Background Constraint-based reconstruction and analysis (COBRA) is used for modeling genome-scale metabolic networks (MNs). In a COBRA model, extreme pathways (ExPas) are the edges of its conical solution space, which is formed by all viable steady-state flux distributions. ExPa analysis has been successfully applied to MNs to reveal their phenotypic capabilities and properties. Recently, the COBRA framework has been extended to transcriptional regulatory networks (TRNs) and transcriptional and translational networks (TTNs), so efforts are needed to determine whether ExPa analysis is also effective on these two types of networks. Results In this paper, the ExPas resulting from the COBRA models of the E.coli MN, TRN and TTN were horizontally compared from five aspects: (1) total number of ExPas and the ratio of this number to the number of reactions; (2) length distribution; (3) reaction participation; (4) correlated reaction sets (CoSets); (5) interconnectivity degree. Significant discrepancies in these properties were observed during the comparison, which reveal the biological natures of the different biological processes. In addition, by demonstrating the application of ExPa analysis on E.coli, we provide practical guidance on an improved approach to computing ExPas on COBRA models of TRNs. Conclusions ExPas of the E.coli MN, TRN and TTN have different properties, which are strongly connected with various biological natures of biochemical networks, such as topological structure, specificity and redundancy. Our study shows that ExPas are biologically meaningful on the newly developed models and suggests the effectiveness of ExPa analysis on them.


Background
Many large-scale biological networks, including metabolic networks (MNs) [1], signaling networks [2], transcriptional regulatory networks [3] and transcriptional and translational networks [4], have been reconstructed along with the development of high-throughput technology in the past decades. These networks are then transformed into mathematical models for further analysis. Constraint-Based Reconstruction and Analysis (COBRA) is one of the most commonly used frameworks introduced to model and analyze steady-state biochemical networks [5]. In the past two decades, it has been successfully applied to MNs to study various phenotypes [6][7][8][9]. Recently, the same principles were also extended to the other types of biochemical networks mentioned above [2][3][4][10].
All the possible phenotypes, i.e. the flux distributions of feasible steady states, of a constraint-based biochemical model form a high-dimensional cone. Network-based pathways such as Extreme Pathways (ExPas) [11] are defined to study this cone. ExPas are vectors of fluxes that lie on the edges of the cone [12]. They constitute the minimal and unique vector set which generates the space of all feasible steady states through non-negative linear combination. Since ExPas characterize the limits on the capabilities of a cell's metabolic system [13], ExPa analysis reveals systemic properties of metabolism [14]. ExPa analysis, as an approach to characterize the fundamental and time-invariant topological properties of a given network [15], has been successfully applied to MNs, such as those of human red blood cells [16], Escherichia coli [17][18][19], Saccharomyces cerevisiae [20,21], Helicobacter pylori [22,23], Haemophilus influenzae [11,24] and Methylobacterium extorquens [25]. In addition, network models describing a prototypic signaling system [10] and the JAK-STAT signaling system in the human B-cell [2] have also been studied through ExPa analysis.
Recently, two COBRA models of biochemical systems of different types have emerged: the E.coli transcriptional regulatory network (TRN) [3] and the E.coli transcriptional and translational network (TTN) [4]. What should be clarified is whether ExPa analysis is still useful for these new types of networks and whether the ExPas of a TRN or TTN show properties different from those of an MN. These questions are biologically significant because the answers determine whether we can rely on the existing analysis approaches to obtain novel and biologically meaningful findings in a brand new field. In this paper, we try to provide an answer by comparing properties of ExPas among the E.coli TRN, MN and TTN. In the comparison, differences between biological processes were observed from multiple perspectives, including network structure, reaction participation, specificity and redundancy. The results indicate that ExPa analysis can be extended to the biochemical systems of TRN and TTN, which helps researchers to further understand the corresponding biological systems. In addition, an improved method is introduced to simplify the calculation and interpretation of ExPas on TRN models [3], which could also be useful.

Results
Firstly, we calculated the extreme pathways of the three biological networks mentioned above. Since the number of ExPas grows exponentially with a network's complexity [15], the enumeration of ExPas on highly complex networks such as the E.coli MN and TTN is computationally intractable. Fortunately, ExPa calculation becomes much more manageable if an MN or a TTN is divided into smaller sub-networks. Therefore, we chose sub-networks with relatively complete and independent functions as the representatives of the biological systems they belong to. For the E.coli MN, two sub-networks were chosen: (1) Amino acid, Carbohydrate and Lipid metabolism (sACL) and (2) Membrane and Murein metabolism (sMM). For the E.coli TTN, the two sub-networks were: (1) transcription (sTC) and (2) translation (sTL).
Then ExPa analysis was performed on each network/sub-network and properties from different aspects were obtained, including the total number of ExPas, the number-based ratio of ExPas to reactions, the ExPa length distribution, the reaction participation distribution, correlated reaction sets (CoSets), and the inter-connectivity of ExPas. Finally, a horizontal comparison of these properties was made among the five networks/sub-networks. Moreover, some incompleteness and incorrectness in the E.coli TRN model, which were uncovered through ExPa analysis, are also reported in this section. These findings illustrate that ExPa analysis is capable of directing model refinement.

E.coli TRN model
The E.coli TRN model was published by Gianchandani et al. in 2009 [3]. It contains 147 environmental stimuli, 125 transcriptional factors and 503 downstream target genes, which are represented in a matrix $R^*$ [3]. The TRN model was improved to enhance the efficiency of ExPa calculation (details are provided in Materials and Methods). The final TRN model contains 1009 components, 1106 internal regulatory reactions, and 1009 exchange reactions, each corresponding to a component. All the extracellular metabolites were considered as inputs and all protein products were considered as outputs. There were 1599 ExPas, of which 9 were biologically infeasible because they employed conflicting input fluxes, and thus they were excluded from the ExPa set used in analysis.
In E.coli TRN, 16 reactions do not participate in any ExPa; namely they are never used to form a transcriptional state of the network. These unused reactions were categorized into two types as listed in Table 1 and Table 2 respectively.
Reactions in Table 1 all relate to NOT_BirA (absence of protein BirA). However, no regulatory rule corresponds to the presence or absence of BirA, and therefore the initial steps are unknown. As a result, the internal reactions using NOT_BirA (b0774_1, b0775_1, b0776_1 and b0778_1) and the corresponding exchange reactions (Ex_b0774, Ex_b0775, Ex_b0776 and Ex_b0778) will never be initiated. Furthermore, protein BirA and the gene products of b0774, b0775, b0776 and b0778 do not participate in any reaction other than those in Table 1, so their invalidation will not affect other reactions in the network. In short, these 9 reactions do not participate in any ExPa because their relevant reactions (either producing their substrates or consuming their products) are unavailable in the network. The unused reactions in Table 1 show the incompleteness of the E.coli TRN model and necessitate further refinement.
Table 1 Unused reactions in the E.coli TRN (Type I - Regulatory rules missing). Columns: Reactions; Regulatory rules; Reaction type. NOT_BirA represents the regulatory reaction that leads to the inhibition of gene transcription generating BirA. '-' in the second column represents the deficiency of corresponding regulatory rules in the model.
For the reactions in Table 2, the regulatory rule of b1814 can be divided by simple logical transformation into 6 rules, 3 of which are contradictory (the shaded parts in Table 2). Since there are still 3 operational regulatory rules relating to the transcription of b1814, its corresponding exchange reaction can be initiated. Similarly, the regulatory rules of b3942 and b4111 are both contradictory and cannot be used in any ExPa. These reactions may imply some incorrect information in the model. Therefore, new biological knowledge is needed to improve the E.coli TRN.

E.coli MN and TTN model
The E.coli TTN model [4] consists of 11991 components and 13694 reactions, which give rise to 423 functional gene products [4]. Given the critical inherent problem of combinatorial explosion during ExPa calculation, the E.coli MN and TTN were divided into small sub-networks depending on the reactions' functions [11]. Sub-networks were chosen as representatives of important biological processes.
The E.coli MN was divided into 6 discrete sub-networks with different functions: one for exchange reactions, which transfer metabolites in and out of the metabolic system, and the others for internal reactions. Each reaction was assigned to one of the six sub-networks, whose details are listed in Table 3. Two sub-networks, Amino acid, Carbohydrate and Lipid metabolism (sACL) and Membrane and Murein metabolism (sMM), lie in the central part of the E.coli MN and form the basis of other biological processes, and therefore they were chosen as the representatives of the E.coli MN for ExPa analysis.
The E.coli TTN model comprises 27 biological processes, and the details are provided in [4]. Each process was treated as a discrete sub-network. The two largest sub-networks, Transcription and Translation, were chosen for further ExPa analysis.

ExPa counting
The total numbers of ExPas and the number-based ratios of ExPas to reactions (P/R) are listed in Table 4. P/R describes the proportion between the number of ExPas and the number of reactions in a network. Table 4 shows that the P/Rs of sACL (33.44) and sMM (32.40) are much higher than those of TRN (0.75), sTC (0.12) and sTL (0.25). The low values are a consequence of the linear structures of TRNs and TTNs [3,4]. In contrast, MNs are more densely interconnected and contain a large number of alternative pathways, and thus their P/Rs are much higher. The redundancy of ExPas increases a metabolic system's flexibility and fitness under sudden environmental changes [23,27]. These results illustrate the fundamental differences in topological structure and redundancy among the three types of networks.
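As a small arithmetic check, the TRN value of P/R can be reproduced from the counts reported for the TRN model. The sketch below assumes that "reactions" means internal plus exchange reactions and that only the feasible ExPas are counted; both are assumptions made for illustration, and the function name `pr_ratio` is hypothetical.

```python
# Counts reported in the text for the E.coli TRN model.
N_EXPAS_TOTAL = 1599      # all ExPas computed
N_INFEASIBLE = 9          # ExPas with conflicting inputs, excluded
N_INTERNAL = 1106         # internal regulatory reactions
N_EXCHANGE = 1009         # exchange reactions

def pr_ratio(n_expas: int, n_reactions: int) -> float:
    """Number-based ratio of ExPas to reactions (P/R)."""
    return n_expas / n_reactions

# Assumption: feasible ExPas over internal + exchange reactions.
trn_pr = pr_ratio(N_EXPAS_TOTAL - N_INFEASIBLE, N_INTERNAL + N_EXCHANGE)
print(round(trn_pr, 2))  # 0.75
```

Under these assumptions the result agrees with the TRN entry quoted from Table 4.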

ExPa length
The length of an ExPa equals the number of reactions that participate in it [13]. Figure 1 shows the histograms of the ExPa length distribution for each network/sub-network above. The details are listed in Table 5.
The length distributions of ExPas corresponding to these biological processes are very diverse. The longest ExPas consist of 51, 82, 32 and 109 reactions in sACL, sMM, sTC and sTL, respectively, which are much longer than the longest one in TRN (21). Reactions in the E.coli TRN represent transcriptional regulatory rules rather than real biochemical reactions as in MN and TTN, and thus the ExPa length in TRN describes the number of regulatory rules used for expressing certain genes. A regulatory rule describes how environmental stimuli affect transcriptional factors, which in turn affect downstream target genes. Therefore, the ExPas in TRN are reasonably shorter, as the biological network has a relatively flat hierarchical structure [3]. Given the number of reactions, the ratio of average ExPa length to reaction number (L/R) was calculated for each biological network or sub-network (Table 5). The L/Rs of the two representatives in MN are higher than those in TRN and their counterparts in TTN. Since ExPas convert substrates into products, ExPa length relates to how many reaction steps are needed to carry out the corresponding function. ExPa length can be characterized as the size and complexity of the corresponding flux distribution map [13]. The results indicate that the flux distribution map in MN is much more complex than those in TRN and TTN.
Table 2 note: Crp, which appears in the second column, is a transcription factor and the others are extracellular metabolites. The contradictory regulatory rules are shaded in the table, and these reactions never occur. (Listed entry: Ex_b4111; regulatory rule '-'; reaction type Exchange.)
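The two length measures can be sketched in a few lines. The example assumes the ExPas are stored as rows of a matrix with one column per reaction, which is a layout chosen here for illustration; the function names and the toy matrix are hypothetical and do not come from the models studied.

```python
import numpy as np

def expa_lengths(P, tol=1e-9):
    """Length of each ExPa: number of reactions with non-zero flux (rows = ExPas)."""
    return (np.abs(P) > tol).sum(axis=1)

def length_to_reaction_ratio(P, tol=1e-9):
    """L/R: average ExPa length divided by the number of reactions (columns)."""
    return expa_lengths(P, tol).mean() / P.shape[1]

# Toy ExPa matrix: 3 ExPas over 4 reactions (values are illustrative only).
P = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 0.0, 0.0, 0.0]])

print(expa_lengths(P))              # [2 3 1]
print(length_to_reaction_ratio(P))  # 0.5
```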

Reaction participation
The reaction participation rate (RPR) is defined as the percentage of ExPas in which a given reaction participates [13]. Figure 2 shows the distribution of RPRs for each biological network/sub-network. Most reactions participate in less than 10% of ExPas, especially in TRN, sTC and sTL, but a few active reactions participate in many ExPas. Although the high-RPR reactions are mostly exchange reactions, some of them are internal reactions, which usually play a more important role in determining the phenotypic potentials of the five biological processes. Given this, RPR can reasonably be considered as a metric for evaluating the importance of a reaction in implementing the corresponding biological function [13].
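The RPR definition translates directly into a column-wise computation on an ExPa matrix. The following is a minimal sketch, assuming rows are ExPas and columns are reactions; the helper names, reaction labels and toy values are hypothetical.

```python
import numpy as np

def reaction_participation_rates(P, tol=1e-9):
    """RPR per reaction: percentage of ExPas (rows) in which the reaction is used."""
    participates = np.abs(P) > tol
    return 100.0 * participates.mean(axis=0)

def top_reactions(P, names, k=10):
    """Return the k reactions with the highest RPR, in descending order."""
    rpr = reaction_participation_rates(P)
    order = np.argsort(-rpr, kind="stable")[:k]
    return [(names[i], float(rpr[i])) for i in order]

# Toy ExPa matrix: 4 ExPas over 3 reactions (illustrative values only).
P = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 2.0]])
names = ["r1", "r2", "r3"]

print(reaction_participation_rates(P))  # [100.  25.  50.]
print(top_reactions(P, names, k=2))     # [('r1', 100.0), ('r3', 50.0)]
```

Only participation (zero versus non-zero flux) enters the rate, so the flux magnitude in the last row does not change the result.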
Here the top 10 internal reactions with the highest RPRs of each process are sorted in a descending order (Table 6). Several reactions of vital importance were found, and representatives were chosen for detailed study.
In TRN, the two most active reactions, CRP_noGLC_1 and Crp_1, relate to the regulation rules of the transcription factor (TCF) cAMP receptor protein (CRP). Other high-ranking reactions, Fis_1, Lrp_1, Fnr_1 and NOT_ArcA_1, relate to the regulation rules of the TCFs Fis, Lrp, Fnr and ArcA, respectively. In E.coli, the above TCFs belong to the seven global regulators that control most of the regulated genes [28]. The reaction NOT_Cra_1 is relevant to the regulation rules of the TCF Cra, a pleiotropic regulatory protein that controls carbon and energy fluxes in enteric bacteria [29,30]. The reaction NOT_PdhR_1 concerns the regulation rules of PdhR, a TCF that controls the respiratory electron transport system in E.coli. Its regulation target, the pyruvate dehydrogenase (PDH) multienzyme complex, plays a key role in the metabolic interconnection between glycolysis and the citric acid cycle [31].
In sACL, the most active reaction is ASPTA. It transfers oxoglutarate and aspartate to the corresponding ketoacid, which are indispensable in the glyoxylate cycle, an anabolic metabolic pathway occurring in E.coli [32]. The second one is ASAD, which is the second step in the biosynthesis of amino acids in prokaryotes, fungi, and some higher plants. ASAD forms an early branch point in the metabolic pathway producing lysine, methionine, threonine and isoleucine from aspartate, as well as diaminopimelate, which plays an essential role in bacterial cell wall formation [33]. Deletion of the gene asd (encoding ASAD) is lethal to the organism, as demonstrated by experiments with Legionella pneumophila, Salmonella typhimurium, and Streptococcus mutans, which indicates that ASAD may also be an essential reaction in the metabolism of E.coli [34]. Another active reaction is ASPK, which is the commitment step in the pathway to the synthesis of lysine, methionine, threonine and isoleucine. In sMM, the reaction ACCOAC is the most active. It is a rate-determining step in the fatty acid synthetic pathway and may play a pivotal role in regulating fatty acid oxidation [35]. The second most active reaction, MCOATA, transfers malonyl-CoA to acyl-carrier proteins (ACPs). The product, malonyl-ACP, provides malonyl groups for the biosynthesis of fatty acids and polyketides. On the other hand, malonyl-CoA, the substrate of MCOATA, is a highly regulated molecule in fatty acid synthesis, as it inhibits the rate-limiting step in beta-oxidation of fatty acids [36]. Flux change in MCOATA affects the consistency of malonyl-CoA and guarantees the biosynthesis of fatty acids.
In sTC, all the top reactions relate to the formation of the transcription elongation complex, an extremely complicated and highly regulated molecular machine that can sense signals coming from numerous regulatory protein factors, as well as those encoded in the DNA sequence. They are the basis of transcription elongation, because transcription can run smoothly and continuously only depending on their precise work.
In sTL, the reactions IF2_RECHARG, Rib_30_ini_FORM and Rib_70_DISS are used by all ExPas. IF2_RECHARG recharges the initiation factor 2 (IF2) with GTP, and Rib_30_ini_FORM produces the 30S translation initiation complex, which consists of the 30S subunit, IF1, IF2-GTP and IF3. In bacteria, the correct mRNA starting site and the reading frame are selected when, with the help of IF1, IF2 and IF3, the initiation codon is decoded in the peptidyl site of the 30S ribosomal subunit by the anticodon of fMet-tRNAfMet. Furthermore, Rib_30_ini_FORM has also been shown to be the intermediate step in the formation of the 70S initiation complex (70SIC), which regulates translation initiation, the rate-limiting step in protein synthesis [37]. The other reaction, Rib_70_DISS, dissociates 70S ribosomes into the 30S ribosomal subunit/IF1/IF3 complex (rib_30_IF1_IF3) and the 50S ribosomal subunit (rib_50_inact). This is an essential step before a ribosome can participate in a new round of translation, since the initiation complex for protein synthesis involves a 30S subunit. The dissociation of 70S ribosomes contributes to the efficiency and sustainability of protein synthesis [38].
Reportedly, RPRs help to find important reactions in MN [13]. Our results further indicate that RPR can also be extended to TRN and TTN to evaluate the relative importance of a given reaction.

Correlated reaction set
A correlated reaction set (CoSet) comprises reactions that always participate in the same ExPa set in a given network [13]; namely, if one reaction functions, the others in the same CoSet function simultaneously. A CoSet can be transformed into a graph by treating each reaction as a node and adding an edge between two reactions that involve a common substance. In a given CoSet, some member reactions are topologically connected while others are not. The correlation of the second type of reactions often indicates a transcriptional coregulation of the corresponding genes [11], while that of the first type has relatively trivial biological meaning. Therefore, a CoSet is defined as a trivial set if all its member reactions are connected in topology. A trivial CoSet provides less novel information, and thus it does not warrant deeper study. In this paper, the adjacent ratio is used to represent the percentage of trivial CoSets.
CoSets were calculated for each biological network/sub-network, and several features, including the adjacent ratio, were extracted and are shown in Table 7. The adjacent ratios of TRN, sTC and sTL are much higher than those of sACL and sMM, which indicates that almost all the CoSets obtained in the former three networks are due to the linear structure. For the metabolic network, more CoSets consist of reactions which are not adjacent in topology. The results suggest that CoSet analysis may be more useful in the study of MNs.
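The CoSet and adjacent-ratio definitions can be sketched as follows. This is a minimal illustration, not the procedure used to produce Table 7: it assumes ExPas are rows of a matrix `P`, that `S` is a components-by-reactions matrix, and that a CoSet needs at least two members. The function names are hypothetical and the toy matrices are arbitrary values chosen only to exercise the grouping logic.

```python
import numpy as np
from collections import defaultdict

def correlated_reaction_sets(P, tol=1e-9):
    """Group reactions (columns of P) that are used by exactly the same ExPas."""
    participates = np.abs(P) > tol
    groups = defaultdict(list)
    for j in range(participates.shape[1]):
        pattern = participates[:, j]
        if pattern.any():  # reactions used by no ExPa are skipped
            groups[pattern.tobytes()].append(j)
    return [members for members in groups.values() if len(members) > 1]

def is_trivial(coset, S, tol=1e-9):
    """True if the member reactions form one connected graph via shared components."""
    uses = np.abs(S) > tol  # components x reactions
    seen, stack = {coset[0]}, [coset[0]]
    while stack:
        r = stack.pop()
        for q in coset:
            if q not in seen and (uses[:, r] & uses[:, q]).any():
                seen.add(q)
                stack.append(q)
    return len(seen) == len(coset)

def adjacent_ratio(cosets, S):
    """Fraction of CoSets that are trivial (all members topologically connected)."""
    if not cosets:
        return 0.0
    return sum(is_trivial(c, S) for c in cosets) / len(cosets)

# Toy data: reactions 0 and 1 share a component; reactions 2 and 3 do not.
S = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
P = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])

cosets = correlated_reaction_sets(P)
print(cosets)                     # [[0, 1], [2, 3]]
print(adjacent_ratio(cosets, S))  # 0.5
```

In this toy case the second CoSet is the non-trivial one, i.e. the kind that may hint at coregulation rather than plain adjacency.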

Crosstalk analysis
Crosstalk analysis was first introduced to illustrate the relationship between multiple inputs or outputs of a signaling pathway [39]. The whole ExPa set was compared pairwise to build the simplest form of crosstalk [2,10]. A pair of ExPas may have identical, overlapped or disjoint inputs (or outputs). There are 9 categories of crosstalk, whose biological meanings are described in [10]. Here, crosstalk analysis is applied to other biological processes to detect the relationships between fundamental functional states. Various forms of crosstalk in the five networks/sub-networks above were characterized. As several exchange reactions participate in most ExPas of sACL, sMM, sTC and sTL, almost all of the ExPa pairs have overlapped inputs or outputs. A close look at the highly participating exchange reactions reveals that most of them relate to small molecules such as H2O, ATP and NADP, which are commonly seen in various biochemical reactions. In order to further elucidate the difference in crosstalk between ExPa pairs, all the exchange reactions in the four sub-networks were sorted in descending order of RPR and the top 20% were neglected in the subsequent crosstalk analysis. As shown in Figure 3, more than 90% of the ExPa pairs have disjoint inputs and disjoint outputs in TRN, sTC and sTL, in contrast to sACL and sMM. A higher disjoint input/disjoint output rate implies that each ExPa has more specific functions and cannot be replaced easily by others. This indicates that the biological processes in the E.coli TTN and TRN are more deterministic than those in MN. Reportedly, a large number of genes are regulated by only a few independent regulatory rules in the E.coli TRN [3], and the majority of the associated functions in the E.coli TTN have only one coding gene in the genome [4]. These facts indicate that the specificity of TRN and TTN is much higher than that of MN.
In order to function normally, cells have to respond accurately to environmental signals with the help of precise transcriptional regulation and subsequently produce the necessary gene products through accurate transcription and translation systems. Except for sTC, the networks/sub-networks all have ExPa pairs with identical inputs and identical outputs. These ExPas are redundant pathways which fulfill completely identical functions through systemically independent routes. ExPa redundancy has been demonstrated in genome-scale MNs [23,24], as well as in a prototypic signaling network [10] and the JAK-STAT signaling network [2]. The redundant ExPas in the E.coli TRN can be attributed to the fact that the transcription of some genes can be stimulated by different transcriptional factors. For example, the two redundant ExPas shown in Figure 4 stimulate the expression of gene b2243 in the same environment, but they employ the regulatory rules 'CRP_noRIB AND Fnr AND NOT(GlpR)' and 'CRP_noRIB AND ArcA AND NOT(GlpR)', respectively. From Figure 3, the percentage of ExPa pairs with overlapped inputs and overlapped outputs in the biological processes of MN is much higher than those in TTN and TRN. These results indicate that the E.coli MN is more flexible than the TTN and TRN.
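The pairwise classification into the 9 crosstalk categories (3 input relations times 3 output relations) can be sketched as below. The representation of an ExPa as a dictionary of input and output sets, the function names, and the labels A, B, C, X, Y are all hypothetical; treating a subset relation as "overlapped" is also an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations

def set_relation(a, b):
    """Classify two input (or output) sets as identical, overlapped or disjoint."""
    a, b = set(a), set(b)
    if a == b:
        return "identical"
    if a & b:
        return "overlapped"
    return "disjoint"

def crosstalk_category(expa1, expa2):
    """Return the (input relation, output relation) pair for two ExPas."""
    return (set_relation(expa1["inputs"], expa2["inputs"]),
            set_relation(expa1["outputs"], expa2["outputs"]))

def crosstalk_counts(expas):
    """Count crosstalk categories over all unordered ExPa pairs."""
    return Counter(crosstalk_category(p, q) for p, q in combinations(expas, 2))

# Toy ExPas: p1 and p2 are redundant (same inputs, same outputs); p3 is unrelated.
p1 = {"inputs": {"A", "B"}, "outputs": {"X"}}
p2 = {"inputs": {"A", "B"}, "outputs": {"X"}}
p3 = {"inputs": {"C"}, "outputs": {"Y"}}

counts = crosstalk_counts([p1, p2, p3])
print(counts[("identical", "identical")])  # 1
print(counts[("disjoint", "disjoint")])    # 2
```

Dividing each count by the total number of pairs gives the kind of percentages discussed for Figure 3.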

Discussion
ExPa analysis was applied to two new models, the E.coli TRN and TTN. A horizontal comparison was performed for the five networks/sub-networks (TRN, sACL, sMM, sTC and sTL) from five aspects: (1) total number of ExPas and the P/R ratios; (2) ExPa length distribution and L/R ratios; (3) reaction participation rates; (4) correlated reaction sets and adjacent ratios; (5) inter-connectivity of ExPas.
Reactions in TTN represent actual biochemical reactions like those in MN, and thus ExPas in TTN characterize the steady states of the corresponding biological systems. In contrast, columns in TRN represent the transcriptional regulatory rules, and the coefficients only reflect qualitative information describing the presence or absence of the corresponding components rather than the quantitative information describing reaction stoichiometries as in TTN and MN. Therefore, an ExPa in TRN characterizes a specific transcriptional regulatory state, namely which transcriptional regulatory rules are activated and which genes are expressed in a specific environmental state.
... determinacy. Comparisons from these aspects indicate that MN is more flexible but less deterministic than TRN and TTN. Environmental cues affect transcriptional regulation, which controls the subsequent transcription and translation processes. Then the resulting gene products (enzymes) enter the metabolic system to catalyze the corresponding reactions. It is necessary for a cell to respond accurately to the environment and produce the required enzymes. MN is more robust to environmental changes, which reflects the struggle of a cell to achieve an alternative steady state to provide substance support for TRN and TTN and maintain life.
The distributions of reaction participation in the five networks/sub-networks are similar, except that more reactions participate in more than 10% of ExPas in sACL and sMM. Only a small percentage of the reactions participate in a large number of ExPas, which indicates that the phenotypic potentials of TRN, TTN and MN are affected greatly by a small number of important reactions. Evaluations of the representatives show that reactions with high participation rates often play an important role in certain biological processes. These reactions are the relatively weak parts of the networks, because a large number of ExPas will be destroyed when these reactions become invalid, which may cause the loss of various functions. These reactions may be used as drug targets and further direct the design of new drugs.
CoSets were identified via the calculation of reaction participation. Besides the expected topological connections, the topologically unconnected reactions in a CoSet may indicate transcriptional coregulation in MN. However, most CoSets of TRN and TTN are trivial, and thus are unlikely to provide the novel clues that CoSets provide in MN.
Last but not least, an improved approach was introduced to calculate the ExPas of TRN models. Compared to the existing method, the biggest advantage of ours is its high efficiency in calculating all the extreme pathways of a TRN, especially one that may work under a huge number of environmental conditions. For example, the E.coli TRN model studied in this paper has 776 components whose availability (i.e., presence or absence) constitutes the environmental condition, including environmental stimuli, transcription factors and proteins. It is impossible to enumerate all the possible conditions due to combinatorial explosion, let alone to calculate the ExPas under each condition. However, using the approach we proposed, it took only about 45 seconds to compute the whole ExPa set on a PC with four 3.2-GHz Intel(R) XEON processors and 16GB RAM (in fact, only one processor and 15MB RAM were used for the calculation).
We believe that this approach could be helpful for readers who are also interested in the ExPas of TRNs.

Conclusions
This study presents the first horizontal comparison among the E.coli TRN, MN and TTN through ExPa analysis. The results show that ExPas also have biological meaning in TRN and TTN. Different properties of ExPas reflect the biological nature of each biological process. Along with the increase of reconstructed models of TRNs and TTNs as well as the development of new methods, ExPa analysis may reveal more biological properties and find wider application in the medical and biochemical fields.

COBRA framework and ExPa analysis
The COBRA framework stoichiometrically represents a biochemical network as a matrix $S$, whose rows and columns correspond to components and reactions respectively. COBRA is capable of predicting and understanding the achievable cellular function, namely the phenotypic behavior of a biochemical network. Under the steady-state hypothesis and certain constraints, all possible flux distributions lie in the null space of $S$:
$$S \cdot v = 0$$
where $S_{m \times n}$ is the stoichiometric matrix of a biochemical network with $m$ components and $n$ reactions and $v_{n \times 1}$ is a vector of the fluxes through each reaction in the system [40].
Given the reversibility of reactions, an internal reversible reaction can be divided into a forward and a backward sub-reaction, each taking a non-negative flux. The model's solution space is then a convex polyhedral cone in high-dimensional space [19,40], which can be demarcated by an ExPa set $p_i$ ($i = 1, \cdots, k$) [11,41]. All steady states lie in the cone and each can be represented by a non-negative linear combination of ExPas:
$$v = \sum_{i=1}^{k} \alpha_i p_i, \quad \alpha_i \geq 0$$
For a given network, the ExPa set has the following properties: (1) it is unique; (2) each ExPa uses the fewest reactions to form a functional unit; (3) it is systemically independent, which means that an ExPa cannot be represented by a non-negative linear combination of other ExPas [42,43].
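The steady-state condition and the non-negative combination of ExPas can be checked numerically on a small example. The network below (uptake of A, conversion of A to B or C, secretion of B and C) is a hypothetical toy, its two pathways were derived by hand, and the weights (called `alpha` here, a name chosen for illustration) are arbitrary.

```python
import numpy as np

# Toy stoichiometric matrix: rows = components (A, B, C), columns = reactions
# v1: -> A, v2: A -> B, v3: A -> C, v4: B ->, v5: C ->
S = np.array([[1.0, -1.0, -1.0,  0.0,  0.0],
              [0.0,  1.0,  0.0, -1.0,  0.0],
              [0.0,  0.0,  1.0,  0.0, -1.0]])

# Hand-derived pathways for this toy network (one through B, one through C).
p1 = np.array([1.0, 1.0, 0.0, 1.0, 0.0])
p2 = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
P = np.vstack([p1, p2])

# Each pathway is itself a steady state with non-negative fluxes.
assert np.allclose(S @ P.T, 0.0) and (P >= 0).all()

# A non-negative combination of the pathways is again a steady state.
alpha = np.array([2.0, 3.0])
v = alpha @ P                      # v = 2*p1 + 3*p2
print(v)                           # [5. 2. 3. 2. 3.]
print(np.allclose(S @ v, 0.0))     # True

# The weights can be recovered because the two pathways are independent.
recovered, *_ = np.linalg.lstsq(P.T, v, rcond=None)
print(np.allclose(recovered, alpha))  # True
```

Plain least squares suffices here only because the toy has two independent pathways; for a general network a non-negativity-constrained solver would be needed.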

ExPa calculation on the MN and TTN
ExPas were calculated using an open source tool 'expa' [44]. The E.coli MN and TTN models were divided into small sub-networks using the method proposed in [11].

An improved approach to compute the ExPas of TRN models
A TRN is composed of a set of transcriptional regulatory rules which describe cells' transcriptional responses to environmental signals. A regulatory network matrix $R$ was used by Gianchandani et al. to represent the components (environmental cues, metabolites, genes and proteins) and reactions (regulatory rules and exchange reactions of products) of a TRN [3]. It was further combined with an environmental matrix $E$, which characterizes a particular environmental state, yielding a complete regulatory state matrix $R^* = [R|E]$. Each column of $E$ delineates the availability of a unique environmental cue, transcription factor, target gene or protein [3,45]. Different environmental states correspond to different matrices $E$, thus forming different matrices $R^*$.
For example, consider a toy TRN with three regulatory rules: $A + B \rightarrow \text{Protein 1}$; $C \rightarrow \text{Protein 2}$; $D \rightarrow \text{Protein 2}$; where A, B, C and D are four metabolites acting as signalling stimuli.
The corresponding converses are: $\bar{A} \rightarrow \overline{\text{Protein 1}}$; $\bar{B} \rightarrow \overline{\text{Protein 1}}$; $\bar{C} + \bar{D} \rightarrow \overline{\text{Protein 2}}$; where an overbar denotes absence. The matrix $R^*$ is illustrated in Figure 5A under the environmental condition that A and D are present while B and C are absent. The shaded columns represent the inputs of environmental cues. Any steady state of the TRN under the given environmental cues lies in the space which satisfies $R^* v = 0$ and $\forall i,\ v_i \geq 0$. The convex basis of the right null space of $R^*$ forms the ExPa set under the given environmental state.
In order to calculate all the ExPas of the TRN, all the environmental states, namely all possible Es, need be enumerated. Then ExPas participating in each possible environmental state are generated and the unique ones are grouped to form the complete ExPa set. Since the number of possible environmental states grows exponentially with the number of extracellular metabolites, it is inefficient to enumerate all possible environmental states for a TRN with numerous envionmental cues [45]. Therefore, an improved method is introduced here to simplify the ExPa calculation on the COBRA model of TRN.
The gist of the method is to improve Gianchandani's method by employing two columns instead of one to delineate the presence and absence of a unique envionment cue respectively, by which a new environment matrix E new is constructed. The matrix E new covers all possible environmental states. Without loss of generality, we assume that the top n rows in R and E new represents the present state of n environmental inputs m e (e = 1, · · · , n) one-to-one and the following n rows represents the absent state of them. The original regulatory state matrix is R * = [R|E] = [r 1 , r 2 , · · · , r k |r k+1 , r k+2 , · · · , r k+n ] and the new matrix is R * new = [R|Enew] = [r1, r2 · · · rk|rk+1, rk+2, · · · , rk+n|rk+n+1, rk+n+2, · · · , rk+2n] (k is the number of columns in R, and E new = [r k+1 , r k+2 , · · · , r k+n |r k+n+1 , r k+n+2 , · · · , r k+2n ]). For an input m e, column r k+e represents its presence and column r k+n+e represents its absence under the environmental condition, where r k+e (e) and r k+n+e (n + e) equal to 1 and the other elements are all zeros. For example, the R * new matrix of the above toy model is illustrated in Figure 5B. The shaded columns constitute E new . Obviously, the space and time complexity for constructing E new is O(n), where n is the number of components of a TTN model. The convex basis of the right null space of R * new comprises the ExPa set of the TRN which could then be enumerated by the tool 'expa' [44].
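The column layout described for the new environment matrix can be sketched directly from its definition: the presence column of input e has a single 1 in row e, and the absence column has a single 1 in row n + e. The code below is an illustration under that reading only. The function names are hypothetical, and the placeholder `R` is filled with zeros because the actual entries of the toy model's R (Figure 5) are not reproduced here.

```python
import numpy as np

def build_e_new(n_inputs, n_rows):
    """Environment matrix with one 'present' and one 'absent' column per input.

    Assumes rows 0..n-1 hold the present states and rows n..2n-1 the absent
    states of the n inputs, as stated in the text (0-based indices here).
    """
    if n_rows < 2 * n_inputs:
        raise ValueError("need at least 2 * n_inputs rows")
    e_new = np.zeros((n_rows, 2 * n_inputs))
    e_new[:2 * n_inputs, :] = np.eye(2 * n_inputs)
    return e_new

def build_r_star_new(R, n_inputs):
    """Concatenate R with the new environment matrix: R*_new = [R | E_new]."""
    return np.hstack([R, build_e_new(n_inputs, R.shape[0])])

# Toy dimensions: 4 inputs (A, B, C, D), 10 component rows, k = 3 columns in R.
n, k = 4, 3
R = np.zeros((10, k))  # placeholder only; real entries come from the regulatory rules
r_star_new = build_r_star_new(R, n)

print(r_star_new.shape)                # (10, 11)
print(np.count_nonzero(r_star_new))    # 8, i.e. one entry per added column
```

Since each added column carries exactly one non-zero entry, the construction cost grows linearly with the number of inputs when stored sparsely, in line with the complexity stated in the text.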
Notably, some infeasible steady states employing contradictory inputs may be included in the right null space of $R^*_{new}$. For example, Figure 6A shows an infeasible steady state of the TRN described in Figure 5B. The two shaded elements of $v$ both equal 1. This means that metabolite A is both present and absent in the environment, which is impossible. If an ExPa proves to be an infeasible steady state, it should be removed from the ExPa set. Figures 6B and 6C show two ExPas resulting from the matrices in Figures 5A and 5B, respectively. The two vectors represent the same steady state of the TRN, in which gene G1 is inhibited because of the lack of metabolite B. In Figure 6B, the exact meaning of "1" in element $B_{AV}$ cannot be decided directly from the ExPa without referring to the shaded part of the matrix in Figure 5A. However, in Figure 6C, "1" in column $B_{AB}$ clearly means the absence of metabolite B. That is, the interpretation of an ExPa resulting from the improved method is independent of the environmental matrix, which makes the ExPa easier to understand.
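The removal of infeasible ExPas can be sketched as a simple filter. This is an illustrative sketch, not the authors' implementation: the function names and the tolerance `tol` are assumptions, and an ExPa is taken to be a flat vector of length $k + 2n$ in which entries $k+e$ and $k+n+e$ are the presence and absence inputs of cue $m_e$.

```python
def is_feasible(v, k, n, tol=1e-9):
    """True if no cue is used as both present and absent in vector v."""
    for e in range(n):
        present = abs(v[k + e]) > tol
        absent = abs(v[k + n + e]) > tol
        if present and absent:
            return False
    return True


def feasible_expas(expas, k, n):
    """Keep only the ExPas that do not use contradictory inputs."""
    return [v for v in expas if is_feasible(v, k, n)]


# Hypothetical vectors with k = 2 and n = 2 (inputs A and B).
contradictory = [1, 0, 1, 0, 1, 0]  # A present and A absent together
consistent = [1, 0, 0, 1, 1, 0]     # B present, A absent
kept = feasible_expas([contradictory, consistent], k=2, n=2)
```

The check is equivalent to requiring $v_{k+e} \times v_{k+n+e} = 0$ for every $e$.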

Validation of the approach of ExPa calculation on TRNs
Given $n$ environmental cues, there are $2^n$ possible environmental states, each corresponding to a matrix $E_i$ and the corresponding $R^*_i$ ($R^*_i = [R|E_i]$, $i = 1, \cdots, 2^n$). The ExPa set obtained from $R^*_i$ is denoted as $P_i$ and the feasible ExPa set calculated from $R^*_{new}$ is denoted as $P_{new}$. Since the meaning of the environmental part of $P_i$ depends on the environmental state, ExPas of different environmental states should be normalized to eliminate this dependence before being grouped. We normalized each ExPa $p$ in $P_i$ accordingly. In a normalized ExPa $\hat{P} = [v_1, v_2, \cdots, v_{k+2n}]$, "1" on $v_{k+e}$ ($e = 1, \cdots, n$) indicates that $m_e$ is present on the ExPa, while "1" on $v_{k+n+e}$ ($e = 1, \cdots, n$) indicates that $m_e$ is absent, and 0 indicates that $m_e$ does not affect the transcriptional states characterized by this ExPa. The normalized ExPa set of $P_i$ is denoted as $\hat{P}_i$ and the union of $\hat{P}_i$ ($i = 1, \cdots, 2^n$) is denoted as $\hat{P}$. As explained above, the ExPas in set $P_{new}$ are already in the normalized form, hence no normalization is needed.
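One possible reading of the normalization step is sketched below. The exact procedure is not spelled out in the text, so the mapping here is an assumption: the environmental entry of cue $m_e$ in an ExPa of $R^*_i$ is moved to slot $k+e$ if $m_e$ is present in state $i$ and to slot $k+n+e$ if it is absent. All function and variable names are hypothetical.

```python
def normalize_expa(p, k, present):
    """Map an ExPa of length k+n onto the normalized length k+2n.

    p       -- ExPa computed from R*_i = [R | E_i]
    present -- list of n booleans, True if cue e is present in state i
    Assumption: the entry for cue e goes to the presence slot k+e or
    the absence slot k+n+e, depending on the environmental state.
    """
    n = len(present)
    if len(p) != k + n:
        raise ValueError("expected an ExPa of length k + n")
    out = [0] * (k + 2 * n)
    out[:k] = p[:k]
    for e, is_present in enumerate(present):
        slot = k + e if is_present else k + n + e
        out[slot] = p[k + e]
    return out


def normalized_union(expa_sets, k):
    """Union of normalized ExPas over (present, expas) pairs."""
    unique = set()
    for present, expas in expa_sets:
        for p in expas:
            unique.add(tuple(normalize_expa(p, k, present)))
    return unique


# Hypothetical example with k = 2 and n = 2.
example = normalize_expa([1, 0, 0, 1], k=2, present=[True, False])
```

ExPas that do not use any environmental input normalize to the same vector in every state, which is why only the unique vectors are kept in the union.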
Here we prove that $\hat{P}_{new}$ equals $\hat{P}$.
Statement 1: each ExPa in $\hat{P}$ can be obtained from $R^*_{new}$.
Proof: given $n$ extracellular metabolites $m_p$ ($p = 1, \cdots, n$), each $R^*_i$ can be transformed to $\hat{R}^*_i$ as follows (Algorithm 2):
    Data: $R^*_i$, $A_i$, $\bar{A}_i$.
    // $R^*_i$ represents the TRN in the $i$th environment, where $i = 1, \cdots, 2^n$;
    // $R^*_i = [r_1, r_2, \cdots, r_k | r_{k+1}, r_{k+2}, \cdots, r_{k+n}]$;
    // $A_i$ is the set of all absent inputs;
    // $\bar{A}_i$ is the set of all present inputs;
    Result: $\hat{R}^*_i$.
    // $\hat{R}^*_i = [r_1, r_2, \cdots, r_k | r_{k+1}, r_{k+2}, \cdots, r_{k+n}, r_{k+n+1}, r_{k+n+2}, \cdots, r_{k+2n}]$;
    For q = 1 to 2n do
        $r_{k+q} = 0$;
    End for
    For q = 1 to n do
        If $q \in A_i$ do
            $r_{k+n+q}(n+q) = 1$;
        Else if $q \in \bar{A}_i$
            $r_{k+q}(q) = 1$;
        End if
    End for
Algorithm 2: Procedure of transforming $R^*_i$ to $\hat{R}^*_i$.
For $\hat{R}^*_i = [r_1, r_2, \cdots, r_k | r_{k+1}, r_{k+2}, \cdots, r_{k+n}, r_{k+n+1}, r_{k+n+2}, \cdots, r_{k+2n}]$ ($i = 1, \cdots, 2^n$) resulting from Algorithm 2, if there exists $j \in \{k+1, \cdots, k+2n\}$ such that $r_j = 0$, then a constraint $v_j = 0$ is added. The resulting network is then a sub-network of that represented by $R^*_{new}$. As proven in [46], if $G$ and $G'$ are two MNs whose reactions are all irreversible and whose ExPa sets are $EP$ and $EP'$, respectively, and $G'$ is a sub-network of $G$, then $EP' \subseteq EP$. Therefore $\hat{P}_i \subseteq \hat{P}_{new}$ and, because $\hat{P} = \bigcup_{i=1}^{2^n} \hat{P}_i$, $\hat{P} \subseteq \hat{P}_{new}$.
Statement 2: each feasible ExPa in $\hat{P}_{new}$ can be obtained from some $\hat{R}^*_i$.
Proof: since no environmental cue can be both present and absent in a specific environment, $v_{k+e} \times v_{k+n+e} = 0$ ($e = 1, \cdots, n$) holds for each ExPa in $P_{new}$. For any ExPa $p \in \hat{P}_{new}$, let $T = R^*_{new}$. For each $e$, $T$ is modified as follows: (1) if $v_{k+e} = 0$ and $v_{k+n+e} \neq 0$, then $t_{k+e} = 0$; (2) if $v_{k+n+e} = 0$ and $v_{k+e} \neq 0$, then $t_{k+n+e} = 0$; (3) if $v_{k+e} = 0$ and $v_{k+n+e} = 0$, then $t_{k+e} = 0$, where $t_i$ is the $i$th column of $T$. As can be shown easily, $p$ is an ExPa of the right null space of $T$.
According to Algorithm 2, a legal $\hat{R}^*_i$ contains one zero column and one non-zero column corresponding to the two input reactions of each input component, respectively. Therefore, $T$ is a legal $\hat{R}^*_i$, and each ExPa in $\hat{P}_{new}$ can be obtained from some $\hat{R}^*_i$, or in other words, $\hat{P}_{new} \subseteq \hat{P}$. From statements (1) and (2), we conclude that $P_{new} = \hat{P}$, and thus all possible ExPas of a TRN can be obtained using our new representation.
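The two constructions used in the proof can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: `r_hat_star` follows Algorithm 2 for one environmental state, and `restrict_to_expa` builds the matrix $T$ of Statement 2 by zeroing, for each cue, the input column that the ExPa does not use (the presence column when neither is used). Function names, the tolerance `tol` and the example values are hypothetical.

```python
def r_hat_star(R, present):
    """Algorithm 2 sketch: [R | 2n input columns] for one state.

    present -- list of n booleans; cue q is in the present set if True,
               otherwise in the absent set.
    """
    n = len(present)
    if len(R) < 2 * n:
        raise ValueError("R needs at least 2n rows")
    out = []
    for i, row in enumerate(R):
        env = [0] * (2 * n)
        if i < n and present[i]:
            env[i] = 1  # presence column k+q has its 1 in row q
        elif n <= i < 2 * n and not present[i - n]:
            env[i] = 1  # absence column k+n+q has its 1 in row n+q
        out.append(list(row) + env)
    return out


def restrict_to_expa(r_star_new, v, k, n, tol=1e-9):
    """Statement 2 sketch: zero one input column per cue to obtain T."""
    T = [list(row) for row in r_star_new]
    for e in range(n):
        uses_present = abs(v[k + e]) > tol
        uses_absent = abs(v[k + n + e]) > tol
        if uses_present and uses_absent:
            raise ValueError("infeasible ExPa: contradictory inputs")
        # Keep the column the ExPa uses; zero the other one.
        zero_col = k + n + e if uses_present else k + e
        for row in T:
            row[zero_col] = 0
    return T
```

For a feasible ExPa, the matrix returned by `restrict_to_expa` coincides with the output of `r_hat_star` for one environmental state, which mirrors the claim that $T$ is a legal $\hat{R}^*_i$.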