Intervention in gene regulatory networks via greedy control policies based on longrun behavior
 Xiaoning Qian^{1, 2}Email author,
 Ivan Ivanov^{3},
 Noushin Ghaffari^{1} and
 Edward R Dougherty^{1, 4, 5}
DOI: 10.1186/17520509361
© Qian et al; licensee BioMed Central Ltd. 2009
Received: 16 January 2009
Accepted: 15 June 2009
Published: 15 June 2009
Abstract
Background
A salient purpose for studying gene regulatory networks is to derive intervention strategies, the goals being to identify potential drug targets and design genebased therapeutic intervention. Optimal stochastic control based on the transition probability matrix of the underlying Markov chain has been studied extensively for probabilistic Boolean networks. Optimization is based on minimization of a cost function and a key goal of control is to reduce the steadystate probability mass of undesirable network states. Owing to computational complexity, it is difficult to apply optimal control for large networks.
Results
In this paper, we propose three new greedy stationary control policies by directly investigating the effects on the network longrun behavior. Similar to the recently proposed meanfirstpassagetime (MFPT) control policy, these policies do not depend on minimization of a cost function and avoid the computational burden of dynamic programming. They can be used to design stationary control policies that avoid the need for a userdefined cost function because they are based directly on longrun network behavior; they can be used as an alternative to dynamic programming algorithms when the latter are computationally prohibitive; and they can be used to predict the best control gene with reduced computational complexity, even when one is employing dynamic programming to derive the final control policy. We compare the performance of these three greedy control policies and the MFPT policy using randomly generated probabilistic Boolean networks and give a preliminary example for intervening in a mammalian cell cycle network.
Conclusion
The newly proposed control policies have better performance in general than the MFPT policy and, as indicated by the results on the mammalian cell cycle network, they can potentially serve as future gene therapeutic intervention strategies.
Background
Boolean networks (BNs), and more generally, probabilistic Boolean networks (PBNs) [1, 2], have been used for finding beneficial interventions in gene regulatory networks through the study of network dynamics. Upon describing these dynamics via Markov chains, optimal stochastic control policies can be determined via dynamic programming [3–5] to change the longrun dynamics, which are characterized by the steadystate distribution (SSD) of the network (Markov chain), the purpose being to reduce the risk of entering aberrant states and thereby alter the extant cell behavior. Three problems arise with this approach. First, the dynamic programming algorithm used to find optimal policies has complexity which increases exponentially with the number of the genes in the network. Approximate, or modelfree, control policies have been proposed to alleviate this computational burden [6, 7]. Second, the classical infinitehorizon approach to control requires a cost function and this requires subjective input. Third, and most importantly relative to the algorithms proposed in the current paper, optimization is with respect to the cost function and is only secondarily related to the steadystate distribution. Here, our purpose is to reduce the mass of the steadystate distribution corresponding to undesirable states and increase the mass corresponding to desirable states, and to do this directly without the mediating factor of a cost function.
In [7], a stationary meanfirstpassagetime (MFPT) control policy is proposed that circumvents the need for a cost function and works directly with the transition probability of the Markov chain associated with the network. Biologically, it can be motivated by the following example. In a stable cancer cell line, the cells will keep proliferating without intervention. Assume that the goal of the intervention is to push the cell into programmed cell death (apoptosis) by intervening two candidate genes: p53 and telomerase. The p53 gene is the most wellknown tumor suppressor gene [8–10]. The telomerase gene encodes telomerase, which maintains the integrity of the ends of chromosomes (telomeres) in germ cells and progenitor cells; therefore, it is responsible for replenishing cells during the normal cell turnover (homeostasis). In somatic cells, the telomerase gene is turned off, resulting in telomere shortening each time the cell divides – a key reason for the limited lifespan of normal cells [11]. In the majority of tumor cells, telomerase is activated, which is believed to contribute to the prolonged lifespan of the tumor cells [12]. This worsens prognosis for cancer patients [13, 14]. Extensive experimental results indicate that when p53 is activated in the cells, for example in response to radiation, the cells undergo rapid growth inhibition and apoptosis in as short as a few hours [15]. In contrast, inhibition of the telomerase gene also leads to cell growth inhibition, differentiation, and cell death, but only after cells go through a number of cell divisions (allowing telomere shortening). Cell death takes a longer time through this latter process than through p53 activation. The use of mean first passage times for finding the best control gene is intuitive because the activation of p53 can lead more quickly (or with higher probability) to apoptosis than the inactivation of telemerase. Based on this kind of observation, the MFPT control policy employs two heuristics: (1) it is preferable to reach desirable states (apoptosis in the example) as early as possible; (2) it is preferable to leave undesirable states (cell proliferation) as early as possible.
Motivated by the success of the MFPT algorithm, in this paper, we propose three stationary control policies under the assumption that we have the transition probability matrix and the state transition diagram for the Markov chain of the network. Given that the intervention objective is to shift the steadystate distribution to desirable states, we propose to directly use the longrun behavior as our criterion for control instead of using the mean first passage time, which is an indirect measure. For the first control policy, we directly investigate the attractor states, which have been conjectured to correspond to phenotypes of the modeled cell. We replace the meanfirstpassagetime criterion by the distance to (un)desirable attractor states of the underlying Markov chain, which can be computed efficiently. Since the shift of steadystate distribution can be computed efficiently using the analytic formula derived in [16–18], a second new control policy, having similar time complexity as the original MFPT control policy, is proposed based on the shift of steadystate distribution. A third policy also uses the steadystate distribution as the criterion, but gives up some computational efficiency in order to increase the certainty that applying the derived control policy will lead to the reduction of the total stationary mass for undesirable states. Because these new policies directly utilize the longrun characteristics of the network, we expect them to perform better than the MFPT policy with respect to shifting the steadystate distribution. A simulation study supports this expectation. In addition, a preliminary example of applying these control policies on a mammalian cell cycle network has shown that we can successfully identify gene E2F as the best potential control target, which has been conjectured in [19] through different mathematical modeling.
Methods
Background
Probabilistic Boolean networks
We focus on intervention in binary PBNs in this paper but these results directly extend to more general PBNs having any discrete range of values since the underlying models are always finite Markov chains. Following the standard definitions of the Boolean network model [1, 2], PBNs are described by truth tables determined by Boolean regulatory rules and the related parameters, including various probabilities. In a binary Boolean network of n genes, the state of each gene x_{ i }∈ {0, 1} at time t + 1 is determined by the values of a set V_{ i }of predictor genes at time t via a Boolean "predictor" function , where K_{ i }= V_{ i } denotes the number of predictor genes in V_{ i }and is called the input degree of x_{ i }in the network. Given a truth table, the network evolves as a trajectory of geneexpression vectors (states) X_{ t }∈ {0, 1}^{ n }, each known as a gene activity profile (GAP). From the initial state, a BN will eventually reach a set of states through which it will cycle forever. Each such set is called an attractor cycle and states within attractor cycles are attractors. The set of states leading to a specific attractor cycle is known as its basin of attraction (BOA).
For the stochastic extension of basic Boolean network model, we can have different variations. For Boolean networks with random perturbations (BNps), perturbation is introduced with a positive probability p by which the current state of each gene in the network can be randomly flipped. To further model the stochastic properties arising from either latent variables affecting network dynamics or the uncertainty from model inference, we allow m vectorvalued network functions, F = {f_{1}, f_{2},..., f_{ m }}, to determine the expression states of genes in the network model through time. We consider the network model which consists of a family {B_{1}, B_{2},..., B_{ m }} of BNps governed by the corresponding functions, each BNp being referred to as a context. At any time point there is a positive probability q of switching the current governing context. Once a switch is called for at time point t, then one function from among f_{1},⋯, f_{ m }is randomly selected according to the probability distribution c = {c_{1},⋯, c_{ m }}, where it is possible for the current function to be chosen. There are two types of PBNs with different interpretations regarding q. If q < 1, the PBN is contextsensitive and q is usually assumed to be small. If q = 1, as in the original formulation of PBNs [2], the PBN is said to be instantaneously random. All of these various PBNs inherit the attractor cycles of their constituent BNs governed by predictor functions. Introduction of random perturbation makes the corresponding Markov chain of a PBN irreducible. Hence, it possesses a steadystate distribution π describing the longrun behavior. With sufficiently small p, π will reflect the attractor structure. For developing therapeutic intervention, we are especially interested in the proportion of time the network occupies an attractor in its steady state.
The dynamics of PBNs can be analyzed via their associated homogeneous irreducible finite Markov chains. For instantaneously random PBNs, the states of the associated Markov chain are the states (GAPs) of the network; for a contextsensitive PBN, the chain states are (context, GAP) pairs. We can derive the transition probability matrix P from the truth tables and the involved probabilistic parameters, and from there derive the steadystate distribution π. The computation of the transition probabilities between states in PBNs has been discussed in several papers [5, 20, 21]. We reiterate them in the following theorem. The proof of the theorem can be referred to in Additional file 1.
where r, s denote the r th and s th BNp, which are the BNps at time t + 1 and t.
where π is the corresponding steadystate distribution for P and T denotes transpose.
In this paper, we focus on BNps and instantaneously random PBNs. Subsequently we will comment on extension to contextsensitive PBNs. For the moment, we note that, for the purposes of intervention, the reduction of a contextsensitive PBN to an instantaneously random PBN has been proposed in [5]. This reduction results in a large computational savings when deriving control strategies. Moreover, from the perspective of geneexpression inference of a contextsensitive PBN, the variables r and s in P_{s, y}(r, x) are hidden variables because we only observe the transition of geneexpression states. Hence, it is difficult to accurately infer the transient structure of contextsensitive PBNs without a large amount of data [22]. In [21], the effect of the reduction from contextsensitive PBNs to instantaneously random PBNs in [5] is investigated and it is shown that, while there is some loss of control performance using the reduced model, the loss depending on the structure of the PBN, generally there can still be significant therapeutic benefits for these control strategies in situations where it is impractical to utilize the full model. Hence, our focus on the control policies on instantaneously random PBNs still leads to practical benefits to shift steadystate distributions beneficially.
Stochastic optimal intervention
the policy being independent of time in this stationary policy. Because the size of the search space for this optimization problem is O( ), the optimal solution quickly becomes computationally infeasible as network size increases.
J* is the unique solution of this equation within the class of bounded functions. Equation 8 is known as the Bellman optimality equation. An optimal control policy is a stationary policy that attains the minimum in the righthand side of the Bellman optimality equation for all the states in the network.
From a strictly longrun perspective, absent any assigned costs, the preceding solution can be considered as an approximate way to find a control that most beneficially shifts the steadystate distribution. The problem here is that the algorithm requires appropriate settings for the cost function , which are difficult to obtain. Moreover, the existing iterative algorithms to find an optimal control policy in the above framework still have high computational complexity O(2^{3n}) for each iteration [7].
Meanfirstpassagetime (MFPT) control policy
In [7], a greedy stationary control policy using mean first passage times of the underlying Markov chain is proposed. When considering therapeutic interventions, the state space can be partitioned into the set D of desirable states and the set U of undesirable states according to the expression values of a given set of genes. Based on the intuition that an effective intervention strategy should reduce the likelihood of visiting undesirable states by increasing the time to reach undesirable states or decreasing the time to desirable states, a greedy meanfirstpassagetime control policy can be derived.
where e denotes column vectors of 1's with the appropriate length; the vectors K_{ U }and K_{ D }contain the MFPTs from each state in D to U, and from each state in U to D, respectively.
To design the MFPT control policy, we check K_{ D }(x)  K_{ D }(x_{ c }) or K_{ U }(x_{ c })  K_{ U }(x) for each state x and the corresponding flipped state x_{ c }. If x is undesirable, we check whether K_{ D }(x)  K_{ D }(x_{ c }) ≥ λ to make the time to reach the desirable states D faster; Otherwise when x is desirable, we check whether K_{ U }(x_{ c })  K_{ U }(x) ≥ λ to make the time to leave the undesirable states U faster. The parameter λ is set to a higher value when the ratio of the cost of control to the cost of the undesirable states is higher, the intent being to apply the control less frequently; if we are not interested in limiting application of control, we set λ = 0. The computational complexity for finding the MFPT control policy in the original PBN is O(2^{ n }).
The pseudocode for the MFPT algorithm in Appendix 1, reproduced from [7], summarizes the procedure when applied to all possible control genes.
As noted in [7], even if one wishes to apply a costbased optimization, the MFPT procedure can serve to find a good control gene and also gain insight on the controllability of the network.
Control policies directly based on longrun behavior
Basin of Attraction (BOA) control policy
Although mean first passage time is closely related to the steadystate distribution, the MFPT control policy does not use the shift of stationary mass directly as a criterion. Given the basins of attraction (BOA), which determine the longrun behavior of a PBN, we can use this information to derive a control policy more directly related to the steadystate distribution. For this BOA control policy, we again assume that the state space is partitioned into the sets D and U of desirable and undesirable states. For any state x, let A^{ j }(x) be the set of attractors (the cycle) for the basin containing x in the j th constituent BNp, keeping in mind that a state x belongs to exactly one basin of attraction in each constituent BNp. Let , the union of attractors for the basins of x taken across all constituent BNps. Further, for each constituent BNp, we compute the minimal distance of each state to states in D or to states in U, and then we compute the respective PBN distances d_{ D }and d_{ U }by taking weighted averages with respect to the selection probabilities. Since most of the stationary mass is distributed in the attractors, the structural properties of the basins, including the properties of their attractors and their sizes, determine the longrun behavior and the steadystate distribution of the network [20]. Hence, it is reasonable to design a control policy based on the BOA structure.
We proceed in similar way to the manner in which the MFPT policy is derived in [7] to obtain a basin of attraction (BOA) control policy. For a pair of undesirable states x and x_{ c }, we first check whether B(x) or B(x_{ c }) contains any desirable attractors. If only one of them contains desirable attractors, then we decide the control policy to always go to that state so that we increase the likelihood of entering into desirable attractors. Otherwise, if both of them have desirable attractors or neither B(x) or B(x_{ c }) has desirable attractors, we compare d_{ D }(x) and d_{ D }(x_{ c }): Whichever is minimum, we apply control to get that state so as to reach the desirable states D faster. We do not apply any control if d_{ D }(x_{ c }) = d_{ D }(x). For a pair of desirable states x and x_{ c }, we first check whether B(x) or B(x_{ c }) contains any undesirable attractors. If only one of them contains undesirable attractors, then we apply control to flip to that state so that we reduce the risk of getting into undesirable attractors. If the condition is satisfied for both of the states or neither of them, we then check d_{ U }(x) and d_{ U }(x_{ c }) to make the time to reach the undesirable states U slower. The computational complexity for finding this control policy in the original PBN is O(2^{ n }), similar as in deriving the MFPT control policy. Since finding BOA structures of PBNs does not involve computing matrix inversions and is relatively less expensive than computing mean first passage times, especially with increasing number of genes in the network, the algorithm to find the BOA control policy is more efficient than the previous algorithm to find the MFPT control policy. The pseudocode in Appendix 2 summarizes the BOA procedure for finding the best control gene and the corresponding stationary control policy.
Steadystate distribution (SSD) control policy
Although the BOA structural properties constitute one determinant factor for the steadystate distribution that we aim to shift, the BOA algorithm does not use the steadystate distribution directly. Thus, we consider a control policy directly using the shifted stationary mass as the criterion of applying control. A key issue for such an algorithm is the efficient computation of the shifted stationary mass resulting from intervention. Recently, we have adapted the perturbation theory in finite Markov chains [18, 25] to derive an analytic solution to compute the shifted mass efficiently [17]. We next state a theorem for general Markov chains which has appeared in [17, 18, 25].
where e is a vector with all its elements equal to 1, t and u are any vectors such that π^{ T }t ≠ 0 and u^{ T }e ≠ 0, and β^{ T }= b^{ T }[I  P + tu^{ T }]^{1}.
where β^{ T }= b^{ T }Z and Z is the fundamental matrix of the underlying Markov chain for the original network. Hence, the steadystate distribution of the rankone perturbation is expressed in terms of π and Z, the steadystate distribution and fundamental matrix of the original network. Thus, for rankone perturbations to regulatory functions, we have an explicit way to compute the exact shifted stationary mass.
where p_{ x }and are the two rows corresponding to the states x and x_{ c }in P, z^{ x }is a column corresponding to the state x in Z, π_{ x }is the stationary mass for x, and (x) denotes the steadystate distribution after we apply gene flipping at the state x. Following this analytic solution, we can quickly compute the total stationary mass for undesirable states π_{ U }and (x), and therefore the shifted mass after the possible controls to each state. Once we have that, we can derive a steadystate distribution (SSD) control policy based on a procedure similar to deriving the MFPT control policy. We compare the total stationary mass of undesirable states after applying control to x and x_{ c }: (x) and (x_{ c }). If both of them are larger than the original stationary mass, π_{ U }, of undesirable states, then we do not apply any control. Otherwise, we adopt the control on the state which leads to less stationary mass of the undesirable states. The computational complexity for finding this new control policy is again O(2^{ n }), while the complexity for each iteration in the algorithm increases from both the MFPT and BOA control policies a little bit by vectormatrix multiplications involved in (11). The pseudocode for the SSD algorithm in Appendix 3 summarizes the procedure for finding the best control gene and the corresponding stationary control policy based on shifted stationary mass.
As in the previous algorithms, deriving the SSD control policy only looks into the effects caused by perturbations to the pairs of states x and x_{ c }. Considering the perturbation to the original transition matrix P, for each network state x, we compare the steadystate distributions π and (x) that correspond to P and the controlled transition matrix by replacing the row in P corresponding to the state x by the row that corresponds to the flipped state x_{ c }. If the undesirable stationary mass based on this onerow perturbation caused by flipping the control gene g at x reduces the undesirable stationary mass, then we decide the control policy for the state x: u_{ g }(x) = 1. For the derivation of the final stationary control policy for all the network states, we investigate the perturbation effects independently by studying the change from P to the corresponding controlled transition matrices by onerow perturbations at all the states: . The final control policy u_{ g }for the network actually leads to a multirow perturbation to P by combining all the beneficial one row perturbations determined in the algorithm: . Generally, is different from all the controlled transition matrices by onerow perturbations considered during the derivation of u_{ g }. Although, intuitively, this combination of beneficial onerow perturbations should reduce the total undesirable stationary mass, it is difficult to find the analytic characterization of the effect to the undesirable mass caused by the combination. We note here that in our simulations (Results and Discussion), the derived SSD control policy always reduces the stationary mass for undesirable states and performs better than the MFPT and BOA algorithms.
Conservative steadystate distribution (CSSD) control policy
All of the above control policies are relatively aggressive. Whenever we see a desirable difference by intervention through perturbing the original transition matrix of the network, we will apply control in a greedy manner. For singlegene control, the previous algorithms check whether applying gene flipping leads to immediate benefits. The decisions for different pairs of original and flipped states are independent. Therefore, we can derive the previous control policies for all the network states in a parallel fashion. However, there is no theoretical guarantee that applying these control policies will lead to the reduction of the stationary mass for undesirable states or, analogously, the increase of the stationary mass for desirable states. We now present a control policy, the conservative steadystate distribution (CSSD) control policy, for which we have a theoretical guarantee that the steadystate distribution after intervention will have less than or equal to the stationary mass of the undesirable states in the original network.
While we can compute the shifted steadystate distribution accurately using Theorem 2, we now introduce another theorem, proven in [17, 25], which gives the new fundamental matrix after applying a rankone perturbation to the network.
Flipping one control gene at any given state x, similar to the derivation of (11), simply replaces the row in the original transition matrix corresponding to the state x by the row that corresponds to the flipped state x_{ c }. Hence, we have a = e_{ x }, which has 1 for the element corresponding to the state x and all 0's for the remaining elements in the vector; and . We can substitute these into (12) to compute the updated fundamental matrix.
where K is the total number of iterations, and each pair of neighboring transition matrices differ by only one row. Here, is the final controlled transition matrix with the derived control policy u_{ g }. As we obtain the controlled transition matrix at each iteration by a onerow perturbation to the previously computed controlled transition matrix, we can keep updating the exact steadystate distribution and the fundamental matrix using (11) and (12), respectively. Thus, at each iteration, we can directly compute the true stationary mass for undesirable states after intervention and make the decision about the control policy for the selected state as well. We let the algorithm run iteratively until we find that intervention to any state will actually increase the stationary mass of undesirable states from the previous iteration. In this way, we are guaranteed that the derived control policy will always have undesirable stationary mass less than or equal to that of the undesirable states in the original network. This algorithm is computationally more expensive compared with the previous algorithms as the search space is O(2^{ n }) at each iteration and the number of iterations K depends on the controllability of the networks. The advantage here is that the CSSD control policy is guaranteed to decrease undesirable stationary mass after intervention and, as well see in simulations, tends to outperform the SSD policy. The pseudocode in Appendix 4 summarizes the CSSD procedure for finding the best control gene and the corresponding stationary control policy.
The following theorem provides the guarantee that applying the conservative SSD algorithm [Appendix 4] will reduce the total undesirable stationary mass.
where K is the number of total iterations of the sequential algorithm.
Proof: We prove the theorem by induction. Starting with the first iteration k = 1, we always have since we only apply the control when as shown in Appendix 4. Now at the k th iteration, assuming , we want to show that . Indeed, in the CSSD algorithm [Appendix 4], at each iteration we apply the control only when . Hence, we have . QED
Extension to contextsensitive PBNs
We have restricted ourselves to BNps and instantaneously random PBNs till now. All the algorithms, including the MFPT algorithm, focus on the GAP space and this only corresponds to the Markov chain space for BNps and instantaneously random PBNs. However, these algorithms, including the MFPT algorithm, can be extended to intervene in contextsensitive PBNs with no theoretical obstacles. But as the state space changes from the space of GAPs to the space of (context, GAP) pairs in contextsensitive PBNs, the computational complexity of these algorithms will increase. Moreover, for the algorithms directly based on steadystate distributions, we have to apply iterative update schemes to compute the shifted steadystate distributions since the perturbations to the transition matrix become multiplerow perturbations [17].
Results and discussion
Comparison of four greedy control policies
In this section, as with the ensemble analysis in [7, 26–28], we study the performance of the four greedy control policies, MFPT, BOA, SSD, and CSSD, based on a large number of randomly generated networks with similar network properties. The two most important parameters for generating random Boolean networks are the bias (p_{ b }) and connectivity (K). Here, p_{ b }is the mean of the Bernoulli distribution to generate the truth table of one Boolean function in a Boolean network, the bias p_{ b }being the probability that a randomly generated Boolean function takes on the value 1. K is the maximum input degree of the Boolean functions in the network. All simulation results in this section are based on 1, 000 randomly generated PBNs of 10 genes, including BNps and instantaneously random PBNs, with fixed K = 3 and different p_{ b }'s.
Performance comparison for BNps
Performance comparison for randomly generated BNps.
Control policies  p _{ b }  

0.1  0.3  0.5  0.7  0.9  
ORG  0.8923  0.7000  0.5034  0.2781  0.1110 
MFPT  0.8641  0.5572  0.3222  0.1574  0.0763 
BOA  0.8644  0.5717  0.3352  0.1657  0.0777 
SSD  0.8609  0.5415  0.3093  0.1491  0.0748 
CSSD  0.8594  0.5102  0.2472  0.1308  0.0743 
Percentages of random BNps with improved performance after applying control.
Control policies  p _{ b }  

0.1  0.3  0.5  0.7  0.9  
MFPT  96.4%  91.1%  90.6%  91.1%  97.6% 
BOA  100.0%  99.4%  98.4%  99.5%  100.0% 
SSD  100.0%  100.0%  100.0%  100.0%  100.0% 
CSSD  100.0%  100.0%  100.0%  100.0%  100.0% 
Performance comparison for instantaneously random PBNs
Performance comparison for randomly generated instantaneously random PBNs.
Control policies  p _{ b }  

0.1  0.3  0.5  0.7  0.9  
ORG  0.8939  0.6934  0.4997  0.2967  0.1063 
MFPT  0.8567  0.5807  0.3484  0.1912  0.0747 
BOA  0.8595  0.5966  0.3662  0.2026  0.0784 
SSD  0.8547  0.5637  0.3349  0.1822  0.0728 
CSSD  0.8525  0.5439  0.2971  0.1670  0.0723 
Percentages of random PBNs with improved performance after applying control.
Control policies  p _{ b }  

0.1  0.3  0.5  0.7  0.9  
MFPT  94.4%  91.3%  94.4%  92.5%  95.6% 
BOA  99.5%  96.1%  97.0%  94.9%  99.6% 
SSD  100.0%  100.0%  100.0%  100.0%  100.0% 
CSSD  100.0%  100.0%  100.0%  100.0%  100.0% 
We observe similar trends for randomly generated networks with different numbers of genes [see Additional file 1]. Finally, while the MFPT and BOA control policies are close in terms of the shift of undesirable stationary mass, the MFPT control policy being slightly better in many cases, the BOA control policy is always significantly better than the MFPT control policy in terms of producing a beneficial shift.
A mammalian cell cycle network
We now apply these different control policies on a probabilistic Boolean network (PBN) model of the mammalian cell cycle recently proposed in [29]. For a normal mammalian organism, cell division coordinates with overall growth controlled via extracellular signals. These signals indicate whether a cell should divide or remain in a resting state. The positive signals, or growth factors, instigate the activation of Cyclin D (CycD), which is one of the key genes in the mammalian cell cycle. The other two important genes are retinoblastoma (Rb) and p27. Rb is a tumorsuppressor gene. This gene is expressed in the absence of the cyclins, which inhibit Rb by phosphorylation. Gene p27 is also active in the absence of the cyclins. Whenever p27 is present, it blocks the action of CycE or CycA and Rb can also be expressed, even in the presence of CycE or CycA. Hence, it stops the cell cycle.
The preceding explanation represents the wildtype cell cycle model. In this model, when p27 is active, the cell cycle can be stopped in cancerous situations. When we follow one of the proposed mutations in [29], in which p27 is mutated and it is always off, the mutation introduces a situation where both CycD and Rb might be inactive. As a result, in this mutated phenotype, the cell cycles in the absence of any growth factor. In other words, we consider the logical states in which both Rb and CycD are downregulated as undesirable states.
Definitions of Boolean functions for the mutated mammalian cell cycle PBN.
Order  Gene  Regulating function 

x _{1}  CycD  extracellular signals 
x _{2}  Rb 

x _{3}  E 2F 

x _{4}  CycE 

x _{5}  CycA 

x _{6}  Cdc 20  CycB 
x _{7}  Cdh 1 

x _{8}  UbcH 10 

x _{9}  CycB 

Total stationary mass of undesirable states after applying control policies for all the potential target genes in the mutated mammalian cell cycle PBN.
Control policies  Potential control genes  

E2F  CycE  CycA  Cdc20  Cdh1  UbcH10  CycB  
MFPT  0.0445  0.1534  0.1799  0.2003  0.1399  0.2037  0.1320 
BOA  0.0505  0.1534  0.2092  0.1832  0.1912  0.2161  0.1712 
SSD  0.0386  0.1534  0.1784  0.1614  0.1369  0.2025  0.1531 
CSSD  0.0386  0.1534  0.1770  0.1371  0.1369  0.2025  0.1303 
A recent paper suggests that the MycRbE2F pathway functions as a bistable switch that separates quiescence and proliferation for the mammalian cell cycle [19]. The paper shows that E2F activation correlates directly with the ability of a cell to reverse the R(estriction)point, which marks the critical event when a mammalian cell commits to proliferation independent of growth stimulation. The Rpoint is fundamental for normal differentiation and appears to be dysregulated in virtually all cancers. It is interesting to see that through different mathematical modeling we reach the same conclusion that E2F is the best potential target for future gene therapy design.
Running time for deriving four control policies in the mutated mammalian cell cycle PBN.
Control policies  MFPT  BOA  SSD  CSSD 

Running time (sec.)  27.9033  27.8034  28.0793  599.7273 
Conclusion
In this paper, we propose three new greedy stationary control policies directly using longrun behavior change by intervention to define the control criteria. Through simulations, we have shown that the MFPT, BOA, and SSD policies perform similarly with respect to computational complexity and all reduce the risk of entering undesirable states that correspond to aberrant phenotypes of the modeled cells, with the SSD policy having better average performance in this regard than the other two. For the conservative CSSD policy, we are guaranteed that intervention will lead to beneficial shift of steadystate distributions. We have also illustrated how these control policies can serve as the potential gene therapeutic intervention strategies in the future with a mammalian cell cycle network. As with the MFPT control policy, not only can they be used to directly shift the steadystate distribution without the need for a cost function, they can also be used to predict the potential control gene of the network, serve as reducedcomplexity approximations to costbased control policies, and provide measures of network controllability. Our future direction will be focused on understanding the performance of these greedy control policies relative to their robustness in the presence of inaccurate inference [30] and network reduction [31, 32].
Appendix
Appendix 1 – MFPT algorithm [7]
Partition the statespace into undesirable U and desirable D subsets.
Compute K_{ U }and K_{ D }.
g ← 1.
repeat
for All states x in U do
x_{ c }← flip control gene g in x.
if K_{ D }(x)  K_{ D }(x_{ c }) > λ then
u_{ g }(x) = 1;
else
u_{ g }(x) = 0;
end if
end for
for All states x in D do
x_{ c }← flip control gene g in x.
if K_{ U }(x_{ c })  K_{ U }(x) > λ then
u_{ g }(x) = 1;
else
u_{ g }(x) = 0;
end if
end for
g ← g + 1.
until g > number of genes
Appendix 2 – BOA algorithm
Partition the statespace into undesirable U and desirable D subsets.
Determine the BOA structure of the network, including B(x), d_{ D }(x) or d_{ U }(x) for each state x. g ← 1.
repeat
for All states x in U do
x_{ c }← flip control gene g in x.
if B(x) contains no desirable attractors &&B(x_{ c }) contains desirable attractors then
u_{ g }(x) = 1;
else
if d_{ D }(x) > d_{ D }(x_{ c }) then
u_{ g }(x) = 1;
else
u_{ g }(x) = 0;
end if
end if
end for
for All states x in D do
x_{ c }← flip control gene g in x.
if B(x) contains undesirable attractors &&B(x_{ c }) contains no undesirable attractors then
u_{ g }(x) = 1;
else
if d_{ U }(x_{ c }) > d_{ U }(x) then
u_{ g }(x) = 1;
else
u_{ g }(x) = 0;
end if
end if
end for
g ← g + 1.
until g > number of genes
Appendix 3 – SSD algorithm
Partition the statespace into undesirable U and desirable D subsets.
Compute the original steadystate distribution π and the fundamental matrix Z.
g ← 1.
repeat
for All pairs of states x and x_{ c }← flip control gene g in x do
Compute and using (11).
if then
u_{ g }(x) = 0;
u_{ g }(x_{ c }) = 0;
else
if then
u_{ g }(x) = 1;
u_{ g }(x_{ c }) = 0;
else
u_{ g }(x) = 0;
u_{ g }(x_{ c }) = 1;
end if
end if
end for
Compute the shifted stationary mass of undesirable states , where is the stationary mass of the undesirable states after we apply the derived control policy;
g ← g + 1.
until g > number of genes
Appendix 4 – Conservative SSD algorithm
Partition the statespace into undesirable U and desirable D subsets.
Compute the original steadystate distribution π and the fundamental matrix Z.
g ← 1.
repeat
π^{ best }= p;
Z^{ best }= Z;
Assign the set of states with no control assigned as L;
repeat
Δ^{ best }= 0;
for All states x ∈ L do
Compute with π^{ best }and Z^{ best }based on (11);
if Δπ_{ U }> Δ^{ best }then
Δ^{ best }= Δπ_{ U };
Assign the best state to control: x^{ best }= x;
π^{ best }= ;
end if
end for
if Δ^{ best }> 0 then
u_{ g }(x^{ best }) = 1;
u_{ g }( ) = 0;
L = L\{x^{ best }, };
Compute Z^{ best }based on (12) accordingly;
end if
until Δ^{ best }= 0
Δ(g) = π_{ U }^{} ;
g ← g + 1.
until g > number of genes
Declarations
Acknowledgements
This work has been supported by the National Science Foundation (CCF0514644) and the National Cancer Institute (2 R25CA09030106).
Authors’ Affiliations
References
 Kauffman SA: Homeostasis and differentiation in random genetic control networks. Nature. 1969, 224: 177178. 10.1038/224177a0View ArticlePubMedGoogle Scholar
 Shmulevich I, Dougherty ER, Kim S, Zhang W: Probabilistic Boolean networks: A rulebased uncertainty model for gene regulatory networks. Bioinformatics. 2002, 18: 261274. 10.1093/bioinformatics/18.2.261View ArticlePubMedGoogle Scholar
 Datta A, Choudhary A, Bittner ML, Dougherty ER: External control in Markovian genetic regulatory networks. Machine Learning. 2003, 52 (1–2): 169181. 10.1023/A:1023909812213.View ArticleGoogle Scholar
 Datta A, Pal R, Choudhary A, Dougherty ER: Control approaches for probabilistic gene regulatory networks. IEEE Signal Processing Mag. 2007, 24: 5463. 10.1109/MSP.2007.273057.View ArticleGoogle Scholar
 Pal R, Datta A, Dougherty ER: Optimal infinite horizon control for probabilistic Boolean networks. IEEE Trans on Signal Processing. 2006, 54 (6–2): 23752387. 10.1109/TSP.2006.873740.View ArticleGoogle Scholar
 Faryabi B, Datta A, Dougherty ER: On approximate stochastic control in genetic regulatory networks. IET Systems Biology. 2007, 1 (6): 361368. 10.1049/ietsyb:20070015View ArticlePubMedGoogle Scholar
 Vahedi G, Faryabi B, Chamberland JF, Datta A, Dougherty ER: Intervention in gene regulatory networks via a stationary meanfirstpassagetime control policy. IEEE Transactions on Biomedical Engineering. 2008, 55 (10): 23192331. 10.1109/TBME.2008.925677View ArticlePubMedGoogle Scholar
 Eldeiry WS, Tokino T, Velculescu VE, Levy DB, Parsons R, Trent JM, Lin D, Mercer WE, Kinzler KW, Vogelstein B: Waf1, a potential mediator of p53 tumor suppression. Cell. 1993, 75 (4): 817825. 10.1016/00928674(93)90500PView ArticleGoogle Scholar
 Miyashita T, Reed JC: Tumorsuppressor p53 is a direct transcriptional activator of the human bax gene. Cell. 1995, 80 (2): 293299. 10.1016/00928674(95)904123View ArticlePubMedGoogle Scholar
 Owenschaub LB, Zhang W, Cusack JC, Angelo LS, Santee SM, Fujiwara T, Roth JA, Deisseroth AB, Zhang WW, Kruzel E, Radinsky R: Wildtype human p53 and a temperaturesensitive mutant induce Fas/Apo1 expression. Molecular and Cellular Biology. 1995, 15 (6): 30323040.View ArticleGoogle Scholar
 Harley CB: Telomere loss – mitotic clock or genetic time bomb. Mutation Research. 1991, 256 (2–6): 271282.View ArticlePubMedGoogle Scholar
 Kim NW, Piatyszek MA, Prowse KR, Harley CB, West MD, Ho PLC, Coviello GM, Wright WE, Weinrich SL, Shay JW: Specific association of human telomerase activity with immortal cells and cancer. Science. 1994, 266 (5193): 20112015. 10.1126/science.7605428View ArticlePubMedGoogle Scholar
 Hiyama E, Hiyama K, Yokoyama T, Matsuura Y, Piatyszek MA, Shay JW: Correlating telomerase activity levels with human neuroblastoma outcomes. Nature Medicine. 1995, 1 (3): 249255. 10.1038/nm0395249View ArticlePubMedGoogle Scholar
 Zhang W, Piatyszek MA, Kobayashi T, Estey E, Andreeff M, Deisseroth AB, Wright WE, Shay JW: Telomerase activity in human acute myelogenous leukemia: Inhibition of telomerase activity by differentiationinducing agents. Clinical Cancer Research. 1996, 2 (5): 799803.PubMedGoogle Scholar
 Lowe SW, Schmitt EM, Smith SW, Osborne BA, Jacks T: p53 is required for radiationinduced apoptosis in mouse thymocytes. Nature. 1993, 362 (6423): 847849. 10.1038/362847a0View ArticlePubMedGoogle Scholar
 Hunter JJ: Mixing times with applications to perturbed Markov chains. Linear Algebra and its Applications. 2006, 417: 108123. 10.1016/j.laa.2006.02.008.View ArticleGoogle Scholar
 Qian X, Dougherty ER: Effect of function perturbation on the steadystate distribution of genetic regulatory networks: optimal structural intervention. IEEE Trans on Signal Processing. 2008, 56 (10–2): 49664976. 10.1109/TSP.2008.928089.View ArticleGoogle Scholar
 Schweitzer PJ: Perturbation theory and finite Markov chains. J Appl Prob. 1968, 5: 401413. 10.2307/3212261.View ArticleGoogle Scholar
 Yao G, Lee TJ, Mori S, Nevins JR, You L: A bistable MycRbE2F switch: a model for the restriction point. Nat Cell Biol. 2008, 10: 476482. 10.1038/ncb1711View ArticlePubMedGoogle Scholar
 Brun M, Dougherty ER, Shmulevich I: Steadystate probabilities for attractors in probabilistic Boolean networks. Signal Processing. 2005, 85 (10): 19932013. 10.1016/j.sigpro.2005.02.016.View ArticleGoogle Scholar
 Faryabi B, Vahedi G, Chamberland JF, Datta A, Dougherty ER: Intervention in contextsensitive probabilistic Boolean networks revisited. EURASIP Journal of Bioinformatics and Systems Biology. 2009, 2009 (360864): 13[doi:10.1155/2009/360864].Google Scholar
 Marshall S, Yu L, Xiao Y, Dougherty ER: Inference of probabilistic Boolean networks from a single observed temporal sequence. EURASIP Journal on Bioinformatics and Systems Biology. 2007, 2007 (32454): 15Google Scholar
 Bertsekas DP: Dynamic programming and optimal control. 2005, Belmont, MA: Athena ScientificGoogle Scholar
 Kemeny JG, Snell JL: Finite Markov Chains. 1960, New York: VAn NostrandGoogle Scholar
 Hunter JJ: Stationary distributions and mean first passage times of perturbed Markov chains. Linear Algebra and its Applications. 2005, 410: 217243. 10.1016/j.laa.2005.08.005.View ArticleGoogle Scholar
 Kauffman SA, Peterson C, Samuelsson B, Troein C: Genetic networks with canalyzing Boolean rules are always stable. Proc Natl Acad Sci. 2004, 101: 1710217107. 10.1073/pnas.0407783101PubMed CentralView ArticlePubMedGoogle Scholar
 Shmulevich I, Dougherty ER: Genomic signal processing. 2007, Princeton: Princeton University PressView ArticleGoogle Scholar
 Shmulevich I, Lahdesmaki H, Dougherty ER, Astola J, Zhang W: The role of certain Post classes in Boolean network models of genetic networks. Proc Natl Acad Sci. 2003, 100: 1073410739. 10.1073/pnas.1534782100PubMed CentralView ArticlePubMedGoogle Scholar
 Faure A, Naldi A, Chaouiya C, Theiffry D: Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle. Bioinformatics. 2006, 22 (14): 124131. 10.1093/bioinformatics/btl210View ArticleGoogle Scholar
 Pal R, Datta A, Dougherty ER: Robust intervention in probabilistic Boolean networks. IEEE Trans on Signal Processing. 2008, 56 (3): 12801294. 10.1109/TSP.2007.908964.View ArticleGoogle Scholar
 Ivanov I, Dougherty ER: Reduction mappings between probabilistic Boolean networks. Applied Signal Processing. 2004, 4: 125131.View ArticleGoogle Scholar
 Ivanov I, Pal R, Dougherty ER: Dynamics preserving size reduction mappings for probabilistic Boolean networks. IEEE Transactions on Signal Processing. 2007, 55 (5): 23102322. 10.1109/TSP.2006.890929.View ArticleGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.