State feedback control design for Boolean networks

Background Driving Boolean networks to desired states is of paramount significance toward our ultimate goal of controlling the progression of biological pathways and regulatory networks. Despite recent computational work on the controllability of general complex networks and the structural controllability of Boolean networks, the mathematical conditions for controllability have not yet been linked to the actual Boolean operations in a network. Further, no real-time control strategy has been proposed to drive a Boolean network.

Results In this study, we applied the semi-tensor product to represent Boolean functions in a network and explored the controllability of a Boolean network based on its transition matrix and time transition diagram. We determined the necessary and sufficient condition for a controllable Boolean network and mapped this requirement on the transition matrix to the actual Boolean functions and structural properties of a network. An efficient tool is offered to assess the controllability of an arbitrary Boolean network and to determine all reachable and non-reachable states. We found the six simplest forms of controllable 2-node Boolean networks and explored the consistency of transition matrices when extending these six forms to controllable networks with more nodes. Importantly, we proposed the first state feedback control strategy to drive the network based on the status of all nodes in the network. Finally, we applied our reachability condition to the major switch of the P53 pathway to predict the progression of the pathway and validated the prediction with published experimental results.

Conclusions This control strategy allowed us to apply real-time control to drive Boolean networks, which could not be achieved by the current control strategies for Boolean networks. Our results enable a more comprehensive understanding of the evolution of Boolean networks and might be extended to output feedback control design.


Background
Boolean networks have been successfully applied to model gene regulation and protein interactions for the last two decades because the up or down regulation of molecular expression can be described by discrete Boolean functions [1][2][3][4]. In these applications, molecules and their interactions are treated as nodes and edges, respectively. Boolean networks have been characterized by their network structure, i.e. the organization of nodes and edges, but experiments have been designed without answering the following questions: 1) whether changing the state of one node or a group of nodes of a network will drive the network to desired states; and 2) how to determine the effect of structural and functional changes of a network.
Similar questions have been answered for linear time-invariant systems through the notions of reachability and controllability. In general, a particular state $x_1$ is reachable if there exists a control input that transfers the system from any initial state to $x_1$ in finite time, and a system is reachable if every state of the system is reachable [6]. Controllability is defined similarly: a system is controllable if there exists a control input that transfers the system from any initial condition to the origin in finite time. For a linear time-invariant system, any state can be translated to the origin by a coordinate transformation; therefore, reachability is a fundamental check for controllability.
Preliminary results on the controllability of general networks were obtained via a pinning control strategy in terms of the spectral properties of the network structure [7]. Barabasi's group mapped the controllability conditions of linear time-invariant systems to complex networks and computationally determined the driver nodes of a network [8]. Their results answered the question of which nodes might affect the progression of a network. Yuan and colleagues further examined the effect of edge weights on the controllability of a general network [9]. Both results focused on finding the minimal number of nodes needed to control the network. However, these results rest on computational analysis because a mathematical representation of complex networks was lacking. In 2003, Cheng proposed a mathematical representation of Boolean networks based on the semi-tensor product [10], which provided a possible approach to systematically examine the controllability of Boolean networks. Sun and Cheng defined the controllability of a Boolean network and obtained preliminary controllability conditions on the network structure [11][12][13]. However, the definition and conditions were mathematically oriented and have not been linked to the Boolean operations in real networks, which imposes extra difficulty for users without the required mathematical background.
In this study, we defined both structural and functional requirements for a reachable Boolean network using the semi-tensor product. We found 6 forms of controllable 2-node Boolean networks with both structural and functional conditions, developed a sharable tool to determine whether an arbitrary Boolean network is reachable, and identified possible structural and functional changes that modify reachability. Most importantly, we proposed the first state feedback control strategy to drive a Boolean network by integrating the current status of all nodes in the network. The control strategy allows real-time application and provides effective control to drive the network to a desired state.

Boolean networks
Boolean networks, proposed by Kauffman, are discrete-time dynamical systems with Boolean state variables [5]. Each node of a Boolean network is a Boolean state variable with logic value 0 (false) or 1 (true), corresponding to down or up regulation of a molecule in a biological network. The states of all nodes in a Boolean network form a Boolean vector.
A Boolean function with k variables is a mapping $B : \{0, 1\}^k \to \{0, 1\}$ from the set of all k-tuples over {0, 1} to a binary output. This function describes how a Boolean-valued output is determined from k binary inputs through certain logical operations. It can also be interpreted as how the expression of a molecule is determined by the k other molecules interacting with it. The basic Boolean operations include AND (conjunction), OR (disjunction), and NOT (negation, i.e. inhibition). A list of sixteen logical operations is shown in Table 1.
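As a minimal sketch of the definition above, the following Python snippet enumerates every mapping from {0, 1}^2 to {0, 1}. It assumes that the sixteen operations of Table 1 are the sixteen two-input Boolean functions; the names `INPUTS`, `all_two_input_functions` and `truth_table` are illustrative and do not come from the paper.

```python
from itertools import product

# All input pairs in a fixed order: (0, 0), (0, 1), (1, 0), (1, 1).
INPUTS = list(product((0, 1), repeat=2))

def all_two_input_functions():
    """Return every truth table as a tuple of outputs, one output per input pair."""
    return list(product((0, 1), repeat=len(INPUTS)))

def truth_table(f):
    """Truth table of a Python function f(a, b), evaluated over INPUTS."""
    return tuple(int(bool(f(a, b))) for a, b in INPUTS)

tables = all_two_input_functions()
AND = truth_table(lambda a, b: a and b)
OR = truth_table(lambda a, b: a or b)

print(len(tables))          # number of distinct two-input Boolean functions
print(AND, OR)
```

Each truth table identifies one function, so counting the tables counts the functions: with four input pairs and two possible outputs per pair there are 2^4 = 16 of them.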

Algebraic representation of Boolean networks
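A Boolean network with n logical variables $V_i$, $i = 1, 2, \ldots, n$, and m control inputs $u_j$, $j = 1, 2, \ldots, m$, can be expressed as

$$V_i(t+1) = B_i\big(V_1(t), \ldots, V_n(t), u_1(t), \ldots, u_m(t)\big), \quad i = 1, 2, \ldots, n,$$
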
where $V_i$ and $u_j$ take values from the set {0, 1} [14]. Each Boolean function $B_i : \{0, 1\}^{n+m} \to \{0, 1\}$, $i = 1, \ldots, n$, is a preassigned logical function determined by the biological process. For an n-node Boolean network, there are $2^n$ possible states. If there is no control input $u_j$, $B_i$ is represented by a $2 \times 2^n$ matrix, because each logical value 0 or 1 is expressed as a vector $(0, 1)^T$ or $(1, 0)^T$, respectively. The algebraic state-space representation of the Boolean control network is based on the semi-tensor product of matrices, which is introduced in the Methods section [10,14,15]. For each Boolean function there is a unique truth table, whereas the algebraic expression of a Boolean function is not unique. This means that networks with different structures and operations can realize the same Boolean function. In this study, we assume each Boolean function is represented in its simplest form to reduce the complexity of analysis.
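The point that a truth table is unique while the algebraic expression is not can be illustrated with a short sketch. It builds the 2 × 2^k logical matrix of a function directly from its truth table, using the vector form stated above (1 as (1, 0)^T, 0 as (0, 1)^T). The column ordering, from the all-ones input to the all-zeros input, is an assumption, and `structure_matrix` is a hypothetical helper name.

```python
from itertools import product

def structure_matrix(f, k):
    """2 x 2^k logical matrix of f; columns run from input (1,...,1) to (0,...,0)."""
    top, bottom = [], []
    for bits in product((1, 0), repeat=k):   # 1 before 0, so column 1 is all ones
        out = int(bool(f(*bits)))
        top.append(out)          # first row is 1 when the output is 1, i.e. (1, 0)^T
        bottom.append(1 - out)   # second row is 1 when the output is 0, i.e. (0, 1)^T
    return [top, bottom]

# Two different expressions of the same Boolean function (De Morgan's law).
nand = lambda a, b: not (a and b)
de_morgan = lambda a, b: (not a) or (not b)

print(structure_matrix(nand, 2))
print(structure_matrix(de_morgan, 2))
```

Both expressions give the same matrix, which is why the analysis can work with one representative (simplest) form per Boolean function.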

Results
We first defined all reachable states of a Boolean network with the control applied at the beginning and then removed from the system. This mimics the situation of modifying one node or a group of nodes in the network initially and then examining the response. We then extended reachability to controllability.

Determining reachability using graphical approach
For an n-node Boolean network, an integrated state represents the status of the n variables in the network. Altogether there are $2^n$ integrated states, one for each possible status of the n nodes. An integrated state is denoted as $e_{2^n}^{j}$, $j = 1, 2, \ldots, 2^n$, where $e_{2^n}^{j}$ is the j-th column of the $2^n \times 2^n$ identity matrix. A graphical representation, the time transition diagram, is proposed to illustrate the transitions among the integrated states. Each node of the time transition diagram corresponds to one integrated state $e_{2^n}^{j}$ of the dynamic network. A directed edge from $e_{2^n}^{j}$ to $e_{2^n}^{k}$, $j, k = 1, 2, \ldots, 2^n$, indicates a temporal transition from integrated state $e_{2^n}^{j}$ to integrated state $e_{2^n}^{k}$; equivalently, the j-th column of the transition matrix is $e_{2^n}^{k}$. The transition matrix of a Boolean network is calculated using the semi-tensor product, and each of its columns is a vector $e_{2^n}^{k}$. From left to right, the columns of the transition matrix give the transitions from $e_{2^n}^{j}$, with j increasing from 1 to $2^n$, to the next integrated state. Specifically, the leftmost column represents the transition from $e_{2^n}^{1}$ to its next integrated state, and the rightmost column represents the transition from $e_{2^n}^{2^n}$ to its next integrated state. Therefore, there are a total of $2^n$ arrows in the time transition diagram, and a node may have multiple incoming arrows but has only one outgoing arrow.
Reachability of a node in the time transition diagram means the corresponding integrated state can be reached from any initial integrated state in finite time. If each node in the time transition diagram is reachable, the Boolean network is reachable.
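The correspondence between columns of the transition matrix and arrows of the time transition diagram can be sketched in a few lines of Python. The 4 × 4 matrix below is a made-up example, not one of the networks in this paper, and states are numbered from 0 for convenience. The helper names are hypothetical.

```python
def successors(L):
    """For each column j of a 0/1 transition matrix, return the row k of its single 1."""
    size = len(L)
    nxt = []
    for j in range(size):
        rows = [k for k in range(size) if L[k][j] == 1]
        if len(rows) != 1:
            raise ValueError(f"column {j} is not a unit vector")
        nxt.append(rows[0])
    return nxt

def states_without_predecessor(nxt):
    """States with no incoming arrow; once the network leaves them it never returns."""
    return sorted(set(range(len(nxt))) - set(nxt))

# Hypothetical transition matrix: column j holds a single 1 in the row of the next state.
L_example = [
    [0, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
nxt = successors(L_example)
print(nxt)                              # one outgoing arrow per state
print(states_without_predecessor(nxt))  # states that cannot be entered from any state
```

A state with no incoming arrow can serve only as an initial condition, which is one simple way to flag non-reachable integrated states in a diagram.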

Finding 1 A Boolean network with n nodes (n > 1) is reachable if and only if the signal flow passes through every node of the time transition diagram in one direction, so that each node has exactly one outgoing arrow and one incoming arrow.
The transition matrix of a reachable Boolean network has some specific properties: 1) There is exactly one 1 in each column and each row, so an integrated state can only be reached from one other integrated state. 2) Every diagonal element is zero, i.e. the j-th column is not $e_{2^n}^{j}$; this excludes self-transition of an integrated state. 3) If the j-th column is $e_{2^n}^{k}$, then the k-th column is not $e_{2^n}^{j}$ for n ≥ 2, which excludes a back-and-forth transition between two integrated states. This third property does not hold for a 1-node reachable Boolean network, whose transition matrix has $e_{2}^{2}$ as its 1st column and $e_{2}^{1}$ as its 2nd column.
Here, an example of a 3-node Boolean network is presented in Fig. 1 to show how reachability is determined, and all 8 integrated states, representing the possible statuses of the 3 nodes in the Boolean network, are listed in Table 2. Based on these integrated states listed in Table 2 [...] to the nodes through knock out of a node (value 0) nor dosage injection to a node (value 1), the network cannot reach the integrated states $e_{8}^{1}$ (node 1 is 0, node 2 is 1, and node 3 is 1), $e_{8}^{2}$ (node 1 is 1, node 2 is 1, and node 3 is 0), and $e_{8}^{6}$ (node 1 is 0, node 2 is 1, and node 3 is 0). If we force the initial status of the system to be one of these three states, the network will deviate from it and never come back. This result can provide a guideline for experimental design to examine the downstream effect for a given pathway with a known Boolean network. For the network shown in Fig. 1, when $e_{8}^{1}$, $e_{8}^{2}$, or $e_{8}^{6}$ is a desired state we would like the [...]

Table 2 Relationship between eight integrated states of a 3-node Boolean network and logical values of the 3 nodes
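One way to read Finding 1 is that the time transition diagram forms a single directed cycle that passes through all integrated states. Under that reading (an interpretation, not a statement taken from the paper), the condition can be checked with a short walk along the arrows. The function name and the 0-based successor lists are illustrative.

```python
def is_single_cycle(nxt):
    """True if following nxt from state 0 visits every state once and returns to 0."""
    seen, s = set(), 0
    while s not in seen:
        seen.add(s)
        s = nxt[s]
    return s == 0 and len(seen) == len(nxt)

print(is_single_cycle([1, 2, 3, 0]))   # one loop through all four states
print(is_single_cycle([1, 0, 3, 2]))   # two separate 2-state loops
print(is_single_cycle([1, 2, 1, 0]))   # state 3 has no incoming arrow
```

The walk stops at the first repeated state, so the check costs at most one step per integrated state.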

Reachable 2-node Boolean network with logical operations.
We examined all 2-node Boolean networks with combinations of the 16 logical operations shown in Table 1. We found that there are only six simplest forms of reachable 2-node Boolean networks. These six Boolean networks are shown in Fig. 2 with their corresponding time transition diagrams and transition matrices. Interestingly, these six simplest networks are highly coupled and can be divided into three groups. In each group, if state $x_1$ is swapped with $x_2$ in one of the coupled networks, it becomes exactly the other network. Therefore, for any given 2-node Boolean network with logical operations, it is straightforward to decide whether it is reachable once it is reduced to its simplest form. In addition, this provides a baseline to check reachability and controllability of Boolean networks with more nodes.

Feedback control design for N-node lower-triangle Boolean networks
Starting from the known 6 forms of 2-node reachable Boolean networks, their extensions to N-node Boolean networks can be derived based on the properties of the transition matrix. Further, for the extended N-node Boolean network with the control input added directly to the N-th node, the feedback control input can be designed to achieve reachability of the N-node Boolean network.

Finding 2
For a given N-node lower-triangle Boolean network with the control input located at the N-th node, if the first N-1 nodes form a reachable (N-1)-node Boolean network, a feedback control can be designed. The control is extracted from the N-th logical function of the N-node reachable Boolean network that extends the (N-1)-node reachable network.
Given one of the 6 reachable 2-node Boolean networks in Fig. 2, we can extend the network with extra nodes as long as the added Boolean functions guarantee that the time transition diagram satisfies the condition in Finding 1. For an extended N-node reachable Boolean network, we divide its $2^N \times 2^N$ transition matrix $L_N$ into sub-blocks and define a 0-block as a square matrix with all zero elements and a 1-block as a square matrix with a non-zero element. The structure of $L_N$ in terms of these sub-blocks then mimics the transition matrix of Boolean networks with fewer nodes. Specifically, if a 1-block in the transition matrix of a 2-node network appears at row i and column j, then for a 3-node network extended from it, the two 1-blocks appear either at row 2i − 1, column 2j − 1 and row 2i, column 2j, or at row 2i − 1, column 2j and row 2i, column 2j − 1. An example of how to design the feedback control input of a 3-node Boolean network extended from a 2-node reachable Boolean network is considered here, and the relationship between the transition matrices is shown in Fig. 3. Further, the Boolean function for the 3rd node can be treated as the control input u of the lower-triangle dynamics, which will be designed later.

Fig. 3 The pipeline of extending a 2-node reachable Boolean network to a 3-node reachable Boolean network. If the $2^3 \times 2^3$ transition matrix $L_3$ of the 3-node Boolean network is divided into 4 × 4 blocks, the resulting 4 × 4 block matrix is exactly the transition matrix $L_2$ of the underlying 2-node Boolean network. (a) The transition matrix of a 2-node reachable network; (b) time transition diagram of the 2-node network; (c) each 1-block is extended to two 1-blocks; (d) the transition matrix of the extended 3-node reachable network; (e) corresponding time transition diagram of the extended 3-node network

For the 2-node reachable Boolean network, we illustrate the relationship between the transition matrices and the time transition diagram. Based on one possible transition matrix that guarantees the reachability of each integrated state, the Boolean operation matrix M can be obtained and the corresponding Boolean function for the 3rd node determined. The feedback control input u is then designed from the Boolean function that corresponds to the possible transition matrix shown in Fig. 3.

Analysis of reachability for P53 pathway
The p53 pathway responds to intrinsic and extrinsic stress signals that can disrupt the fidelity of DNA replication, genome stability, cell cycle progression, and cell division.
The pathway contains complicated feedback regulatory mechanisms, and many experimental results have been accumulated to illustrate the regulations. In the major switch of the p53 pathway shown in Fig. 4, there are four state nodes denoted as $x_1$, $x_2$, $x_3$ and $x_4$, which represent 'ATM', 'p53', 'Wip1', and 'Mdm2', respectively [16]. The relationship between the integrated states and the corresponding Boolean values of the four genes is shown in Table 3. For the Boolean network representation of the 4 genes, the transition matrix is $L = [e_{16}^{14}, e_{16}^{10}, \ldots]$. The corresponding time transition diagram is shown in Fig. 5. In the time transition diagram, there exists a cycle including $e_{16}^{8}$, $e_{16}^{4}$, $e_{16}^{2}$, $e_{16}^{10}$, $e_{16}^{13}$, $e_{16}^{15}$, suggesting a stable pulse generated by the P53 pathway switch. Based on Table 3, each integrated state corresponds to specific values of the four states. In Fig. 5, a high expression level of a gene represents Boolean value '1' while a low expression level represents Boolean value '0'.
Additionally, this stable pulse can be reached from different initial integrated states. One of the time courses, which includes the main loop, is presented in Fig. 6 based on our simulation. The network exhibits one-phase or two-phase dynamics, depending on the initial state.
If the initial state is one of $e_{16}^{8}$, $e_{16}^{4}$, $e_{16}^{2}$, $e_{16}^{10}$, $e_{16}^{13}$, $e_{16}^{15}$, there is only a one-phase pulse, i.e. a steady-state pulse, which is periodic. If the initial state is any other integrated state, there is a two-phase pulse (a transient pulse followed by a steady-state pulse). The first phase depends on the time distance between the initial state and the states belonging to the periodic cycle, and it ends upon reaching any state in the cycle $e_{16}^{8} \to e_{16}^{4} \to e_{16}^{2} \to e_{16}^{10} \to e_{16}^{13} \to e_{16}^{15} \to e_{16}^{8}$. The second phase is characterized by the periodic cycle.
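The distinction between one-phase and two-phase dynamics can be sketched for any successor map: follow the arrows from the initial state until a state repeats, then split the trajectory into its transient part and its periodic part. The successor list below is a made-up 6-state example, not the p53 transition matrix, and the function name is hypothetical.

```python
def transient_and_cycle(nxt, start):
    """Return (transient, cycle): states visited before the loop, then the loop itself."""
    order, position = [], {}
    s = start
    while s not in position:
        position[s] = len(order)
        order.append(s)
        s = nxt[s]
    first = position[s]          # index at which the periodic part begins
    return order[:first], order[first:]

# Hypothetical diagram: 5 -> 4 -> 0 -> 1 -> 2 -> 3 -> 1 (states 1, 2, 3 form the loop).
nxt = [1, 2, 3, 1, 0, 4]
print(transient_and_cycle(nxt, 5))   # two-phase: a transient followed by the loop
print(transient_and_cycle(nxt, 2))   # one-phase: the initial state is already on the loop
```

An empty transient corresponds to the one-phase case, and the length of a non-empty transient is the time distance from the initial state to the cycle.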
To verify our predictions on P53 pathway progression, we examined published experimental results on the P53 pathway. The published results confirmed that 1) the P53 pathway generates pulses with a stable pattern [17], and 2) there exists a two-phase transition in the P53 pathway [18].

Discussion and conclusions
Reachability of Boolean networks is a central issue for network analysis. However, because a systematic approach to describing network progression with respect to the structure and functions of a network has been lacking, little is known about the reachability of a complex network. Recent results were acquired with computational estimates and concern structural properties [8,9,19]. The most significant contributions of this study are as follows. We have developed a tool to determine reachability for Boolean networks with an arbitrary number of nodes and arbitrary Boolean functions. This tool allows general non-engineer users to verify whether a Boolean network is reachable. Further, for a given Boolean network, we can recognize all the reachable states and separate them from the non-reachable states. If a desired state of the network is among the reachable states, a modification of the initial states through gene knock out or dosage injection may lead to the desired response. Otherwise, a more complicated control should be introduced.
We also found the six simplest forms of reachable 2-node Boolean networks. This result provided the structure of a reachable transition matrix and allowed us to examine possible modifications of the structure and function of a network. Finally, we proposed the first state feedback control design strategy for N-node Boolean networks. The control is determined by the status of all nodes in the network and is feasible for real-time application. For instance, a possible control design for a 2-node network is illustrated in Fig. 3. Although the last Boolean function may be complicated, it provides a possible direction for state feedback control design. Simplification and optimization of the state feedback control design, as well as output feedback control design, will be addressed in our future research.
Finally, we presented the analysis of the major switch in P53 pathway to predict the progression of the pathway and validated our prediction with published results.

Methods
Semi-tensor product. Semi-tensor product allows us to multiply two matrices without the requirement of matching their dimensions [10].
For logical dynamics, 1 and 0 are used to represent the logical states True and False, respectively. In order to define the logical values for computation and analysis, vector forms of Boolean variables are used together with the semi-tensor product. The semi-tensor product of two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$ is

$$A \ltimes B = (A \otimes I_{\alpha/n})(B \otimes I_{\alpha/p}),$$

where α = lcm(n, p) denotes the least common multiple of n and p, $I_{\alpha/n}$ and $I_{\alpha/p}$ are the (α/n × α/n) and (α/p × α/p) identity matrices, respectively, and ⊗ denotes the Kronecker product [20].
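The definition can be tried out with a small pure-Python sketch that follows the formula term by term: both factors are padded with Kronecker products of identity matrices until their inner dimensions agree, and then multiplied as usual. Matrices are plain lists of rows, and the helper names (`kron`, `matmul`, `identity`, `stp`) are illustrative; this is not the STP Toolbox used in the paper. The logical matrix `M_AND` uses the vector form given earlier (True as (1, 0)^T) and an assumed column order from (True, True) to (False, False).

```python
from math import gcd

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product; the inner dimensions must agree."""
    assert len(A[0]) == len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(k):
    """k x k identity matrix."""
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def stp(A, B):
    """Semi-tensor product (A kron I_{alpha/n}) (B kron I_{alpha/p}), alpha = lcm(n, p)."""
    n, p = len(A[0]), len(B)
    alpha = n * p // gcd(n, p)
    return matmul(kron(A, identity(alpha // n)), kron(B, identity(alpha // p)))

TRUE, FALSE = [[1], [0]], [[0], [1]]      # vector forms of logical 1 and 0
M_AND = [[1, 0, 0, 0], [0, 1, 1, 1]]      # 2 x 4 logical matrix of AND (assumed column order)

print(stp(TRUE, FALSE))                   # 4 x 1 integrated state of the pair (True, False)
print(stp(stp(M_AND, TRUE), FALSE))       # AND(True, False) in vector form
```

When the inner dimensions already match, α/n = α/p = 1 and the semi-tensor product reduces to the ordinary matrix product, so the operation is a generalization and not a replacement.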
Representation of Boolean network dynamics using semi-tensor product. We summarize the mathematical tools of the semi-tensor product from Cheng's papers as follows [14,21].
Cheng's result 1: Any logical function $f(x_1, x_2, \ldots, x_n)$ with logical states $x_1, x_2, \ldots, x_n \in D$ can be expressed in a multi-linear form as

$$f(x_1, x_2, \ldots, x_n) = M \ltimes x_1 \ltimes x_2 \ltimes \cdots \ltimes x_n,$$

where M is a $2 \times 2^n$ logical matrix.
Cheng's result 2: Consider a Boolean network with states $x_i \in D$ and denote the integrated state by $x(t) = x_1(t) \ltimes x_2(t) \ltimes \cdots \ltimes x_n(t)$. Then

$$x(t+1) = L\,x(t),$$

where L is the transition matrix of this Boolean network.
Cheng's results allow us to represent the dynamics of Boolean networks with an algebraic state space representation. Then, the time transition diagram can be determined by this transition matrix L.
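For small networks, the columns of L can also be obtained by brute force, without the semi-tensor product formula: enumerate every integrated state, apply the Boolean functions once, and record the index of the resulting state. The sketch below does this under the assumption that state 1 is the all-ones status and state 2^n the all-zeros status (the ordering implied by mapping logical 1 to (1, 0)^T); a different ordering only relabels the states. The 2-node network used here is a hypothetical example, not one taken from Fig. 2.

```python
from itertools import product

def transition_columns(functions):
    """1-based indices k such that column j of L is the k-th unit vector, j = 1..2^n."""
    n = len(functions)
    states = list(product((1, 0), repeat=n))   # state 1 is (1,...,1), state 2^n is (0,...,0)
    index = {s: j + 1 for j, s in enumerate(states)}
    cols = []
    for s in states:
        nxt = tuple(int(bool(f(*s))) for f in functions)
        cols.append(index[nxt])
    return cols

# Hypothetical 2-node network: x1(t+1) = x2(t), x2(t+1) = NOT x1(t).
cols = transition_columns([lambda x1, x2: x2, lambda x1, x2: not x1])
print(cols)   # column j of L is the unit vector with a 1 in row cols[j-1]
```

For this example the diagram is 1 → 2 → 4 → 3 → 1, so every integrated state has exactly one incoming and one outgoing arrow.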
For a 2-node network, there exists a unique transition matrix L, which is calculated from the logical matrices $M_1$ and $M_2$ and a fixed 16 × 4 matrix that depends only on the number of nodes.
Based on Finding 1, to ensure that the Boolean network is reachable, the transition matrix must satisfy two conditions: the j-th column is not $e_{2^n}^{j}$, and if the j-th column is $e_{2^n}^{k}$, then the k-th column is not $e_{2^n}^{j}$. In other words, the transition matrix L must be a special asymmetric permutation matrix in which all diagonal elements are zero. Therefore, there are only six different forms of L in total, denoted $L_i$, $i = 1, 2, \ldots, 6$, and shown with the corresponding networks in Fig. 2.
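The count of six can be reproduced by enumeration. The sketch below lists every arrangement of the four integrated states of a 2-node network in which each state has exactly one successor and one predecessor, no state maps to itself, and no two states map to each other. The function name is hypothetical, and the order in which the forms are printed is not meant to match the numbering $L_1, \ldots, L_6$ of the paper.

```python
from itertools import permutations

def reachable_forms(size):
    """Successor tuples that are permutations with no self-loop and no 2-state swap."""
    forms = []
    for perm in permutations(range(size)):
        no_self_loop = all(perm[j] != j for j in range(size))
        no_swap = all(perm[perm[j]] != j for j in range(size))
        if no_self_loop and no_swap:
            forms.append(perm)
    return forms

forms = reachable_forms(4)
print(len(forms))
for perm in forms:
    # 1-based column description: column j of L is the unit vector with a 1 in row perm[j-1] + 1.
    print([k + 1 for k in perm])
```

For four states, the arrangements that survive are exactly the cyclic orderings of the states, of which there are (4 − 1)! = 6.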
According to the forms of $L_i$, $i = 1, 2, \ldots, 6$, the related $M_1$ and $M_2$ can be determined and their corresponding simplest logic equations obtained.
For $L_1$, the matrices $M_1$ and $M_2$ are reduced first, and the simplest Boolean network dynamics related to $L_1$ then follow. The other five sets of $M_1$ and $M_2$, corresponding to $L_i$, $i = 2, 3, 4, 5, 6$, can be obtained in the same way. All possible Boolean networks and the corresponding transition matrices are shown in Fig. 2.
Extending N-node reachable network dynamics from 2-node reachable Boolean networks. Denote by $M_i$, $i = 1, 2, \ldots, n$, the $2 \times 2^n$ logical matrices of the logical functions of an n-node Boolean network. Extending the first n logical functions to an (n + 1)-node network leads to $M_i^*$, $i = 1, 2, \ldots, n + 1$. Specifically, $M_{n+1}^*$ is the $2 \times 2^{n+1}$ logical matrix of the last logical function. The matrices $M_i$ and $M_i^*$ are related through a fixed 2 × 4 matrix E. For an n-node reachable Boolean network, the transition matrix $L_n$ is expressed in terms of the $2 \times 2^n$ logical matrices $M_i$, $i = 1, 2, \ldots, n$, and a fixed $2^{2n} \times 2^n$ matrix that depends only on the number of nodes. The transition matrix $L_{n+1}$ of an (n + 1)-node reachable Boolean network is expressed in the same way; this expression contains a $2^{n+2} \times 2^{n+1}$ matrix, and the matrix $L_n^*$ is extended as $L_n^* \otimes I_2$ when the semi-tensor product is taken. Therefore, $L_{n+1}$ is a $2^{n+1} \times 2^{n+1}$ matrix with a multi-level nested structure based on the matrices $L_2, L_3, \ldots, L_n$ of the 2-node, 3-node, ..., n-node reachable Boolean networks. According to the properties of the transition matrix, when a reachable network is extended by one node, each 1-block is extended to a 2 × 2 identity matrix or skew-identity matrix, which means that a 1-block is extended to two 1-blocks. If there is an odd number of identity matrices or skew-identity matrices, the extended Boolean network is reachable. Based on this rule, the logical matrix $M_{n+1}^*$ can be derived, and then the feedback controller can be determined.
Starting from the six forms of 2-node reachable Boolean networks, we can find all the corresponding 3-node reachable Boolean network dynamics with logical operations. By analogy, N-node (N > 2) reachable Boolean networks can be obtained.
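The block-extension rule can be checked numerically on a small case. In the sketch below, each integrated state s of a 2-node network splits into two states 2s and 2s + 1 of the 3-node network, and the transition out of s receives either a 2 × 2 identity block or a skew-identity block. Reachability is tested as "the diagram is a single cycle through all states", which is an interpretation of Finding 1. The base cycle and all names are hypothetical, and states are numbered from 0.

```python
from itertools import product

def lift(nxt, skew):
    """Extend a successor list on 2^n states to 2^(n+1) states.

    skew[s] == 0 places an identity block on the transition out of s,
    skew[s] == 1 places a skew-identity block.
    """
    lifted = []
    for s, target in enumerate(nxt):
        for b in (0, 1):                      # entry 2*s + b of the lifted list
            lifted.append(2 * target + (b ^ skew[s]))
    return lifted

def is_single_cycle(nxt):
    """True if following nxt from state 0 visits every state once and returns to 0."""
    seen, s = set(), 0
    while s not in seen:
        seen.add(s)
        s = nxt[s]
    return s == 0 and len(seen) == len(nxt)

base = [1, 2, 3, 0]                           # hypothetical reachable 2-node diagram
single = [k for k in product((0, 1), repeat=4) if is_single_cycle(lift(base, k))]
print(len(single))
print(all(sum(k) % 2 == 1 for k in single))   # every surviving choice has an odd number of skew blocks
```

With four blocks in total, an odd number of skew-identity blocks also means an odd number of identity blocks, which agrees with the parity rule stated above for this example.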

Declarations
The publication costs for this article were funded by the corresponding author. This article has been published as part of BMC Systems Biology Volume 10 Supplement 3, 2016: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM) 2015: systems biology. The full contents of the supplement are available online at http://bmcsystbiol.biomedcentral.com/articles/supplements/volume-10-supplement-3.

Availability of data and materials
The Matlab code STP-Toolbox used in this paper can be found at http://lsc.amss.ac.cn/~dcheng/. The transition matrices of the examples and real network pathways described in this manuscript were calculated with this Matlab package. The Matlab code supporting the example result in Fig. 1 of this article is also included in the supplementary material.