Sensitivity analysis of biological Boolean networks using information fusion based on nonadditive set functions

Background An algebraic method for information fusion based on nonadditive set functions is used to assess the joint contribution of Boolean network attributes to the sensitivity of the network to individual node mutations. The node attributes or characteristics under consideration are: in-degree, out-degree, minimum and average path lengths, bias, average sensitivity of Boolean functions, and canalizing degrees. The impact of node mutations is assessed using as target measure the average Hamming distance between a non-mutated/wild-type network and a mutated network. Results We find that for a biochemical signal transduction network consisting of several main signaling pathways, whose nodes represent signaling molecules (mainly proteins), the algebraic method provides a robust classification of attribute contributions. This method indicates that for the biochemical network, the most significant impact is generated mainly by the combined effects of two attributes: out-degree and average sensitivity of nodes. Conclusions The results support the idea that both topological and dynamical properties of the nodes need to be taken into consideration. The algebraic method is robust against the choice of initial conditions and against the partition of the data sets into training and testing sets for the estimation of the nonadditive set functions of the information fusion procedure.

Additional file 1: Table S1. Data-set containing the values of all 7 attributes for each node.
(2) Collect the values of the attributes for each node of the network in a matrix like Table 1. The fibroblast network has 130 nodes, so the resulting matrix has size 130 × 7.
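As an illustration of how such an attribute matrix can be assembled, the sketch below uses a hypothetical 3-node network (not the 130-node fibroblast network) and computes four of the seven attributes; the network, node names, and update rules are made up for this example.

```python
from itertools import product

# Hypothetical 3-node Boolean network: node -> (inputs, truth table).
# Illustrative only; the fibroblast network has 130 nodes and 7 attributes.
network = {
    "A": (("B",), {(0,): 1, (1,): 0}),                                # NOT B
    "B": (("A", "C"), {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}),  # A OR C
    "C": (("A", "B"), {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}),  # A AND B
}

def bias(table):
    """Fraction of input combinations mapped to 1."""
    return sum(table.values()) / len(table)

def avg_sensitivity(inputs, table):
    """Average number of inputs whose flip changes the node's output."""
    total = 0
    for state in product((0, 1), repeat=len(inputs)):
        for k in range(len(inputs)):
            flipped = state[:k] + (1 - state[k],) + state[k + 1:]
            total += table[state] != table[flipped]
    return total / 2 ** len(inputs)

# One row per node: (in-degree, out-degree, bias, average sensitivity)
rows = {}
for node, (inputs, table) in network.items():
    out_degree = sum(node in inp for inp, _ in network.values())
    rows[node] = (len(inputs), out_degree, bias(table),
                  avg_sensitivity(inputs, table))
```

Stacking the rows for all nodes yields the attribute matrix described in step (2); path-length and canalizing attributes would be added analogously.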
ii. Iterate both the wild-type and the mutated network for T = 800 time steps.
iii. Compute the target value, AHD, over the last 500 iterations, according to the formula
AHD = (1/500) Σ_{t=301}^{800} HD(wild-type state at iteration t, mutated state at iteration t),
where HD denotes the Hamming distance between the two network states.
Additional file 1: Table S2. Data-set containing the values of all 7 attributes for each node together with the target values obtained for the given initial state.
Extend Table 1 with an extra column containing the 130 target values obtained from mutating the 130 nodes one by one. The extended dataset is shown in Table 2, and corresponds to the top table in Figure 3 of the main text.
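The iteration and AHD steps can be sketched as follows on a hypothetical 3-node network (the network, the initial state, and the mutation model, pinning the mutated node to 0, are all assumptions for illustration; the paper's network and mutation scheme may differ).

```python
# Hypothetical 3-node Boolean network: node -> (inputs, truth table).
network = {
    "A": (("B",), {(0,): 1, (1,): 0}),                                # NOT B
    "B": (("A", "C"), {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}),  # A OR C
    "C": (("A", "B"), {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}),  # A AND B
}

def step(state, net):
    """Synchronous update: every node reads the previous state."""
    return {n: table[tuple(state[i] for i in inputs)]
            for n, (inputs, table) in net.items()}

def ahd(net, mutated_node, init, T=800, window=500):
    """Average Hamming distance between the wild-type and mutated
    trajectories over the last `window` of T iterations. Pinning the
    mutated node to 0 (knockout) is an assumed mutation model."""
    wt, mut = dict(init), dict(init)
    mut[mutated_node] = 0
    total = 0
    for t in range(1, T + 1):
        wt, mut = step(wt, net), step(mut, net)
        mut[mutated_node] = 0          # keep the knockout pinned
        if t > T - window:
            total += sum(wt[n] != mut[n] for n in net)
    return total / window
```

For example, `ahd(network, "A", {"A": 1, "B": 0, "C": 1})` averages the node-wise disagreement between the two trajectories over iterations 301 through 800.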

Sub-data-sets.
(4) Choose n = 3, 4, 5 attributes at a time to define nonadditive set functions; this gives C(7,3) + C(7,4) + C(7,5) = 35 + 35 + 21 = 91 possible combinations out of the 7 attributes. Create 91 × 100 = 9100 individual sub-data-sets corresponding to the 91 combinations of attributes and the 100 initial states. A sample sub-data-set is shown in Table 3 and corresponds to Figure 2 and the bottom of Figure 3 in the main text. Observe that the target values are the same (node by node) for any combination of attributes in the sub-data-sets and a selected initial state. The information fusion can be performed using a single value of n; however, in our work we use all three, so the steps for the estimation of the nonadditive set functions shown below are repeated for each n.
Additional file 1: Table S3. Sample sub-data-set containing the values of 3 of the 7 attributes for each node together with the target values obtained for the given initial state.
Information fusion. Estimation of the nonadditive set functions and target values for the testing set.
(5) Calculate the nonadditive set functions for each of the 9100 sub-data-sets and generate estimated target values. FOR NumberSubDataSet = 1, 2, ..., 9100 perform the following steps: (a) Split the sub-data-set into two parts: the first T ≥ 2^n − 1 lines represent the training set, used to identify the nonadditive set functions µ; here n is the number of combined attributes. The remaining L = 130 − T lines represent the testing set. We choose T = 120 since larger training sets produce more accurate estimations of the nonadditive set functions. In Section 4.3 of the main text (five-fold cross-validation) we discuss that a smaller training set produces only slightly less accurate estimates. Also, the choice of the T nodes used for training does not influence the outcome of the method.
A sample split of the sub-data-set of Table 3 is shown in Table 4.
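The requirement T ≥ 2^n − 1 comes from the number of unknown values of a set function on n attributes (one µ value per nonempty subset). A quick check on a toy stand-in sub-data-set (the rows here are fabricated placeholders) that T = 120 suffices for all three choices of n:

```python
# Toy stand-in for a sub-data-set: 130 rows of (attribute values, target).
rows = [([i % 2, (i // 2) % 2, (i // 4) % 2], i / 130) for i in range(130)]

T = 120
for n in (3, 4, 5):
    unknowns = 2 ** n - 1         # mu is defined on every nonempty subset
    assert T >= unknowns          # 7, 15, 31 unknowns respectively

train, test_set = rows[:T], rows[T:]   # first 120 rows train, last 10 test
```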
z_j = min_{k: frc(j/2^k) ∈ [1/2, 1)} f(x_k) − max_{k: frc(j/2^k) ∈ [0, 1/2)} f(x_k) if this difference is positive or j = 2^n − 1, and z_j = 0 otherwise. Here frc stands for fractional part.
Additional file 1: Table S4. Sample split of the sub-data-set of Table 3 into a training set of size 120 and a testing set of size 10 for the given initial state.
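The z_j construction above matches the standard linear representation of fusion with respect to a nonadditive set function, in which the estimated target is ŷ = Σ_{j=1}^{2^n−1} z_j µ_j and µ is identified from the training set by least squares. The sketch below assumes that formulation (the synthetic data, the seed, and the chosen µ values are made up for illustration); note that frc(j/2^(k+1)) ∈ [1/2, 1) is equivalent to bit k of j being 1.

```python
import numpy as np

def z_coefficients(f):
    """Coefficients z_j, j = 1 .. 2^n - 1, of the assumed linear
    representation y_hat = sum_j z_j * mu_j. Bit k of j marks whether
    attribute k belongs to the subset indexed by j."""
    n = len(f)
    z = []
    for j in range(1, 2 ** n):
        inside = [f[k] for k in range(n) if (j >> k) & 1]
        outside = [f[k] for k in range(n) if not (j >> k) & 1]
        if not outside:                    # j = 2^n - 1: all attributes in
            z.append(min(inside))
        else:
            v = min(inside) - max(outside)
            z.append(v if v > 0 else 0.0)
    return z

# Identify mu on a synthetic noiseless training set (illustrative):
rng = np.random.default_rng(0)
X = rng.random((120, 3))                   # 120 training rows, n = 3 attributes
true_mu = np.array([0.2, 0.5, 0.1, 0.7, 0.3, 0.6, 1.0])   # made-up mu_j
Z = np.array([z_coefficients(x) for x in X])
y = Z @ true_mu                            # synthetic targets
mu_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
```

Estimated targets for the testing set are then obtained as Z_test @ mu_hat, using the z-coefficients of the held-out rows.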
(9) Plot these errors as in Figure 4 of the main text to assess their magnitude. The smaller the errors, the better the estimation; this validates the method, as discussed in Section 4.1 of the main text.
Best combination of attributes.
(10) Find the best combination of attributes as follows: i. Compute the consistency in target values, where std[·]_NumberInitialState stands for the standard deviation over the 100 different initial states.
ii. Compute the consistency in nonadditive set functions
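A plausible reading of these consistency computations, assuming std[·]_NumberInitialState is applied per node across the 100 target values (one per initial state), is sketched below; the data here are random placeholders, not the paper's results.

```python
import numpy as np

# Hypothetical target-value matrix: 130 nodes x 100 initial states
# (stand-in data; in the paper these come from the AHD computations).
rng = np.random.default_rng(1)
targets = rng.random((130, 100))

# Consistency in target values: per-node standard deviation across the
# 100 initial states; smaller values mean more consistent targets.
consistency = targets.std(axis=1)
```

The consistency in the nonadditive set functions (step ii) could be computed the same way, taking the standard deviation across initial states of each estimated µ value for a given attribute combination.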