Qualitative modeling framework
In this section, we briefly revisit the formal framework introduced by René Thomas as originally found in literature [38, 39].
Definition 1
(Biological Regulatory Network) “A Biological Regulatory Network (BRN) is a labeled directed graph G=(V,E), where V is a finite set of vertices, also called biological entities and E⊆V×V is the set of interactions.
The successors and predecessors of a biological entity are represented as \(G_{\nu _{i}}^{+}\) and \(G_{\nu _{i}}^{}\), respectively. Each vertex is provided with a limit \(\ell _{\nu _{i}} = \left G_{\nu _{i}}^{+}\right \) when \(\left G_{\nu _{i}}^{+}\right  \geq 1\), and \(\ell _{\nu _{i}} = 1\) when \(\left G_{\nu _{i}}^{+}\right  = 0\). The edges are labeled by a pair (τ,σ), where \(\tau \leq \ell _{\nu _{x}}\) is the threshold of influence, and σ={+,−} is called sign of interaction (+ for activation and  for inhibition). Each entity ν_{i}∈V has its abstract expression level in the set \( E_{\nu _{i}} \,=\, \left \{0,1,....,r_{\nu _{i}}\right \}\) where \( r_{\nu _{i}} \!\leq \!l_{\nu _{i}}\). The state of a BRN is a configuration of expression levels of all biological entities at a particular time instant.
Definition 2
(State) A State of BRN is ntuple \(S = \left \{s_{\nu _{1}},.., s_{\nu _{n}}\right \}\), \(\forall s_{\nu _{i}} \in E_{\nu _{i}}\), where \(s_{\nu _{i}}\) is the abstract expression level of ν_{i}.
The state space of a BRN is a cartesian product obtained on the range of expression levels of all entities and can be computed using Formula 1.
$$ \prod\limits_{i=1}^{n} E_{\nu_{i}} $$
(1)
In a given state, each biological entity ν_{i} is regulated by its predecessors \(G_{\nu }^{}\), formally denoted as set of resources, \(W_{\nu _{i}}\), defined as follows;
Definition 3
(Resources) Let G=(V,E) be a BRN. The set of resources \(W_{\nu _{y}}\) of a variable ν_{y}∈V, at level \(s_{\nu _{y}}\), is defined as; \(W_{\nu _{y}} = \left \{ \nu _{x} \in G^{}_{\nu _{y}}  \left (s_{\nu _{x}} \geq \tau _{\nu _{x},\nu _{y}}\ and\ \alpha _{\nu _{x},\nu _{y}}=+\right) or \left (s_{\nu _{x}}< \tau _{\nu _{x},\nu _{y}}\ and \ \alpha _{\nu _{x},\nu _{y}}=\right) \right \}\)
In order to determine resources of an entity ν_{i}, the presence of activators and absence of inhibitors is considered as resource. Consequently, \(W_{\nu _{i}}\) contains inhibitors and activators of ν_{i}. The targets towards which the levels of variables ν_{i} evolve, depend on the set of positive integers \(K_{\nu _{i}}\left (W_{\nu _{i}}\right)\), also called logical parameters, indexed by \(W_{\nu _{i}}\). The evolution operator (△) in the following formula shows next state towards which ν_{i} evolves.
$$ s_{\nu_{i}}\bigtriangleup K_{\nu_{i}}\left(W_{\nu_{i}}\right)= \left\{\begin{array}{lll} s_{\nu_{i}} + 1 & \texttt{if} & s_{\nu_{i}}< K_{\nu_{i}}\left(W_{\nu_{i}}\right)\\ s_{\nu_{i}}  1 & \texttt{if} & s_{\nu_{i}} > K_{\nu_{i}}\left(W_{\nu_{i}}\right) \\ s_{\nu_{i}} & \texttt{if} & s_{\nu_{i}} = K_{\nu_{i}}\left(W_{\nu_{i}}\right) \end{array}\right. $$
(2)
When variable ν_{i} has a certain expression level \(s_{\nu _{i}}\), its evolution has three possibilities: (1) When \( s_{\nu _{i}} < K_{\nu _{i}}\left (W_{\nu _{i}}\right)\), the value of \(s_{\nu _{i}}\) is incremented by one unit. Conversely, if \(s_{\nu _{i}} > K_{\nu _{i}}\left (W_{\nu _{i}}\right)\), \(s_{\nu _{i}}\) is decremented by one unit. However, \(s_{\nu _{i}}\) does not evolve and remains constant, if \(s_{\nu _{i}} = K_{\nu _{i}}\left (W_{\nu _{i}}\right)\).
The number of possible parameter combinations (parametrization), even for a small network, can be huge. Let G=(V,E) be a BRN with nvariables, and G^{−}(v_{i}) be the cardinality of the regulators of ν_{i}∈G, then number of possible parametrization can be computed using formula 3.
$$ \prod\limits_{i=1}^{n} \left(\ell_{v_{i}}+1\right)^{2^{\leftG^{}\left(v_{i}\right)\right}} $$
(3)
Definition 4
(State Graph) Let G be a BRN and \(s_{\nu _{a}}\) denotes the expression level of biological entity a in a state s∈S. Then the state graph R=(S,T) of G=(V,E) is a directed graph, where S represents set of states, and T⊆S×S is a relation between states, also called the transition relation, such that s→s^{′}∈T iff:

∃ a unique x ε V such that \(s_{\nu _{x}} \neq s_{\nu _{x}}' \) and \( s_{\nu _{x}}' = s_{\nu _{x}} \bigtriangleup K_{x}\left (W_{\nu _{x}}\right)\), and

∀\(y ~ \epsilon ~\mathcal {V} \setminus \{x\} ~ s_{\nu _{y}}' = s_{\nu _{y}}\).” [38, 39].
Different updating schemes have been proposed for qualitative modeling of biological regulatory networks. These updating methods follow synchronous or asynchronous schemes [40]. In a synchronous qualitative model, all variables in the network evolve simultaneously with time. The synchronous mechanism is considered as computationally less expensive [40]. However, it is also less accurate because biological systems are considered asynchronous where changes in expression level of genes or proteins are not concurrent and take place at different time points [16, 21]. In this work, we use asynchronous updating scheme for construction of state graph. The asynchronous scheme is computationally expensive and therefore, we use parallel computing to reduce processing time [40].
To explain working of asynchronous qualitative modeling framework, we apply qualitative framework on BRN of pseudomonas aeruginosa. It is an opportunistic pathogen, commonly found in environment and responsible for mucus production in human lungs effected with cystic fibrosis. The BRN of pseudomonas aeruginosa is shown in Fig. 2a. It comprises of two entities i.e. ALGU (represented by node/vertex ’x’) and its inhibitor protein AntiALGU (represented by node/vertex ’y’). The activation and inhibition relationships are represented by using weighted directed edges by using Definition 1.
In order to measure parameters, experimental observations are encoded into temporal logic formulas. In case of pseudomonas aeruginosa, Fig. 2b shows two CTL formulas ψ_{1} and ψ_{2} that represent normal homeostasis and a pathogenic response receptively. In normal response, the expression level of gene x, starting from (x=0), does not reach (x=2). Whereas, when a pathogenic condition arises, the biological system reaches to a state where gene x is overexpressed finally leading to mucus production.
Figure 2c shows model construction for one parameter combination that satisfies experimental observations. For each state in biological system, its successor states are generated using Definitions 3 and 4. The model construction results in a dynamic model that provides useful insights such as stable steady states and deadlocks. Figure 2d shows the dynamic model as a State Graph(See Definition 4).
Model checking
Model Checking is used to evaluate the dynamic model M against experimental observations expressed as formula CTL ϕ. The verification process determines correctness of ϕ in M, by employing a graphtheoretic procedure that exhaustively explores the entire state space of the system. Finally, model checker confirms correctness of ϕ, if formula is satisfied, or it produces a counter example to provide a trace of the execution path that violates ϕ. The counter example generation is a useful feature for diagnostic purposes.
In a CTL formula, We denote boolean true as ⊤ and boolean false as ⊥. The formula \((s_{\nu _{i}}=n)\) is true iff expression level of variable ν_{i}, in current state, is equal to n. The CTL formula combines a set of connectives: ¬(negation), ∧ (logical AND), ∨ (logical OR) and ⇒ (implication) with temporal operators. The temporal operators are pairs of symbols; the first element of which is A (all paths) or E (at least one path), followed by X (next state), F (any future state) or G (all future states).
Definition 5
(CTL Formula) Let G=(V,E) be a BRN. A CTL formula Φ on G is defined as follows:

atomic formulas are ⊤, ⊥ or any atomic proposition of the form (ν_{i}=n), where ν_{i} is a variable in state graph and \(n \in \left [0, \ell _{\nu _{i}}\right ]\).

If ϕ and ψ are atomic formulas, then so are (¬ϕ), (ϕ∧ψ), (ϕ∨ψ), (ϕ⇒ψ), Xϕ, EXϕ, AGϕ, EGϕ, EFϕ, AFϕ, \((A\phi \bigcup \psi)\) and \((E\phi \bigcup \psi)\)
Parallel implementation in MPJ express
In general, parallel computations are divided into two categories based on communication requirements during the computation phase. The applications that do not need any communication during different computation phases are known as embarrassingly parallel computations. On the other hand, the applications that require frequent communication in between different computation phases are generally known as synchronous computations. One programming approach to implement the embarrassingly parallel computations is to use the “master/slave” model. Since the parameter estimation problem that we are tackling in this study is embarrassingly parallel in nature, we employ the master/slave model to produce the parallel code [1].
Embarrassingly parallel applications are parallelized using master/slave model typically involving three stages. In the first stage, the master process reads the input data, performs domain decomposition, and communicates the relevant chunk to each slave process. The second stage is the computation stage, where all worker processes perform parameter estimation on their own data. During the third and the final stage, all slave processes communicate results back to the master process that generates output for the end user. Out of all three stages, the second stage i.e. the computation phase typically requires the most processing time. In embarrassingly parallel applications, there is no communication required during computation phase, leading to almost linear speedup [1].
Problem decomposition
Here, we use two approaches for parameter decomposition [1].

The first approach exploits the data parallel nature of parameter estimation problem [31]. The parameter state space is partitioned among available processors. We refer to this as coarsegrained parallelism that employs high level data parallelism.

The second approach harnesses finegrained parallelism available in parallel model checkers [26, 41–43] where the underlying algorithms partitions a state graph for verification of biological behaviors encoded in temporal logic.
Coarse grained parallelism
The first partitioning scheme we use in our study divides parameter space into mutually exclusive regions—explored by different worker processes. Since a new model needs to be constructed for each parameter combination, these regions can be explored in parallel by a collection of processes referred to as processing elements (PE). Each PE inspects only a subset of parameter space; and for each combination in that space, a state graph/model is generated. The verification is performed by invoking model checker as an external process to determine whether CTL observations are true. Finally, a reduction operation involves a communication step for receiving accepted sets of parameters. Treating each valuation of parameter separately allows to formulate task of parameter estimation as high level data parallel problem and the decomposition is embarrassingly parallel without any significant communication.
In this study, we use master/worker model of computation to implement high level data parallelism. An important step in parallelizing the code is to perform domain decomposition or partitioning of the input data at the master process. We employ primitive block domain decomposition to produce equallysized independent chunks of input data for each worker process.
We implement our partitioning strategy using the current implementation of SMBioNet. Figure 3 shows pseudocode that uses coarse grained decomposition. The for loop (line 11) in Fig. 3 shows that each worker process performs a block decomposition in order to identify a subset of the total parameter space it needs to explore. For each combination in that space, a new model is generated and provided to symbolic model checker NuSMV [19, 44] to determine correctness of CTL properties. If model checker satisfies the formula, the parameter estimation algorithm appends the model in the list of selected models.
Once the computation phase is over, the reduction step is required to write selected model parameters in a single output file. At this point, each worker process sends its list of selected parameters to the master process that receives selected parameters and produces a single output file. The complexity of the communication involved in reduction step is linear in terms of number of models satisfying the CTL property.
Fine grained parallel decomposition
Although a highlevel decomposition scheme provides a good partitioning strategy for distributed memory architectures (HPC Clusters) due to low communication cost, a lower level decomposition of parameter estimation is motivated by increasing computational capabilities of shared memory multicore computers. Moreover, the maximum speedup achieved by highlevel data parallelism is bounded by serial factor to evaluate one set of parameters. In theory, the integration of a multithreaded model checker implementation such as Java Temporal Logic Framework (JTLV) [45] at multicore level can further reduce processing time. But the potential speedup depends on the granularity of tasks. In practice, the parameter estimation problem comprises of large number of small ’work units’ such that the complexity of each task is O(S.ψ). The symbolic model checking algorithm uses Binary Decision Diagrams (BDDs) as internal data structure for state representation. The multithreaded packages such as JTLV do not parallelize the core operations involved in BDD computation. In turn, a taskparallel simulation using JTLV can provide relatively better performance when several CTL formulas are to be verified independently. One such case is the use of temporal logic patterns to check all the potential qualitative states for a specific property [36]. The related work on parallelization of core BDD operations for multicore processors is notably absent [46]. van Dijk and van de Pol [47] introduce a BDD package Sylvan that demonstrates a nonlinear speedup up to 12X on large models.