CytoASP: a Cytoscape app for qualitative consistency reasoning, prediction and repair in biological networks
© Kittas et al. 2015
Received: 23 January 2015
Accepted: 11 June 2015
Published: 11 July 2015
Qualitative reasoning frameworks, such as the Sign Consistency Model (SCM), enable modelling regulatory networks to check whether observed behaviour can be explained or if unobserved behaviour can be predicted. The BioASP software collection offers ideal tools for such analyses. Additionally, the Cytoscape platform can offer extensive functionality and visualisation capabilities. However, specialist programming knowledge is required to use BioASP and no methods exist to integrate both of these software platforms effectively.
We report the implementation of CytoASP, an app that allows the use of BioASP for influence graph consistency checking, prediction and repair operations through Cytoscape. While offering inherent benefits over traditional approaches using BioASP, it provides additional advantages such as customised visualisation of predictions and repairs, as well as the ability to analyse multiple networks in parallel, exploiting multi-core architecture. We demonstrate its usage in a case study of a yeast genetic network, and highlight its capabilities in reasoning over regulatory networks.
We have presented a user-friendly Cytoscape app for the analysis of regulatory networks using BioASP. It allows easy integration of qualitative modelling, combining the functionality of BioASP with the visualisation and processing capability in Cytoscape, and thereby greatly simplifying qualitative network modelling, promoting its use in relevant projects.
Biological networks can be modelled at different levels of abstraction. Whereas quantitative models rely on the availability of kinetic parameters, qualitative models are primarily based on the network structure, rendering them generally applicable to large-scale analyses . Among qualitative reasoning platforms, the Sign Consistency Model (SCM)  is a framework for modelling influence graphs by confronting a network of labelled interactions with quantitative data, imposing a collection of constraints. In the context of regulatory networks, SCM can be used to check whether an observed behaviour can be explained or if unobserved behaviour can be predicted.
Given genes, proteins, or metabolites and a graph of labelled interactions among these entities, cases are identified where experimental observations are inconsistent with known regulation patterns to indicate either unreliable data or missing regulations. It is assumed that data originates from steady-state shift experiments where, given a perturbation, the difference between two steady states corresponds to protein or metabolite changes in concentration. Logical modelling may then be employed to determine viable states of molecular interactions when the network is confronted with experimental data by using the SCM framework. Further specific experiments can also be suggested to validate the system behaviour through reconciling with the regulatory conditions needed to achieve it .
The BioASP software collection 1 implements methods for modelling metabolic and gene regulatory networks , using logical rules with Answer Set Programming (ASP) . ASP is a form of declarative programming that can address difficult, NP-hard search problems. The building blocks for ASP programs are atoms, literals, and rules. Atoms are elementary propositions (factual statements) that may be true or false, literals are atoms and their negations, rules are expressions composed of atoms and programs are finite collections of rules . The consistency problem is thus encoded by a collection of rules, such that its intended models, called answer sets, represent solutions to the problem.
A publicly available Python script, ingranalyze 2, facilitates influence graph modelling without knowledge of ASP. However, it provides no graphical user interface and specialist user input is required (i.e. file formatting, syntactic restrictions, pre/post-processing through command line etc.). A web service3 also exists, which provides a point-and-click interface but no visualisation of the results.
Cytoscape  offers an ideal platform for network analysis encompassing visualisation, layout and processing capabilities . A Cytoscape 3.x app, named CytoASP, is implemented and reported here, that uses ingranalyze as a back-end for influence graph consistency checking, prediction and repair operations through Cytoscape. User input is provided through a graphical interface, while integration with Cytoscape provides additional functionality features for visualisation and further analysis.
The interaction between Cytoscape and CytoASP was implemented in Java. This part of the program starts by displaying the graphical interface where the user selects the desired options. Once the calculation is requested by the user, it executes parallel threads, one for each of the different networks. Each of these threads writes two text files, one describing the network, and another one containing the corresponding observations. It then runs a Python script that will make use of those two text files and interact with the BioASP ingranalyze package.
The Python script plays the intermediary role between information in the text files and the ingranalyze Python package. It processes raw data from the text files and passes them on (along with the options chosen by the user) to ingranalyze, which executes the binaries for the clasp and gringo ASP solvers. Once the predictions are made, it obtains the results and writes them to files in the chosen directory. The Java threads finally read the result files when the Python processing is over and update the visualisation in Cytoscape accordingly.
The main functionality of CytoASP is the analysis of sign influence graphs, which are directed graphs where vertices are the input and state variables of a system and edges express their effects on each other. An edge i→j means that the variation of i through time influences the level of j. The edges are labeled with positive and negative signs indicating activations and inhibitions respectively.
On this basis, CytoASP provides the following functionality through Cytoscape: i) checking the network for consistency, ii) identifying minimal inconsistent cores (MICs), iii) computing repair sets, and iv) predicting node variation under consistency and repair. Assuming a steady state and some form of perturbation, the difference between initial and final states is used as input. The algorithm proceeds to determine consistency and predict unknown regulation effects and repair sets. By overlaying experimental data on the network, some of the system variables are fixed, allowing checks on whether observed data agree with underlying graph interactions.
Consistency and repair
The SCM defines the general rule to determine network consistency: “The variation of each node (which is not considered as input) must be explained by an influence received from at least one of its predecessors” . A network is considered consistent if there exists a total labelling of its edges and nodes which overrides the current labelling and which is consistent with regard to the SCM rule. This can be formulated as a boolean satisfiability problem which is known to be NP-hard and may be efficiently tackled by the ASP approach . A particular instance of network, with overlaid data, is consistent if its (potentially partial) labelling is consistent.
BioASP defines different rules that allow checking the consistency of a network. These rules are then handled by a grounder and then by a solver. If the network is consistent, there will be at least one answer set. In this case, predictions may be calculated, which are the set of values for the unobserved nodes in the intersection of all possible solutions (i.e. answer sets of predicted node variations).
Flipping observations, i.e. changing the signs for the variation of each node.
Defining certain nodes as input, namely nodes which are not influenced by their predecessors.
Flipping influences, i.e. changing the signs of the network edges.
Adding new influences in the network.
Each repair set is output as a different file. As it is possible to have many repair sets, these cannot be visualised simultaneously. However, it maybe the case that some repairs are common in all repair sets, and these are visualised on the network as nodes with alternating fill and border colour (see e.g. Fig 2), or edges with alternating shape and colour. Predictions can also be made under different repair modes, which are the deductions for the variation of unobserved nodes that hold true under all repair sets.
When calculating repairs, it is often useful to provide concise explanations when an experimental profile is inconsistent with an influence graph, by isolating parts of the network responsible for the inconsistency. These are minimal sub-graphs of an influence graph with a given partial labelling, such that the vertices and edges cannot be labelled consistently. These sub-graphs are called Minimal Inconsistent Cores (MICS)s , and CytoASP can calculate them, by checking the appropriate setting in its options.
After nodes and edges are assigned, the user defines repair options: flip variations, flip influences, turn nodes into inputs (i.e. not having their variation defined by any predecessors) or add edges. In the next step, the colours for visualising the results from computations are chosen. Once all options are set, the app calls ingranalyze to perform the computations and suggest predictions.
CytoASP is packaged as standalone software including solver binaries, Python libraries and Python itself. Therefore, it runs with no dependencies on any system where the BioASP solvers can be compiled. It is currently offered for Linux 64-bit systems as standalone software, due to its dependency on Python and availability of the BioASP solvers. We are considering a future implementation (CytoASP 2.0) providing execution via a web service to offer platform independence.
Results and discussion
We demonstrate the application of CytoASP on the yeast transcription regulatory network provided in , a genetic network comprising sets of genes that interact through directed transcriptional regulation. The yeast regulatory network contains 935 genetic or biochemical regulations, all of which have been established experimentally, among 447 genes. On this network, gene expression data from whole-genome microarray analysis of SNF2 knock-out mutants from  is overlaid. It is noted that SNF2 is a catalytic subunit of the SWI/SNF chromatin remodelling complex, that controls transcription by perturbing the structure of nucleosomes.
Genes that exert a regulatory role encode dedicated transcription factors can bind to specific DNA control regions of regulated genes to activate or inhibit their transcription. Regulated genes may themselves act in a regulatory manner, in which case they participate in a causal pathway. This transcriptional regulatory network is represented as a graph where vertices are genes and directed edges denote activating or repressing effects on transcription.
Such visualisation capability facilitates spotting repairs that hold under all repair sets, since these are visualised as coloured nodes with alternating border colour. As an example, consider node (1) HSC82 (YMR186W) (Fig. 2). This gene was observed as up-regulated in the microarray dataset however, as consistency checks have indicated that it is down-regulated in all repair sets, it is highly likely that gene expression measurement may be inaccurate and further scrutiny can be planned for verification. It is noted that similar visualisation is possible if flipping influences (edges) or defining nodes as input are selected as a repair options.
This approach also provides predictions for unknown expression, based on the transcriptomics data and underlying network. Consider e.g. nodes YCL050C, YPL075W, YDR050C (2) in Fig. 2. Expression of APA1 gene (YCL050C) is regulated by the carbon source and requires the protein GCR1 (YPL075W) . GCR1 mutations substantially decrease the expression of glycolytic genes, such as TPI1 (YDR050C), compared to those in the wild-type strain . The observed down-regulation of TPI1 allows to predict the subsequent downregulation of GCR1 and APA1 which is inline with observations from the literature.
Another example are MET4 (YNL103W) and MET16 (YPR167C) which are linked with a positive influence (3). Transcription of MET16 and other genes required for sulfate assimilation is activated in the absence of methionine. The up-regulation of MET16 allows us to predict the up-regulation of MET4. Mutation in the MET4 gene abolishes transcription of MET16, thus MET4 is essential for MET16 expression . Therefore the model provides a plausible explanation for the up-regulation of MET16 via the MET4 transcription factor.
Comparison with existing software
The COMA Cytoscape plugin  implements a method to facilitate consistency checks for gene expression studies given a gene regulatory network. This approach is tailored to transcriptional gene regulatory interactions and can identify local inconsistencies for genes that are controlled by a set of transcription factors. However, it is limited in determining specific consistent patterns of genes regulated by a number of transcription factors, it does not provide global predictions and it is not able to suggest repairs in the case of inconsistency.
Integer linear programming (ILP) has also been recently employed on interaction graphs to encode constraints on the qualitative behaviour of the nodes . This approach determines topology-consistent explanation for responses of signalling nodes measured in a stimulus-response experiment and identifies the optimal subgraph of the given network topology which can best reflect measurements from a set of experimental scenarios. In contrast, our backend (ingranalyze) that uses BioASP does not focus on generating subgraphs, rather it suggests edge/node repairs on the given network structure.
Qualitative influence graph modelling is currently available in Cytoscape 2.x via the BioQuali Cytoscape app . This approach uses ternary decision diagrams (TDDs) to answer the consistency problem. TDDs provide an efficient representation of all the possible solutions of a qualitative system of constraints, however computation time increases significantly for large networks. On the other hand, BioASP only calculates predictions under repair that hold true under all repair sets (i.e. not all possible), however it is capable of dealing problems of much larger size, rendering it ideal for modelling large scale regulatory networks.
A significant advantage of the ASP paradigm is its flexibility and extensibility. Defining a new variation value for a node, (e.g. such as the null-variation), can be done by adding new rules into the program, while in the TDD approach, this requires modification of a significant part of its implementation. BioASP also offers significant capability for repairs not found in BioQuali, such as calculation of all minimal repair sets either as nodes or edges, and identification of minimal inconsistent cores.
Feature comparison of CytoASP and BioQuali 
Remote (Web Service)
ASP solvers (through ingranalyze & BioASP)
Yes, multiple networks may be analysed simultaneously using all available CPU cores
Custom, user can select colors for visualisation
Fixed, colors are pre-selected
i) Flip observations, ii) Flip influences, iii) Define nodes as input, iv) Add influences
Common repairs are visualised for all four repair modes
Yes, all minimal repair sets for nodes / edges are calculated
Minimal Inconsistent Cores
In terms of memory requirements, CytoASP uses 64-bit versions of Cytoscape and ASP solvers and therefore it is practically limited by the theoretical limit of a 64-bit system architecture (16 exabytes). However, it is noted that memory requirement for a graph with n number of nodes increases by O(n 2), therefore large graphs (e.g. >106 nodes) may occupy a large amount of system memory. Since Cytoscape uses adjacency list as its internal representation format, sparse graphs will occupy less space in memory.
The performance of CytoASP depends on BioASP, since the latter is used as back-end to perform calculations and provide the consistency analysis, predictions and repairs. Modern ASP solvers are based on advanced Boolean constraint solving technology and thus provide highly efficient inference engines . The computation of the network in this case study completes within seconds. The average runtime may vary, depending on feasibility of repair modes on consistent as well as inconsistent observations. A detailed analysis on repair and prediction times with respect to the corresponding repair operation is provided elsewhere . The most demanding process in terms of computation time is the identification of MICs, which increases quadratically with respect to grounding times of gringo ASP solver, in line with ground instantiations for MIC encoding growing quadratically with the size of influence graphs. A detailed analysis in the run-times when calculating MICs is provided in .
We have presented CytoASP, proposing a Cytoscape app for determining consistency and making predictions and repairs in influence graphs. The software takes advantage of the extensive BioASP functionality in manipulating influence graphs, allowing for the modelling of large scale regulatory networks and dealing with inconsistency with extensive repair capabilities.
CytoASP provides customised visualisation in predictions and repair options. Common repairs can be visualised in node and edge repair mode, in addition to determining all repair sets through BioASP. In addition, multiple networks may be selected and analysed simultaneously, taking advantage of parallel processing capabilities. This functionality may be easily incorporated into a network analysis pipeline, through Cytoscape, allowing further analysis and visualisation.
CytoASP implements seamless and user-friendly integration of modelling graphs through BioASP with visualisation and processing options in Cytoscape. Therefore, it greatly simplifies qualitative network modelling by enabling non-programming experts to apply logical reasoning in gene regulations networks, facilitating its use in relevant Systems Biology projects.
Availability and requirements
Project name: CytoASP
Project home page: https://bitbucket.org/akittas/cytoasp
Operating system(s): Linux 64-bit
Execution model: Multi-threading implementation, allowing the analysis of multiple networks in parallel using all available CPU cores.
Programming language: Java, Python
Other requirements: Cytoscape 3.x
License: MIT license
The authors would like to acknowledge Dr. Carito Guziolowski and Dr. Sven Thiele for helpful discussions. AK acknowledges funding from the Leverhulme Trust (to ST, RPG-2012-686). Funding from the EU (to ST, HEALTH-F2-2011-261366) is gratefully acknowledged.
- Samaga R, Klamt S. Modeling approaches for qualitative and semi-quantitative analysis of cellular signaling networks. Cell Commun Sign. 2013; 11(1):43. doi:10.1186/1478-811X-11-43.View ArticleGoogle Scholar
- Siegel A, Radulescu O, Le Borgne M, et al.Qualitative analysis of the relation between DNA microarray data and behavioral models of regulation networks. Bio Syst. 2006; 84(2):153–74. doi:10.1016/j.biosystems.2005.10.006.Google Scholar
- Gebser M, König A, Schaub T, et al.The BioASP library: ASP solutions for systems biology. In: 2010 22nd IEEE Int. Conf. on Tools with Art. Intel. (ICTAI). Arras: IEEE: 2010. p. 383–9. Available at: http://goo.gl/Yb3k7D.Google Scholar
- Baral C. Knowledge Representation, Reasoning, and Declarative Problem Solving. New York, NY, USA: Cambridge University Press; 2003.View ArticleGoogle Scholar
- Brewka G, Eiter T, Truszczyński M. Answer set programming at a glance. Commun ACM. 2011; 54(12):92–103. doi:10.1145/2043174.2043195.View ArticleGoogle Scholar
- Shannon P, Markiel A, Ozier O, et al. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003; 13(11):2498–504. doi:10.1101/gr.1239303.PubMed CentralPubMedView ArticleGoogle Scholar
- Gebser M, Schaub T, Thiele S, et al. Detecting inconsistencies in large biological networks with answer set programming. In: Logic Programming. Lect. Notes in Comp. Sci.Berlin: Springer: 2008. p. 130–44.Google Scholar
- Guelzim N, Bottani S, Bourgine P, et al. Topological and causal structure of the yeast transcriptional regulatory network. Nat Genet. 2002; 31(1):60–3. doi:10.1038/ng873.PubMedView ArticleGoogle Scholar
- Sudarsanam P, Iyer VR, Brown PO, Winston F. Whole-genome expression analysis of snf/swi mutants of saccharomyces cerevisiae. Proc Nat Acad Sci. 2000; 97(7):3364–369. doi:10.1073/pnas.97.7.3364.PubMed CentralPubMedView ArticleGoogle Scholar
- García-Arranz M, Maldonado AM, Mazón MJ, Portillo F. Transcriptional control of yeast plasma membrane h(+)-ATPase by glucose. cloning and characterization of a new gene involved in this regulation. J Biol Chemi. 1994; 269(27):18076–82. Accessed 2014-12-19.Google Scholar
- Sasaki H, Uemura H. Influence of low glycolytic activities ingcr1 andgcr2 mutants on the expression of other metabolic pathway genes inSaccharomyces cerevisiae. Yeast. 2005; 22(2):111–27. doi:10.1002/yea.1198. Accessed 2014-12-19.PubMedView ArticleGoogle Scholar
- Thomas D, Barbey R, Surdin-Kerjan Y. Gene-enzyme relationship in the sulfate assimilation pathway of saccharomyces cerevisiae. study of the 3’-phosphoadenylylsulfate reductase structural gene. J Biol Chem. 1990; 265(26):15518–24.PubMedGoogle Scholar
- Baumbach J, Apeltsin L. Linking Cytoscape and the corynebacterial reference database CoryneRegNet. BMC Genomics. 2008; 9(1):184. doi:10.1186/1471-2164-9-184. Accessed 2015-05-05.PubMed CentralPubMedView ArticleGoogle Scholar
- Melas IN, Samaga R, Alexopoulos LG, Klamt S. Detecting and removing inconsistencies between experimental data and signaling network topologies using integer linear programming on interaction graphs. PLoS Comput Biol. 2013; 9(9):1003204. doi:10.1371/journal.pcbi.1003204. Accessed 2015-05-05.View ArticleGoogle Scholar
- Guziolowski C, Bourd’e A, Moreews F, et al. BioQuali cytoscape plugin: analysing the global consistency of regulatory networks. BMC Genomics. 2009; 10(1):244. doi:10.1186/1471-2164-10-244.PubMed CentralPubMedView ArticleGoogle Scholar
- Gebser M, Guziolowski C, Ivanchev M, et al.Repair and prediction (under inconsistency) in large biological networks with answer set programming. In: Twelfth International Conference on the Principles of Knowledge Representation and Reasoning 2010. http://aaai.org/ocs/index.php/KR/KR2010/paper/view/1334.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.