NaviCell: a web-based environment for navigation, curation and maintenance of large molecular interaction maps

Background Molecular biology knowledge can be formalized and systematically represented in a computer-readable form as a comprehensive map of molecular interactions. There exist an increasing number of maps of molecular interactions containing detailed and step-wise description of various cell mechanisms. It is difficult to explore these large maps, to organize discussion of their content and to maintain them. Several efforts were recently made to combine these capabilities together in one environment, and NaviCell is one of them. Results NaviCell is a web-based environment for exploiting large maps of molecular interactions, created in CellDesigner, allowing their easy exploration, curation and maintenance. It is characterized by a combination of three essential features: (1) efficient map browsing based on Google Maps; (2) semantic zooming for viewing different levels of details or of abstraction of the map and (3) integrated web-based blog for collecting community feedback. NaviCell can be easily used by experts in the field of molecular biology for studying molecular entities of interest in the context of signaling pathways and crosstalk between pathways within a global signaling network. NaviCell allows both exploration of detailed molecular mechanisms represented on the map and a more abstract view of the map up to a top-level modular representation. NaviCell greatly facilitates curation, maintenance and updating the comprehensive maps of molecular interactions in an interactive and user-friendly fashion due to an imbedded blogging system. Conclusions NaviCell provides user-friendly exploration of large-scale maps of molecular interactions, thanks to Google Maps and WordPress interfaces, with which many users are already familiar. Semantic zooming which is used for navigating geographical maps is adopted for molecular maps in NaviCell, making any level of visualization readable. In addition, NaviCell provides a framework for community-based curation of maps.


Background
One of the most important objectives of systems biology is developing a common language for a formal representation of the rapidly growing molecular biology knowledge [1][2][3]. Currently, most of the information about molecular mechanisms is dispersed in thousands of scientific publications. This limits its formal analysis by bioinformatics and systems biology tools.
One of the approaches to formalize biological knowledge is to collect the information on molecular interactions in the form of pathway databases [4]. Examples of them are Reactome [5], KEGG PATHWAYS [6], Panther [7], SPIKE [8], WikiPathways [9], TransPath [10], BioCyc [11] and others that are created using various frameworks and formalisms [12]. Most of the pathway databases provide ways for exploring the molecular pathways visually, some include analytical tools for analyzing their structure and some have a possibility to collect users' feedback (see Table 1).
A parallel approach for formalizing the biological knowledge consists in creating graphical representations of the biochemical mechanisms in the form of maps of molecular interactions such as [13] and many others. The idea of mapping several aspects of molecular processes onto a two-dimensional image appeared at the dawn of molecular biology. The first large maps of metabolism, cell cycle, DNA repair have been created manually starting from the '60s and were not supported by any database structure [14,15]. Such maps can be considered as a collection of biological diagrams, each depicting a particular cellular mechanism, assembled into a seamless whole, where the molecular players and their groups occupy particular "territories". Molecules put close together on the map are assumed to have similar functional properties (though it is not always possible to achieve in practice). This geographical metaphor has certain advantages over the database representations for which no global visual image of pathways' functional proximity and crosstalk exists.
A significant achievement of systems biology was in combining both approaches for knowledge formalization into one. For this purpose, it was necessary to develop a graphical language (meaningful to humans), a computerreadable language (onto which the graphical language can be mapped) and software that would allow charting a large map of molecular interactions and, simultaneously, creating a computer-readable file (serving as a database). These challenges inspired the creation of tools like CellDesigner [16], SBGN standard for graphical representation of biological diagrams [17] and SBML [18] and BioPAX standards for exchanging the content of biological pathway databases [19]. A number of comprehensive maps of molecular interactions in the form of reaction networks representing parts of molecular mechanisms, often disease-associated, were created in this way, including the map of RB/E2F pathway [13], which is used further in this paper as an example of NaviCell use.
However, the large maps of molecular interactions are difficult to explore, maintain and improve without proper software support for navigation, querying and providing feedback on their content. There is a number of tools recently developed that take care of some of these aspects [20] (Table 1). Such tools as CellPublisher [21], Pathway Projector [22], GenMAPP [23] and PathVisio [24] focus on navigating within the maps, in particular, exploiting the geographical metaphor and using Google Maps in some of them. SBGN-ED [25] supports all SBGN diagram types. WikiPathways [9] and Payao [26] focus on the web-based service for network annotation and curation. Similarly, PathBuilder is an example of a webbased pathway resource including an annotation tool [27].
The BioUML platform supports SBGN, SBML and enables the maps to connect to other databases as well as collective drawing of maps, similar to the principles of Google Docs. Nevertheless, from our practical experience of map creation, maintenance and curation, we identified the lack of a tool that would offer simultaneously a) easy web-based navigation through the content of comprehensive maps of molecular interactions created according to the systems biology standards; b) visualization of the map at different scales in a readable form; c) possibility to collect the feedback of a user about the map's content in an interactive manner. To fill this unoccupied but highlydemanded niche, we have developed NaviCell which uses Google Maps for navigation into the map, semantic zooming principles for exploring the map at various scales and the standard WordPress blogging system for collecting comments on the maps, providing a discussion forum for the community around the map's content. The combination of these three features makes NaviCell useful tool for user-friendly, curation and maintenance of (large) maps of molecular interactions. NaviCell is publicly available at http://navicell.curie.fr.

NaviCell architecture and installation
NaviCell is a bioinformatics environment which allows the conversion of a large CellDesigner xml file into a set of images and html pages, containing Google Maps javascript code ( Figure 1). These pages can be placed onto a web-server or used locally with all major flavors of Internet browsers. The procedure of creating map representations in the form of NaviCell pages is straightforward and, in the simplest case of a browse-only representation, takes only a few clicks and several minutes. Creation of the blog is also straightforward but requires installation of the WordPress server and automatic generation of topics (posts) in the blog ( Figure 1). NaviCell users may have two roles: a) a map manager who creates, updates and annotates the map; and b) a map user who navigates the map through the web-interface, and add comments on the map content through the blog.

Repertoire of NaviCell entities
Since NaviCell uses the CellDesigner files for creating their web-based representations, it adopted the ontology of biological entities implemented in CellDesigner SBML extension. NaviCell distinguishes Proteins, Genes, RNAs, antisense RNAs as distinct biological entities, existing with different modifications (e.g., different phosphorylated forms of the same protein) and in different cell compartments. Moreover, the same modification of an entity can be represented on the map at several places by multiple aliases. NaviCell creates and displays in the selection panel an explicit list of all map elements and groups them by type of biological entities (see Figure 2). The naming of the map elements in NaviCell is adopted from the BiNoM Cytoscape plugin [28][29][30]. More precisely, entity names are combined with other features such as modifications, compartment names and complex components. The different features are indicated by special characters, such as "@" for the compartments, "|" for modifications and ":" to delimit the different components of a complex. For example, the name "Cdc25|Pho@cytoplasm" represents the protein Cdc25 in a phosphorylated state, located in the cytoplasm, while the name "Cdc13:Cdc2|Thr167_pho@cytoplasm" indicates a protein complex located in the cytosplasm, composed of the protein Cdc13 and the protein Cdc2 which is phosphorylated at amino acid position 167 on a threonine residue.

Graphical representation
NaviCell uses maps created with CellDesigner, and thus follows the SBGN standard in the graphical representations of maps as done in CellDesigner software. Note that CellDesigner has an option "Show SBGN Compliants", which creates a "pure" SBGN view of the map, if it is needed or preferred. The callouts used by NaviCell are a part of SBGN standard for annotation [17]. In addition, NaviCell allows using links to all available MIRIAM resources in the form "@[MIRIAM_RESOURCE]:[ID]" (an example of this is provided at the NaviCell website in the annotated "M-Phase" sample map).
NaviCell can visualize the content of BioPAX files relying on the functionality of external BioPAX to CellDesigner converters, such as the one implemented in BiNoM. In the same fashion, any network imported into Cytoscape [31,32] can be visualized using NaviCell.

NaviCell factory
When setting up a NaviCell environment, the user converts CellDesigner map files into the set of html and JavaScript files necessary to build the NaviCell web site (Figure 1). This is done using NaviCell factory, a Java software embedded into BiNoM Cytoscape plugin, as described in the NaviCell manual. In the future, it will be possible to upload a new map onto the web site and automatically add it to the NaviCell collection.

Semantic zooming
Semantic zooming is a visualization principle which is widely used in modern geographical maps and consists in providing readable image of the observed part of the map at each zoom level [33]. In semantic zooming, details of a representation vanish progressively when zooming out, and individual objects or group of objects can be replaced by representations more suitable and informative for the current scale of visualization (see Figure 3 for an example). By contrast, simple mechanistic compression of pixels that is used currently in most biological network viewers makes network images uninformative. Therefore, the idea of semantic zooming in application for visualization of biological networks attracted recently some attention [34,35]. In NaviCell, the map manager can provide as many semantic zooming views of the network as needed, and creation of the corresponding images remains in the map manager's hands. The only requirements that should be met come from Google Maps: a) The size of the each semantic zooming view should be half-size (in pixels) of the previous view; b) Relative object position should not change in different zooming views; c) It is preferable that each image size (in pixels) of the semantic zooming could be divided by 256.
At the most-detailed zoom level, the image contains all details, while at the least-detailed zoom, the image provides a general top-level view of the map, hiding details and simplifying representations of individual objects. Therefore, semantic zooming consists in gradual hiding and transforming the details of information to give a meaningful abstract representation when zooming out from the detailed towards the top level view. These principles are demonstrated in the Results section using the RB/E2F pathway map example.
In the simplest case, a user can provide only the most detailed and the least detailed (top-level) view of the map. NaviCell is able to generate the intermediate images automatically, by reducing the numbers of pixels of the images without semantic zoom views. NaviCell manual contains a guide for preparation of semantic zooming images. This process is facilitated and partially automated by using the BiNoM plugin, as described in details in the manual.

Preparing biological network maps for NaviCell General requirements
There are three necessary elements for generating the NaviCell representation of a comprehensive map of molecular interactions: a) a map file in CellDesigner xml format (the master map); b) a set of semantic zooming views of the map (in PNG format); c) a simple configuration file, specifying several options for generating NaviCell files.

Preparation of map modules
In addition, a user can split the map (master map) into submaps called modules, typically defined on functional or structural basis, though any other criteria might be used. An unlimited number of separate simplified map representations that can contain subsets of the master map objects can accompany the master map in NaviCell. Each module can be represented with its own layout in the most clear and readable form. NaviCell allows accessing and shuttling between the map's modules. This option is of the utmost convenience for facilitating the map exploration as it is demonstrated in the Results section on the example of RB/E2F map.

NaviCell annotation format
NaviCell is capable to process a structured form of annotations of biological entities if the CellDesigner xml file contains such annotations in their appropriate format. In the NaviCell annotation format, annotations are structured in sections, each grouping a specific type of information.
In our examples, we used the sections "Identifiers", containing the entity's IDs, "Maps_Modules" for specifying to which module a particular entity belongs, and "References" for providing links to publication records of the entity and free comments. However, NaviCell user can introduce other (arbitrary) section names as well. Different sections are further highlighted by different colors in the NaviCell interface, making them easier to distinguish in the callouts and in the blog posts. NaviCell annotation template can be automatically inserted into the entity annotations of a CellDesigner xml file using BiNoM plugin.
If the entities of the map are not annotated or annotated using a format different from the NaviCell's, then the callouts and annotations in the blog are generated with annotations without sectioning.

NaviCell map generation
When the necessary files have been prepared, the NaviCell map is generated through a menu "BiNoM/BiNoM I/O/ Produce NaviCell maps files…" of the BiNoM Cytoscape plugin. This is done through a simple dialog window asking to indicate the location of the configuration file. NaviCell files can be generated in two modes. The simple mode produces only a local set of files with annotations as static html files. In this case, no commenting on the map's content is possible. The complete mode requires preinstalled WordPress blogging system, creating and configuring a new blog devoted to the map, and specifying credentials for a user of WordPress with administration user rights, in order to automatically generate new posts in the blog. The source xml file of the map can be made available to users for downloading from NaviCell interface: this will depend on the policy of the map manager.

Collecting user's feedback and expertise
A unique feature of NaviCell is the possibility to create and generate a web-blog for providing a discussion forum around an already existing map. The users browsing the map can view the blog entries without registering, or leave their comments after an automatic registration procedure.
NaviCell implements a mechanism for updating the blog entries when the map is modified. If an entity in the map is removed, then the corresponding blog entry is moved to archive and becomes invisible. If the annotation of an entity in the map is changed, then the corresponding blog entry is updated, and the previous version is moved into the archive together with all previous users' comments. The blog archive is accessible to users, providing traceability of all discussions and changes. This mechanism is implemented to avoid confusion in discussion if an entity's annotation has been significantly modified.

NaviCell documentation
NaviCell is accompanied by two detailed guides downloadable from the NaviCell web site.
The 'Guide for map manager and system administrator' describes in details the NaviCell installation procedure, recommendations for map construction and structuring entity annotations for the most efficient use of NaviCell, instructions for semantic zooming levels creation, recommendations for preparing map modules and using the map in NaviCell (http://navicell.curie.fr/ pages/install.html).
For NaviCell users that are interested to explore and comment the existing maps without installing NaviCell and uploading their map to NaviCell, the explanations of NaviCell layout and instructions for efficient navigation and commenting maps in NaviCell can be found in the 'Guide for user and map curator' (http://navicell.curie.fr/ pages/guide.html).
In addition, a video tutorial provides a fast introduction to NaviCell features and demonstration of maps navigation and commenting in NaviCell (http://navicell. curie.fr/pages/tutorial.html).

Example of RB/E2F map in NaviCell
To illustrate NaviCell's capabilities, we use the map of RB/E2F pathway that we created earlier [13]. We redesigned the global map layout which now reflects the cell cycle organization (Figure 2). We made four levels of semantic zooming (Figure 3) as described below in more details. The RB/E2F map has a hierarchical structure, it is divided into 16 modules as they were described earlier [13]. Each module is also represented as a separate map with a simple and readable layout. The shuttling between the master map and the module maps is provided through internal links in NaviCell ( Figure 4). The map is connected to a web blog with pre-generated posts corresponding to each map's entity or module ( Figure 5). Each post provides a full entity annotation specifying all forms of the entity, reactions in which the entity participates and the role it plays in reactions (reactant, product, catalyzer, etc.). The post can be commented by the map's users. A user can submit comments on the annotation posts in the form of hypertext enriched with images and hyperlinks. This blog is a system for knowledge exchange and active discussion between specialists in the corresponding domain and NaviCell map managers. The RB/E2F map is used for the video tutorial of NaviCell's functions available at the web-site.
Together with the RB/E2F map, NaviCell representations are provided for the map of Notch and P53 pathway crosstalk that was used in one of our projects, as well as for most large CellDesigner's maps that have been published so far. The whole collection of maps is accessible from the NaviCell's web site http://navicell. curie.fr/pages/maps.html.

Navigating a comprehensive map in NaviCell
Navigation through the map of molecular interactions in NaviCell is ensured by the standard and user-friendly engine of Google Maps, allowing scrolling, zooming, dropping down markers and showing callouts (Figure 2). Using Google Maps makes it easy to get started with NaviCell, as it is an intuitive and widely used interface. The content of the map is shown in the right-hand selection panel, which is a list of entities and map objects grouped by their types. The selection panel allows identifying and selecting the desired map element and dropping down a clickable marker. In addition the user can identify a molecular entity by using the full-text search function, which queries molecular entities by any substring in their description (by name, synonym or any word if provided, by word in the annotation). The markers do not disappear when zooming in or out, indicating the positioning of the selected map elements at all zoom levels. Clicking on a marker opens a callout that contains a short description of the selected entity, standard identifiers and hyperlinks to external databases; internal hyperlinks to related module maps; and a link to the corresponding annotation post in the blog. In addition to the information visualized in callouts, annotation posts in the blog represent modifications of the entity, reactions in which the entity participates and the field where the user's comments can be posted (see Figure 5 and below for the blog explanation).

Semantic zooming views of the RB/E2F map in NaviCell
The RB/E2F map can be navigated similarly to geographical maps through Google Maps and visualized at several zoom levels ( Figure 3). The navigation of the map starts from the top-level view ( Figure 3A) where 16 map modules with their names can be identified (analogous to countries or regions), together with the positioning of the most important molecular entities in the modules (analogue of country capitals). In addition, the cell cycle phases are indicated on the map.
In the next level, a more detailed level of zooming called "Pruned" in Figure 3B, the major or canonical cell signaling pathways are visualized. These pathways were specified by intersecting the content of the RB/E2F map with several pathway databases and selecting those entities and reactions that are "canonically" represented in those databases (see the NaviCell guide for semantic zoom levels generation explanation).
The third zoom level, the "Hidden details" level, shows the RB/E2F map with all molecular players and reactions. Small details such as the names of posttranslational modification residues, complex names, and reaction IDs are hidden at this level. In addition, the relative font sizes are increased for better readability ( Figure 3C). The fourth semantic zooming level is the most detailed view where all map elements are present ( Figure 3D).
The module background coloring appears as a context layer in the background of all levels of zooming.

RB/E2F map modules representation in NaviCell
Each molecular entity of the RB/E2F map can be found in one or more of the 16 module of the map. The participation of an entity in various modules is indicated in the protein's annotation callout by a link which leads to a separate module map(s). In addition, each module map can be accessed from the selection panel. The module map represents the module with a simplified and easy-to-read layout ( Figure 4); the right hand panel contains the list of only those entities that are contained in this module. Thus, the master map is connected to a collection of maps of modules, facilitating easily shuttling between maps. For example, the RB module map demonstrates the 'life cycle' of the RB protein (how it gets phosphorylated at different residues, in which complexes it participates, what are the regulators of the main transitions) ( Figure 4C).
Blogging in NaviCell: commenting the RB/E2F map NaviCell uses the WordPress (http://wordpress.org) webbased blog system to collect feedback from the map users. The blog contains pre-generated posts for each entity of the map as genes, proteins, complexes, reactions etc. Each post is composed of a detailed entity annotation (HUGO names, references, etc.), links to other entities in the network (internal hyperlinks) and links to other databases (external hyperlinks).
An example of a pre-generated post is shown in Figure 5 for a molecular complex composed of CDK6, Cyclin D1 and p27Kip1 proteins. The post contains annotation of the complex, list of complex forms (modifications) and reactions in which the complex participates as reactant, product, or catalyzer. Note that each globe icon in the post leads to the map, and allows selecting corresponding objects on it. For example, clicking at the globe icons in the line describing the reaction "DP2*:E2F4:p107*@nucleus → Figure 5 Annotation post in the blog for the complex CDK2: cyclin A2*:p27Kip1*. DP2*:E2F4:p107*|pho@nucleus", the markers will show either "DP2*:E2F4:p107*@nucleus" species or "DP2*:E2F4: p107*|pho@nucleus" species or the reaction itself on the map. Parallel use of the map and the blog facilitates exploring and understanding the map.
The blog system provides a feedback mechanism between the map users and map managers. Updating the map is foreseen in the following scenario. The manager of the map regularly collects the users' comments and updates the map accordingly in a series of releases. In turn, NaviCell can automatically update the blog and archive older versions of posts including users' comments, thus providing traceability of all changes on the map and simplifying map maintenance (see Figure 1).

Discussion
NaviCell is an environment for visualization and simplified usage of large-scale maps of molecular interactions created in CellDesigner. NaviCell allows demonstrating map content in a convenient way, at several scales of complexity or abstraction. In addition in provides an opportunity to comment its content, facilitating the maintenance of the maps. NaviCell is not implemented to cover the functionality of all existing network visualization tools; however, NaviCell combines several essential features together, and therefore fills an important need in the map maintenance and support process.
The use of the Google Maps interface makes it straightforward for the user to get started with NaviCell, as this interface is intuitive and already familiar to most users.
The development and application of semantic zooming principles is a unique feature of NaviCell that allows step-wise exploration of the map and helps to grasp the content of very complex maps of molecular interactions at several levels of complexity from the global map structure, through major, canonical pathways up to the most detailed level.
In addition, we propose to map managers to prepare maps with a hierarchical structure, dividing the map into submaps (modules). NaviCell provides the mechanism of shuttling between these maps, facilitating the exploration of the maps and better grasping the structure and the content of the map, especially in the case when big and complex networks are represented.
Previously CellPublisher [21] used Google Maps to construct a user-friendly environment for large CellDesigner maps, including such elements as markers, callouts, zooming. In NaviCell we significantly improved and extended the navigation functionality. Importantly, the semantic zooming, one of the unique features of NaviCell, does not exist in CellPublisher. Other distinctive features of NaviCell, compared to CellPublisher, are using markers for selecting entities on the map, decomposing the map into interconnected modules, systematic representation of entities' modifications and their roles in reactions, and a possibility to discuss each object on the map separately.
Community-based annotation of CellDesigner maps is possible using Payao, a SBGN-compliant communitybased map curation tool [26]. The tool provides map navigation functions, but the main focus of Payao is annotation of the map content. Payao has original and useful features such as tagging system and pop-up callouts allowing each curator to add comments on any component of the map. The tagsets of all curators can be visualized on the map allowing to trace the curation activity. The exchange of opinions is possible by adding comments representing a forum for discussion. Finally, Payao provides some drawing tools that allow marking an area of interest and graphically represent proposed changes that can be further analyzed by the map managers and included in the map. However the map navigation features are rather limited in Payao, compared to NaviCell, as described in the previous paragraph. In addition, using well-developed blogging software (WordPress) can give advantages in some scenario (a possibility to see the latest discussions, archiving, making RSS feeds and sending e-mail notifications).
The concept of a blog which we deliberately used for collecting the users' feedback in NaviCell provides a different paradigm of map curation and maintenance comparing to community-based map construction represented by WikiPathways [9]. In WikiPathways model, pathway maps can be created and edited by any user, whereas in NaviCell model a map manager is in charge of validating propositions of modifications. In many cases, comprehensive maps are one single person project requiring thoughtful design of the map's layout and resolving contradicting interpretations of biochemical experiments and points of view. NaviCell, unlike WikiPathways, is not designed for collective ab initio construction of the maps but, instead, allows visible and open discussion forum around an already existing map. Later the map can be modified and updated accordingly by the map manager who takes responsibility and interprets the users' comments, preventing uncontrolled map changes. Both paradigms (blog vs wiki) are of interest in the systems biology field, and can be combined in the future.
We believe that NaviCell will reinforce the interest to assemble large-scale maps of molecular interactions and present them to the community for constructive discussion. We hope that in such a way more consensual representations of the knowledge on molecular mechanisms will be achieved.
We currently work on extending NaviCell with an analytical toolbox implementing a set of methods for visualizing high-throughput data (expression measurements, protein activities, mutation profiles, etc.) on top of the molecular maps, and with tools for analyzing the map's structure in the spirit of Google Maps (for example, route finding, suggesting several alternative routes, etc).

Conclusions
NaviCell is a web-based, user-friendly and interactive environment, which can be easily used by molecular biologists. NaviCell functionality has been already tested in several concrete projects for navigation and curation of large maps of molecular interactions.
In the future, we plan to extend NaviCell into an environment under which maps will be visualized, discussed, updated and analyzed using a toolbox that is currently in development.

Availability and requirements
Project name: NaviCell Project homepage: http://navicell.curie.fr Operating system(s): Platform independent Programming language: Java, HTML, JavaScript License: GNU GPL Any restrictions to use by non-academics: license is not needed