BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models
BMC Systems Biology volume 4, Article number: 92 (2010)
Quantitative models of biochemical and cellular systems are used to answer a variety of questions in the biological sciences. The number of published quantitative models is growing steadily thanks to increasing interest in the use of models as well as the development of improved software systems and the availability of better, cheaper computer hardware. To maximise the benefits of this growing body of models, the field needs centralised model repositories that will encourage, facilitate and promote model dissemination and reuse. Ideally, the models stored in these repositories should be extensively tested and encoded in community-supported and standardised formats. In addition, the models and their components should be cross-referenced with other resources in order to allow their unambiguous identification.
BioModels Database http://www.ebi.ac.uk/biomodels/ is aimed at addressing exactly these needs. It is a freely-accessible online resource for storing, viewing, retrieving, and analysing published, peer-reviewed quantitative models of biochemical and cellular systems. The structure and behaviour of each simulation model distributed by BioModels Database are thoroughly checked; in addition, model elements are annotated with terms from controlled vocabularies as well as linked to relevant data resources. Models can be examined online or downloaded in various formats. Reaction network diagrams generated from the models are also available in several formats. BioModels Database also provides features such as online simulation and the extraction of components from large scale models into smaller submodels. Finally, the system provides a range of web services that external software systems can use to access up-to-date data from the database.
BioModels Database has become a recognised reference resource for systems biology. It is being used by the community in a variety of ways; for example, it is used to benchmark different simulation systems, and to study the clustering of models based upon their annotations. Model deposition to the database today is advised by several publishers of scientific journals. The models in BioModels Database are freely distributed and reusable; the underlying software infrastructure is also available from SourceForge https://sourceforge.net/projects/biomodels/ under the GNU General Public License.
Advances in molecular and cellular biology over the past few decades have triggered tremendous growth in available experimental data. To generate novel or insightful hypotheses from this enormous quantity of data is a significant challenge. Computational modelling can help meet this challenge by contributing to a deeper understanding of relevant chemical and biological phenomena based on their underlying mechanisms. Simulations of models can help investigate a complete biological process instead of considering smaller segments or aspects, detail a segment of a process or simplify a very large one, suggest or even direct future experiments, and predict the behaviour of a system under given conditions. Supporting these goals requires precise models that accurately represent biological systems in a quantitative manner.
To construct a large-scale comprehensive view of biological systems, several smaller models may need to be integrated. This can be difficult to accomplish, since models can exhibit significant variations even when purporting to cover the same domain space. They can come from different modellers, developed at different times from different perspectives, and be encoded in different formats. Consequently, some models cannot be practically reused, or even worse, may be entirely lost due to a lack of the necessary information that would allow them to be exchanged or converted.
The definition and adoption of standard and machine-readable formats for encoding quantitative models has already been recognised as a crucial first step for efficient exchange and reuse. CellML and NeuroML are examples of such formats, but the Systems Biology Markup Language (SBML), adopted by more than 180 software systems ranging from simulators to model editors and databases, has so far been the most successful standard model exchange format in this field.
The next stage of infrastructure development for computational modelling is the creation of public repositories where models can be freely deposited and distributed in standardised formats. Models in these repositories should be curated according to agreed-upon standards, and annotated using community-developed controlled vocabularies, for instance with Gene Ontology [5, 6] and Taxonomy [7, 8]. Linking the components to external data resources, such as protein sequences from UniProt or pathways from Reactome, can allow the unambiguous identification of the components. This in turn can enable members of the biomedical and life science communities to search and retrieve models, or parts of models, relevant to their research topics, whether that topic is a disease, a biological process, a given molecular complex, or something else.
We developed BioModels Database [11–13] precisely with these goals in mind, while other resources, such as ModelDB, JWS Online or the CellML Model Repository [16, 17], focus on different aspects. The resource is part of the BioModels.net initiative [18, 19], which aims to (1) define community standards for model and simulation curation, (2) provide controlled vocabularies to define and link the terms used in systems biology, and (3) provide a free, centralised, publicly-accessible database for storing, searching and retrieving curated and annotated computational models. Here we describe the current structure of BioModels Database as well as its use.
BioModels Database design and procedures
The BioModels Database server software uses a typical three-tier architecture, in which the data storage, processing and presentation are logically separated. The programming language used for the main development is Java, while some conversion-related processes (described below) are implemented using a combination of Extensible Stylesheet Language Transformations (XSLT) and shell scripts.
Authentication and User Roles
There are four main types of user roles defined in BioModels Database: The Public user role is assigned by default, and requires no registration. Public users can access the database to search, view and download models, as well as to run simulations. They can also submit models. The remaining roles of Curator, Annotator and Administrator are used by database curators and developers; they provide additional permissions beyond those of Public user and require specific registration.
Quantitative information, kinetic laws and model entities are all stored in SBML files. Model metadata are stored separately in a MySQL database and not in the SBML file. This simplifies the management of annotations, especially during the annotation phase when the data are updated frequently. Annotations are re-inserted into the SBML files during the release process, allowing users to directly download the fully annotated model files. Moreover, each model's history is tracked using Subversion.
An earlier version of BioModels Database stored model XML files in Xindice, a native XML database. However, the increasing popularity of BioModels Database as well as an ever-greater number of models exceeding the file-size limit of Xindice forced us to redesign the system. The current version parses and builds indexes of model elements (e.g., name, identifier and notes) using Apache Lucene. Queries based on these indexes are fast and efficient in terms of server memory and CPU usage.
The metadata for all models (submission date, modification date, model format, authors' information, etc.), including references, are stored in a set of relational tables in the MySQL database (Figure 1). The design of these tables also reflects the different stages of the BioModels Database pipeline.
Servers and Mirrors
BioModels Database runs on a server cluster configured with a failover mechanism. Faster access for North American users is provided through a mirror at the California Institute of Technology.
The converters from SBML to CellML and to SciLab, as well as the converter from CellML to SBML, are written in XSLT. The converters from SBML to XPP, BioPAX [31, 32], Portable Network Graphics (PNG) [33, 34] and Scalable Vector Graphics (SVG) are written in Java. The Java converter from SBML to the Virtual Cell Markup Language (VCML) is provided by the Virtual Cell team. Converter details and source code are available online.
From the model overview interface, one can select species, reactions and/or compartments in order to generate a submodel containing these specific elements. This function relies on a Java library developed by the BioModels Database team. The library parses a model and extracts the submodel in a four-step procedure. Firstly, it extracts the species and compartments selected by the user, along with the reactions they are involved in; secondly, it fetches the selected reactions; thirdly, it retrieves the species and compartments involved in all previously obtained reactions; finally, it extracts the compartment types, species types, rules, events, parameters, units, and function definitions needed to build a model which is valid SBML.
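As an illustration only, the four-step extraction can be sketched in Python as follows. The Model and Reaction classes and the extract_submodel function are hypothetical simplifications introduced for this sketch; they do not reproduce the actual Java library. The sketch also assumes that only selected species (not selected compartments) pull in reactions during the first step, and it reduces the fourth step to collecting reaction parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    """Toy reaction: an identifier plus the species it involves."""
    id: str
    species: frozenset                   # reactants, products and modifiers
    parameters: frozenset = frozenset()  # parameters used by its kinetic law


@dataclass
class Model:
    """Toy stand-in for an SBML model (hypothetical, not a libSBML class)."""
    compartment_of: dict                           # species id -> compartment id
    reactions: dict = field(default_factory=dict)  # reaction id -> Reaction


def extract_submodel(model, species=(), compartments=(), reactions=()):
    """Return the identifiers of the elements that make up the submodel."""
    # Step 1: keep the user's species and compartments, plus every
    # reaction in which a selected species takes part.
    kept_species = set(species)
    kept_compartments = set(compartments)
    kept_reactions = {
        r.id for r in model.reactions.values() if r.species & kept_species
    }
    # Step 2: add the reactions the user selected explicitly.
    kept_reactions |= set(reactions)
    # Step 3: pull in every species of the kept reactions, and the
    # compartments in which those species are located.
    for rid in kept_reactions:
        kept_species |= model.reactions[rid].species
    kept_compartments |= {model.compartment_of[s] for s in kept_species}
    # Step 4 (simplified): collect supporting definitions; only the
    # parameters of the kept reactions are handled in this sketch.
    kept_parameters = set()
    for rid in kept_reactions:
        kept_parameters |= model.reactions[rid].parameters
    return {
        "species": kept_species,
        "compartments": kept_compartments,
        "reactions": kept_reactions,
        "parameters": kept_parameters,
    }
```

Note that species added in the third step are not used to pull in further reactions, which keeps the submodel limited to the immediate neighbourhood of the user's selection.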
The BioModels Database pipeline
The BioModels Database pipeline (Figure 2) manages all models from their submission to their publication. Models submitted to the database are not made publicly visible immediately upon submission; rather, they undergo a series of curation and annotation processes in order to ensure a consistent level of quality. As models pass through the pipeline, additional information is added to facilitate their reuse and the ability of software tools to perform functions such as searching, simulation, conversion or merging.
Model submission is open to the public. BioModels Database currently accepts models encoded in SBML as well as CellML format.
BioModels Database currently only distributes models published in peer-reviewed literature. During the submission process, submitters are required to provide an appropriate publication reference. This reference can be a PubMed Identifier (PMID), a Digital Object Identifier (DOI), or a URL. The publication reference helps other users to identify the model; it also improves the accuracy of the database search engine. BioModels Database automatically searches for the reference in CiteXplore and stores the retrieved data, including journal details, authorship, and abstract, in its internal indexes. Models not yet published will not have a publication reference; however, they can still be submitted to the database, and the reference data can be added later by the curators.
BioModels Database performs numerous consistency checks during the model submission procedure. The model must be syntactically valid XML, as well as valid with respect to its encoding schema. Errors detected during submission are reported to the submitter. A submission is successful only after all errors have been corrected. Following successful submission, a confirmation email notifies the submitter of the unique submission identifier assigned to the model. Each such identifier is composed of the character sequence "MODEL" followed by ten digits extracted from the timestamp of submission, and being unique and perennial, the identifier can be quoted in subsequent publications. A notification is also sent to the BioModels Database curator team to inform them that a new model has entered the curation pipeline.
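The first of these checks, syntactic validity of the XML, can be illustrated with a minimal Python sketch. The function name check_well_formed is hypothetical, and the sketch covers well-formedness only; validity with respect to the encoding schema requires a dedicated validator and is not shown.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The parser message contains the line and column of the problem,
        # which is the kind of detail that can be reported to a submitter.
        return False, str(err)
    return True, None
```

A submission workflow along these lines would report the returned message to the submitter and accept the file only once no error remains.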
We distinguish several "actors" in the process from model creation to its publication on BioModels Database:
The model's author(s) is (are) the author(s) of the reference citation (i.e., the peer-reviewed article from which the model originates). Concerns regarding the biological basis of the model (e.g., the presence of an interaction not documented in the scientific literature, or behaviour differing from that expected of the biological process) should be directed to the model's author(s).
The model's encoder(s) is (are) the person(s) who actually encoded the model in its present form. There may be several encoders, including the BioModels Database curators if they have to modify a model significantly. The encoder(s) should be contacted if there is a problem with the structure of the model (initial conditions, kinetic parameters, reaction scheme, etc.).
BioModels Database also defines the model submitter(s) as the person or persons who submitted the model to the repository. The submitter(s) should be contacted if there is a problem with the original model encoding or annotation.
Any specific concerns about a model can also be reported to the BioModels Database curators through an online form provided for this purpose on the website.
Successfully-submitted models are then queued into the curation pipeline, where several tasks are performed.
If a model is submitted in an old level or version of the SBML format, BioModels Database curators will convert it to the latest level and version of SBML, unless the curators believe that such conversion is likely to cause information loss or inaccuracies. Models submitted in CellML are converted to the latest SBML level and version since the BioModels Database software (annotation interface, simulation tool, etc.) is built around the SBML standard.
Further consistency checks are performed on the model using libSBML and SBMLeditor. This includes checks for identifier and unit consistency as well as for mathematical expression validity (more specifically, MathML validity), among others.
Curators manually check that the encoded model faithfully represents the model described in the reference publication. This includes verifying the structure of the model, such as the relationships between variables and mathematical relationships, as well as the nomenclature used in the model components. It is important to emphasise the fact that, during this step, the structure of the submitted models may be modified by the encoders to reflect the structure of the model described in the paper.
Curators download the model and run simulation experiments under the conditions defined in the reference publication. These tasks are performed using several simulation tools, at least one of them being different from the tools used by the original authors of the model. (The latter requirement helps guard against software-specific behaviours or hidden dependencies.) The tools most commonly used are COPASI [45, 46], the SBML ODESolver or the facilities provided by the Systems Biology Workbench. If the results cannot be reproduced, curators contact the model author(s) for clarification or discussion regarding any issues that have arisen. Once the results correspond to the paper, curators upload a typical results set to the database, together with comments on how, and with which tools, it was obtained.
The curators give the model a consistent and meaningful name following the general scheme AuthorYear_Topic_Approach. Examples include the names Levchenko2000_MAPK_noScaffold (referring to the model identified by the BioModels identifier "BIOMD0000000011") and Edelstein1996_EPSP_AChEvent (referring to the model "BIOMD0000000001").
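To make the naming scheme concrete, the following Python sketch splits a model name into its parts. The regular expression and the function parse_model_name are assumptions made for illustration: the scheme is described above only as a general convention, so real names may deviate from the pattern used here.

```python
import re

# Assumed shape: letters for the author, a four-digit year, then
# underscore-separated topic and (optional) approach.
NAME_PATTERN = re.compile(r"(?P<author>[A-Za-z]+)(?P<year>\d{4})_(?P<rest>.+)")


def parse_model_name(name):
    """Split a model name into author, year, topic and approach.

    Returns None when the name does not follow the assumed pattern.
    """
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        return None
    topic, _, approach = match.group("rest").partition("_")
    return {
        "author": match.group("author"),
        "year": int(match.group("year")),
        "topic": topic,
        "approach": approach or None,
    }
```

Applied to Levchenko2000_MAPK_noScaffold, this yields the author Levchenko, the year 2000, the topic MAPK and the approach noScaffold.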
After the curation phase, a model is moved into one of two branches depending on the outcome of curation as well as certain other criteria. In the curated branch, models are compliant with the MIRIAM (Minimum Information Required in the Annotation of Models) reporting guidelines. MIRIAM compliance requires models to (1) be encoded in a public standard format, (2) be clearly related to a single reference, (3) correspond to the biological processes listed in the reference publication, and (4) produce the simulation results given in the reference publication using the same values and parameters. Models placed in the curated branch satisfy these requirements because, respectively, (1) each model is converted into SBML and validated, (2) each comes from a peer-reviewed published article, and each is verified by the curators to correspond to its reference description in both (3) structure and (4) results. By contrast, the non-curated branch is reserved for models that are valid SBML but either do not satisfy the full requirements for MIRIAM compliance, or have not been curated fully due to the limited resources of the BioModels Database curation team. For example, non-kinetic models such as pathway and interaction maps, as well as steady-state analysis models, are stored in this branch because it is generally not possible to verify their results using simulations. Other models that are placed in the non-curated branch include spatial and boolean models that contain proprietary annotations needed for their interpretation, and models that do not reproduce the required results.
Once a model is moved to the curated branch, a new BioModels Database identifier is generated and assigned to it. This identifier is composed of the character sequence "BIOMD" followed by ten digits reflecting the model's position in the branch, for example "BIOMD0000000216" for the 216th model successfully curated. As is the case for submission identifiers, curation identifiers are unique and permanent, and will never be re-assigned to a different model, even if for some reason a particular model must be retracted from the database.
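The branching rule described above can be summarised in a short Python sketch. The CurationOutcome class, its field names and the assign_branch function are hypothetical; in practice the decision is made by curators and also depends on criteria (such as available curation resources) that are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class CurationOutcome:
    """Hypothetical record of the four MIRIAM checks listed above."""
    valid_sbml: bool                    # (1) encoded in a public standard format
    has_single_reference: bool          # (2) clearly related to a single reference
    matches_reference_structure: bool   # (3) corresponds to the described processes
    reproduces_reference_results: bool  # (4) reproduces the published results


def assign_branch(outcome):
    """Return 'curated' or 'non-curated' for a model that is valid SBML."""
    if not outcome.valid_sbml:
        # Both branches hold valid SBML only; anything else is rejected
        # at submission and never reaches this point.
        raise ValueError("model must be valid SBML to enter either branch")
    fully_compliant = (
        outcome.has_single_reference
        and outcome.matches_reference_structure
        and outcome.reproduces_reference_results
    )
    return "curated" if fully_compliant else "non-curated"
```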
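The two identifier formats lend themselves to a small Python sketch. The function names are hypothetical. Only the overall shape of the identifiers ("MODEL" or "BIOMD" followed by ten digits) is taken from the text; how the ten digits of a submission identifier are derived from the timestamp is not specified, so that step is not reproduced.

```python
import re

# "MODEL" (submission) or "BIOMD" (curation) followed by exactly ten digits.
IDENTIFIER_PATTERN = re.compile(r"(MODEL|BIOMD)\d{10}")


def curation_identifier(position):
    """Build the curation identifier for the model at a given position."""
    if not 1 <= position <= 9_999_999_999:
        raise ValueError("position must fit in ten digits")
    return f"BIOMD{position:010d}"


def identifier_kind(identifier):
    """Return 'submission', 'curation', or None for an unrecognised string."""
    match = IDENTIFIER_PATTERN.fullmatch(identifier)
    if match is None:
        return None
    return "submission" if match.group(1) == "MODEL" else "curation"
```

For the 216th curated model, curation_identifier(216) gives "BIOMD0000000216", matching the example above.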
MIRIAM compliance requires a model to have (1) a unique meaningful name, (2) a reference citation linking the model to a unique publication, (3) the name and contact information of the model author(s), (4) the date and time of model creation and last modification, and (5) a precise statement about the terms of distribution. This information is generally the first to be added to a model during the annotation phase.
In publications describing models, the different elements such as specific genes, proteins and metabolites, or the organisms from which the model is derived, are often described in the text or just given convenient or biologically non-meaningful names without any links to reference external resources. Furthermore, the names of elements in the models most often do not allow users to directly relate them to a precise biological function or physical entity. This can greatly diminish model interpretability by both users and software tools. Annotating model elements helps avoid these problems, allowing unambiguous identification through reference to appropriate external resources (using perennial URIs), such as terms from controlled vocabularies (Taxonomy, Gene Ontology, ChEBI ontology, Enzyme Nomenclature, etc.) and links to other databases (UniProt, KEGG, Reactome, etc.). In order to enhance the semantics of models, terms from the Systems Biology Ontology (SBO) may also be added in the annotation phase. Theoretically, all resources listed in MIRIAM Resources could be used for annotating model elements. Annotations are used to improve the accuracy of search procedures, as well as provide additional information about the model components. They can also be useful in users' analyses of the models, for instance in clustering or merging procedures.
Annotating each model component with the most relevant resource terms requires great effort, especially since the number of submitted models has grown rapidly. The 17th release of BioModels Database (April 27th 2010) contains 18,950 cross-references (links to external resources contained in the annotations). This is a modest number when compared to the total number of species (37,852) and reactions (44,886) involved in the existing 473 models. This mostly reflects the lack of annotation in the non-curated branch, which is mainly due to limited curator resources (in terms of both time and specific knowledge about the models). Low annotation is also sometimes caused by a lack of adequate or suitable resources, as in the case of molecular entities that exist in a model only for simulation purposes. Moreover, biological data resources are often slightly lagging behind newly generated knowledge, and it is possible that a particular resource does not offer the relevant information at the time the model is annotated. In the case of hierarchical controlled vocabularies, such as Gene Ontology or ChEBI, there is the option to use a term at a higher level of abstraction if the required precise term does not currently exist. Most often, curators nevertheless find ways of adding some information, even if not in an optimal fashion. Model annotation needs to be, and indeed is, a continuous process.
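The comparison of these figures can be made explicit with a few lines of Python. This is a rough indicator only: cross-references may also be attached to elements other than species and reactions, and a single element may carry several, so the ratio below is not the fraction of annotated elements.

```python
# Figures quoted above for the 17th release (April 27th 2010).
cross_references = 18_950
species = 37_852
reactions = 44_886

# Species and reactions together form the bulk of annotatable elements.
elements = species + reactions

# Average number of cross-references per species or reaction.
references_per_element = cross_references / elements
```

With these figures, elements equals 82,738 and references_per_element is approximately 0.23, that is, fewer than one cross-reference for every four species or reactions.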
Following the curation and annotation phases, the final stage in the model processing pipeline is model publication. The model is tagged as ready for publication, and becomes publicly available online with the next release of the BioModels Database. New releases of the database are issued two to four times per year.
The wide range of features provided by BioModels Database allows users to quickly locate models relevant to them, analyse them (and understand their structures), simulate them, extract submodels, or download them in various formats (whether text-based or graphical). These facilities are available via a web browser, or can be directly accessed from other tools by using the accompanying web services.
The most basic way of finding a particular model is to identify it from the list of available models. Links to the lists of curated models and non-curated models can be found on the homepage of BioModels Database; the same links are also available from the menu at the top of each page of the site. The lists are presented as tables whose columns display several different model characteristics; within these tabulated views, a user can sort the list of models by model identifier, model name, publication identifier or the date of last modification, by clicking the appropriate column heading.
An alternative to simply browsing the lists of models is available in the form of a tree-structured browser based on the Gene Ontology (GO) terms used in the annotation of models in the database. A navigable, pruned, subtree of GO is automatically generated by the system, allowing users to explore the database thematically. The parenthesised number that appears next to each branch of the GO tree indicates how many models within that branch contain that particular GO term. Expanding the GO tree branch allows a user to drill down to child terms and find models annotated with those more specific GO terms (Figure 3). The extensive GO term coverage within BioModels Database is illustrated in Figure 4.
Search and Retrieval
BioModels Database incorporates a powerful search engine that allows users to quickly locate models of interest. In order to find relevant models, the algorithm performs several searches based on different data, then performs an inclusive disjunction (OR) to combine the results (Figure 5). The searches are performed sequentially as follows: (1) querying metadata, publications and annotations, (2) searching the model bodies, and (3) searching supplementary information from external resource databases. More specifically:
The search begins with the metadata and annotations of all models in the database. Model metadata facilitate the understanding, characterisation and management of a model; they may include its name, identifier, timestamp, comments from curator(s), etc. The annotations of models include publication information, authors, terms from controlled vocabularies, and links to external resources. Metadata and annotations are expected to best reflect the nature of a model, since they represent a verified mixture of curator input and algorithmic import.
The next step consists of searching through the SBML files of the models. For example, the 'notes' fields are examined, as they usually contain some information describing the model elements to which they are attached.
Finally, because it is impractical for BioModels Database to duplicate and keep up-to-date all relevant information available from model cross-references, several external databases are searched on demand through direct connection or using web services. During this step, the search engine checks available supplementary information such as synonyms and detailed synopsis.
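The way the three searches are combined can be sketched in Python as a union of result sets. Everything in this sketch is an assumption made for illustration: the function name, the dictionary-based stores and the case-insensitive substring matching stand in for the Lucene indexes and the on-demand external queries of the real system.

```python
def search_models(query, metadata, bodies, external):
    """Return the identifiers of all models matched by any of three searches.

    metadata: model id -> text of its metadata, publication and annotations
    bodies:   model id -> text taken from the model file (e.g. notes)
    external: model id -> list of supplementary strings (e.g. synonyms)
    """
    needle = query.lower()
    # (1) metadata, publications and annotations
    hits_metadata = {m for m, text in metadata.items() if needle in text.lower()}
    # (2) model bodies
    hits_bodies = {m for m, text in bodies.items() if needle in text.lower()}
    # (3) supplementary information from external resources
    hits_external = {
        m for m, terms in external.items()
        if any(needle in term.lower() for term in terms)
    }
    # Inclusive disjunction (OR): a model is returned if any search found it.
    return hits_metadata | hits_bodies | hits_external
```

Because the results are combined with OR, a model described only through a synonym held in an external resource is still returned, even if neither its metadata nor its body mentions the query term.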
The system performs some post-processing of the search output in order to deliver better results for user consumption. For example, when the user performs a search using a taxonomic term, the engine traces the whole hierarchy in order to find related models. This means that a search based on the term mammalia will return not only models associated with mammalia, but also models annotated with its descendants and ancestors (Figure 6). The logic of this is that a model describing, say, a system of Homo sapiens, or of Rattus norvegicus, is a model describing a system of mammalia. Similarly, a model that is valid for all metazoa or all vertebrata will be valid for mammalia too.
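The hierarchy traversal can be illustrated with a small Python sketch. The PARENT table is a heavily simplified lineage written for this example (intermediate taxonomic ranks are omitted), and the function names and model identifiers used in the usage note are hypothetical.

```python
# Simplified child -> parent links; real taxonomies have many more levels.
PARENT = {
    "Homo sapiens": "Mammalia",
    "Rattus norvegicus": "Mammalia",
    "Gallus gallus": "Vertebrata",
    "Mammalia": "Vertebrata",
    "Vertebrata": "Metazoa",
}


def ancestors(term):
    """All terms above the given term in the hierarchy."""
    found = set()
    while term in PARENT:
        term = PARENT[term]
        found.add(term)
    return found


def descendants(term):
    """All terms below the given term in the hierarchy."""
    return {child for child in PARENT if term in ancestors(child)}


def search_by_taxon(term, annotations):
    """Return ids of models annotated with the term, a descendant or an ancestor.

    annotations: model id -> taxonomic term used to annotate the model
    """
    related = {term} | ancestors(term) | descendants(term)
    return {model for model, taxon in annotations.items() if taxon in related}
```

With annotations such as {"a": "Homo sapiens", "b": "Metazoa", "c": "Gallus gallus"}, a search for Mammalia returns models a and b but not c, since Gallus gallus is neither a descendant nor an ancestor of Mammalia.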
Models can also be retrieved directly by using either of the two permanent and unique identifiers assigned to the model: the submission identifier, and the curation identifier.
The model presentation page provides access to all of the information stored about a given model, as well as all the system actions available to the user (Figure 7). Elements are hyperlinked between the different views in the presentation of the model. In addition, each annotation is hyperlinked to detailed information about the annotated entity. When an annotation links to an external data resource, the contents of the linked-to resource entry are displayed in a new window in the user's web browser.
Within the model presentation page for a given model, the detailed description is separated into categories organised into a set of six corresponding tabs (area 3 in Figure 7):
The Model tab displays general information about the model and its creation. The uppermost region of the tab summarises the peer-reviewed, published article that describes the model. In the region below the publication information, a link provides access to the file originally submitted, as well as information about the encoders and the dates and times of model creation and last modification. Annotations displayed with the model refer to the model as a whole and indicate such things as the biological processes being modelled or the taxonomic coverage of the model.
The Overview tab provides quick access to all the model components, that is, the mathematical relationships, physical entities, parameters and other elements comprising the model. Users can select components of interest, and that selection is subsequently reflected when they view the other tabbed panels. Clicking 'Create a submodel with selected elements' generates a model subset containing the selected components and all the components necessary to build a valid SBML model. This submodel is displayed in a new tab, where a link is available to allow the user to download it.
The Math tab lists all of the mathematical constructs used to describe the relationships and the time evolution of the model's variables. These constructs include reactions, events, and explicit mathematical formulae (SBML rules). Each construct is accompanied by a rendering of the mathematical equation, as well as relevant hyperlinked annotations.
The Physical entities tab lists the spatiotemporal entities (i.e., compartments and entity pools) contained in the model, along with their initial quantities and relevant annotations.
The Parameters tab lists all parameters used in mathematical expressions. Parameters whose scope is limited to a reaction are grouped together. Parameters whose values are determined by mathematical expressions are linked to the relevant portion of the Math tab.
The Curation tab displays representative curation results, obtained by the curators by simulating the model under the conditions defined in the reference publication. This tab includes graphical plots and comments from the curator.
The SBML formats menu (area 2 in Figure 7) allows a user to download the model in various versions of SBML. The version used to produce the curation figures is emphasised to indicate it is the only one tested by the curators. The other SBML versions are generated by an automatic conversion process.
The Other formats menu provides access to other (non-SBML) model representation formats, such as CellML, BioPAX [31, 32], and the Virtual Cell Markup Language (VCML). To permit a given model to be simulated conveniently, BioModels Database also provides downloadable configuration files for open tools such as XPPAUT [57, 58] and SciLab. Finally, a human-readable report in the Portable Document Format (PDF), produced using the SBML2LaTeX tool, is also available from the same menu.
The Actions menu provides access to graphical representations of the model's reaction networks, in the form of both static images (Portable Network Graphics, PNG, and Scalable Vector Graphics, SVG) and dynamic (interactive Java applet) presentations. A utility to convert graphs into the Systems Biology Graphical Notation (SBGN) is currently being developed. The Actions menu also provides access to the online simulation tools, described below.
BioModels Database embeds SOSlib to provide a basic online simulation tool. A given model can be simulated using this facility by selecting the 'BioModels Online Simulation' item from the Actions menu (area 2 in Figure 7). Once the user selects the species to be displayed and the duration for which the simulation should be performed, the simulation task is submitted to a computing cluster on the server side. The results of the simulation are returned in both graphical and textual form. For many models, an additional and more flexible simulation tool is available thanks to a collaboration between BioModels Database and JWS Online. The JWS Online simulation system is available from the Actions menu.
Model of the Month
Every month, a modeller picks a model of his/her choice and writes a short article that elaborates on the model. The article places the model in its biological and theoretical background and discusses its structure and the results of its simulation. This article is then published on the BioModels Database website as a Model of the Month. Such articles make selected models more easily accessible to beginners, and may help them understand their context and significance.
BioModels Database provides web services with a range of features to enable other software to programmatically search and retrieve up-to-date models and their associated data, and to extract submodels. For example, tools such as the Virtual Cell, CellDesigner or the Systems Biology Workbench use these services to provide their users direct access (from within their tool) to hundreds of models. The services available are defined in a Web Services Description Language (WSDL) file that enables software to easily understand available functions and their usage. BioModels Database web services use the Simple Object Access Protocol (SOAP) to encode requests and responses. This allows standardised communication through HTTP without the hindrance caused by proxies and firewalls. The complete list of available methods, as well as a Java library and the associated documentation, are provided on the BioModels Database website.
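To give an idea of what a SOAP-encoded request looks like, the Python sketch below builds a request envelope without sending it. The operation name getModelById, the parameter name id and the service namespace are placeholders invented for this example; the actual operation names and namespaces are those declared in the WSDL file. The sketch also assumes the SOAP 1.1 envelope namespace, since the protocol version is not stated above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 (assumed)
SERVICE_NS = "urn:example:model-service"                # placeholder namespace


def build_request(operation, **parameters):
    """Return a SOAP envelope, as a string, calling one operation."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element lives in the namespace of the service.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in parameters.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: request one model by its curation identifier.
request_xml = build_request("getModelById", id="BIOMD0000000011")
```

The resulting string would be sent as the body of an HTTP POST to the service endpoint. In practice, client libraries generate such envelopes automatically from the WSDL, so they rarely need to be written by hand.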
BioModels Database has become a recognised database in the computational systems biology field. It now contains an appreciable number of models, and indeed as far as we are aware, it is the largest public database of its kind today. On April 27, 2010, BioModels Database announced its 17th Release, allowing freely available public access to 249 curated and 224 non-curated models. The number of models deposited in BioModels Database has nearly doubled on a yearly basis (Figure 8) since its inception in 2005. The number of reactions, species and annotations has increased even faster as a consequence of the fact that larger and more complex models are being submitted.
The models stored in BioModels Database come from several sources. Modellers can submit their own models for inclusion in the database. In addition, BioModels Database curators encode many models from published journal articles. Other models come from exchanges with other collaborative model repositories, such as the former SBML model repository (Caltech, USA), JWS Online [61, 69], the Database Of Quantitative Cellular Signaling (DOQCS) [70, 71], and the CellML repository.
Several publishers of scientific journals recommend model submission to BioModels Database, including Nature Publishing Group, Public Library of Science, and BioMed Central. Following deposition, authors can quote the unique model identifier in their paper, allowing readers to download the model as soon as the paper is published. Some other journals, as part of their peer-review process, advise authors to deposit their computational models into other databases. JWS Online is used for this purpose by the journals Microbiology, FEBS Journal, IET Systems Biology, and Metabolomics. Those models are incorporated into BioModels Database after conversion from their native JWS Online format.
At present, BioModels Database focuses on storing models that can be encoded in SBML. Typically, these models represent activities, interactions or other dynamic phenomena in biochemical networks. BioModels Database also accepts models based on other quantitative approaches, such as steady-state models, and on qualitative approaches, such as logical models. However, these model types are mostly placed in the non-curated branch: a crucial part of the curation process is verifying that a model reproduces the exact numerical results reported in its reference article, and we currently have no procedures for doing so for these other model types.
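The reproduction check at the heart of curation can be sketched in a few lines of Python. This is a minimal illustration and not the curators' actual procedure: the toy model (first-order decay of a species A), the "reported" values, the fixed-step RK4 integrator and the relative tolerance are all assumptions made for the example.

```python
import math


def simulate_decay(a0, k, t_end, dt=0.01):
    """Integrate dA/dt = -k*A with the classical RK4 scheme; return A(t_end)."""
    def rate(a):
        return -k * a

    a = a0
    for _ in range(round(t_end / dt)):
        k1 = rate(a)
        k2 = rate(a + 0.5 * dt * k1)
        k3 = rate(a + 0.5 * dt * k2)
        k4 = rate(a + dt * k3)
        a += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return a


def reproduces(simulated, reported, rel_tol=1e-3):
    """True if every reported value is matched by the simulation within rel_tol."""
    return all(
        t in simulated and math.isclose(simulated[t], value, rel_tol=rel_tol)
        for t, value in reported.items()
    )


if __name__ == "__main__":
    # Hypothetical values "read from a paper": A(t) for A(0) = 10 and k = 0.5.
    reported = {1.0: 6.065, 2.0: 3.679, 4.0: 1.353}
    simulated = {t: simulate_decay(10.0, 0.5, t) for t in reported}
    print("model reproduces reported results:", reproduces(simulated, reported))
```

A model encoded with a wrong parameter (for example k = 0.6 in this toy case) fails the same check, which is the kind of discrepancy such a comparison is meant to reveal.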
Future development plans
We envision several improvements and additions to BioModels Database and its facilities. Planned developments include:
Implementation of a versioning system that lets users retrieve and compare different revisions of a given model, including its annotations. This is a much-needed feature, especially for efforts like the Minimum Information About A Simulation Experiment (MIASE), which aims at enabling the reproducible description of simulation experiments.
Improvements to the embedded search engine. One such improvement will be the introduction of a relevance ranking scheme for retrieved models, based on their annotations and data stored by external resources.
Introduction of an annotation helper tool that will suggest appropriate annotations to the curator. Such a feature can incorporate tools such as semanticSBML, SAINT, or libAnnotationSBML.
Distribution of more information with the models. We envision providing SED-ML files in the future. This will allow users to download machine-readable descriptions of the simulation experiments realised during the course of the work that led to the publication of the model.
Computational models are becoming ever more important in various aspects of the life sciences. This is reflected in the vast increases in both the number and the complexity of quantitative kinetic models in BioModels Database (Figure 8). This in turn necessitates the ability to reuse model components, and to build upon pre-existing models. BioModels Database was designed to address these needs.
BioModels Database is a freely available resource for storing curated and annotated versions of peer-reviewed, quantitative models of biological interest. Models are distributed in several forms, ranging from standard model file formats to graphical notations. Besides the analysis tools built into the web interface, BioModels Database offers a variety of useful features and tools to enable other software to programmatically search and retrieve models or submodels, construct large models from components, and access additional up-to-date information. Because the models stored in the database are thoroughly curated by humans, they can be used for teaching purposes, or to study specific biological processes. Moreover, since the models cover a wide range of domains, the whole set can be used for development and testing of simulation tools.
The BioModels Database pipeline, which encompasses the curation and annotation processes, ensures the correctness and quality of the models. The pipeline checks syntactic correctness, logical model composition and the accurate capture of biological information, and confirms that the published model will, within reasonable bounds, reproduce the behaviour attributed to it. Together with the cross-references that are embedded into each model, this provides the community with reliable and reusable models.
Availability and licensing
BioModels Database itself is an open-source project; the software is distributed under the GNU General Public License. The database schema and code for both Web Application and Web Services are available from the BioModels SourceForge repository. All converters are also available under the same license. This permits anyone to download and install a local version of the complete system, which may be useful for those who wish to store their own models privately or to integrate part or all of the system into their own software infrastructure.
Lloyd CM, Halstead MDB, Nielsen PF: CellML: its future, present and past. Progress in Biophysics and Molecular Biology. 2004, 85: 433-450. 10.1016/j.pbiomolbio.2004.01.004
Goddard NH, Hucka M, Howell F, Cornelis H, Shankar K, Beeman D: Towards NeuroML: model description methods for collaborative modelling in neuroscience. Philos Trans R Soc Lond B Biol Sci. 2001, 356 (1412): 1209-1228. 10.1098/rstb.2001.0910
Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A, Cuellar AA, Dronov S, Gilles ED, Ginkel M, Gor V, Goryanin II, Hedley WJ, Hodgman TC, Hofmeyr JH, Hunter PJ, Juty NS, Kasberger JL, Kremling A, Kummer U, Le Novère N, Loew LM, Lucio D, Mendes P, Minch E, Mjolsness ED, Nakayama Y, Nelson MR, Nielsen PF, Sakurada T, Schaff JC, Shapiro BE, Shimizu TS, Spence HD, Stelling J, Takahashi K, Tomita M, Wagner J, Wang J: The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003, 19 (4): 524-531. 10.1093/bioinformatics/btg015
Systems Biology Markup Language (SBML). http://sbml.org/
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Sherlock GMRG: Gene Ontology: tool for the unification of biology. Nature Genetics. 2000, 25: 25-29. 10.1038/75556
Consortium GO: The Gene Ontology project in 2008. Nucleic Acids Research. 2008, 36: D440-D444. 10.1093/nar/gkm883
Phan IQ, Pilbout SF, Fleischmann W, Bairoch A: NEWT, a new taxonomy portal. Nucleic Acids Research. 2003, 31: 3822-3823. 10.1093/nar/gkg516
Sayers E, Barrett T, Benson D, Bryant S, Canese K, Chetvernin V, Church D, DiCuccio M, Edgar R, Federhen S, Feolo M, Geer L, Helmberg W, Kapustin Y, Landsman D, Lipman D, Madden T, Maglott D, Miller V, Mizrachi I, Ostell J, Pruitt K, Schuler G, Sequeira E, Sherry S, Shumway M, Sirotkin K, Souvorov A, Starchenko G, Tatusova T, Wagner L, Yaschenko E, Ye J: Database resources of the National Center for Biotechnology Information. Nucleic Acids Research. 2009, 37: D5-D15. 10.1093/nar/gkn741
The UniProt Consortium: The Universal Protein Resource (UniProt). Nucleic Acids Research. 2009, 37: D169-D174. 10.1093/nar/gkn664
Matthews L, Gopinath G, Gillespie M, Caudy M, Croft D, de Bono B, Garapati P, Hemish J, Hermjakob H, Jassal B, Kanapin A, Lewis S, Mahajan S, May B, Schmidt E, Vastrik I, Wu G, Birney E, Stein L, D'Eustachio P: Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Research. 2009, 37: D619-D622. 10.1093/nar/gkn863
Le Novère N, Bornstein B, Broicher A, Courtot M, Donizelli M, Dharuri H, Li L, Sauro H, Schilstra M, Shapiro B, Snoep JL, Hucka M: BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Research. 2006, 34: D689-D691.
In pursuit of systems. Nature. 2005, 435: http://www.nature.com/nature/journal/v435/n7038/full/435001a.html
BioModels Database. http://www.ebi.ac.uk/biomodels/
Hines ML, Morse T, Migliore M, Carnevale NT, Shepherd GM: ModelDB: A Database to Support Computational Neuroscience. J Comput Neurosci. 2004, 17: 7-11. 10.1023/B:JCNS.0000023869.22017.2e
Snoep JL, Olivier BG: Java Web Simulation (JWS); a web based database of kinetic models. Mol Biol Rep. 2002, 29 (1-2): 259-263. 10.1023/A:1020350518131
Lloyd CM, Lawson JR, Hunter PJ, Nielsen PF: The CellML Model Repository. Bioinformatics. 2008, 24: 2122-2123. 10.1093/bioinformatics/btn390
CellML Model Repository. http://models.cellml.org/
Le Novère N: Model storage, exchange and integration. BMC Neuroscience. 2006, 7 (Suppl 1): S11- 10.1186/1471-2202-7-S1-S11
BioModels.net Initiative. http://biomodels.net/
Sun Microsystems Java programming language. http://java.sun.com/
XSL Transformations (XSLT) Version 1.0. http://www.w3.org/TR/xslt
JavaServer Pages Technology. http://java.sun.com/products/jsp/
Extensible Markup Language (XML) 1.0. http://www.w3.org/TR/xml/
Apache Xindice. http://xml.apache.org/xindice/
Apache Lucene. http://lucene.apache.org/
BioModels Database: mirror at the California Institute of Technology. http://biomodels.caltech.edu
SciLab: the open source platform for numerical computation. http://www.scilab.org/
BioPAX-biological pathways exchange language. Level 1, Version 1.0. 2004,
BioPAX: Biological Pathways Exchange. http://www.biopax.org/
Portable Network Graphics (PNG) Specification. http://www.w3.org/TR/PNG/
PNG (Portable Network Graphics) Specification, Version 1.0. http://tools.ietf.org/html/rfc2083
Scalable Vector Graphics (SVG). http://www.w3.org/Graphics/SVG/
Moraru II, Schaff JC, Slepchenko BM, Blinov ML, Morgan F, Lakshminarayana A, Gao F, Li Y, Loew LM: Virtual Cell modelling and simulation software environment. IET Systems Biology. 2008, 2: 352-362. 10.1049/iet-syb:20080102
National Resource for Cell Analysis and Modeling (NRCAM). http://www.nrcam.uchc.edu/
BioModels.net convertors: to and from SBML. http://www.ebi.ac.uk/compneur-srv/sbml/convertors/
DOI: Digital Object Identifier. http://www.doi.org/
CiteXplore literature searching. http://www.ebi.ac.uk/citexplore/
Bornstein BJ, Keating SM, Jouraku A, Hucka M: LibSBML: an API library for SBML. Bioinformatics. 2008, 24 (6): 880-881. 10.1093/bioinformatics/btn051
Rodriguez N, Donizelli M, Le Novère N: SBMLeditor: effective creation of models in the Systems Biology Markup Language (SBML). BMC Bioinformatics. 2007, 8: 79- 10.1186/1471-2105-8-79
Mathematical Markup Language (MathML). http://www.w3.org/Math/
Hoops S, Sahle S, Gauges R, Lee C, Pahle J, Simus N, Singhal M, Xu L, Mendes P, Kummer U: COPASI-a COmplex PAthway SImulator. Bioinformatics. 2006, 22 (24): 3067-3074. 10.1093/bioinformatics/btl485
Mendes P, Hoops S, Sahle S, Gauges R, Dada J, Kummer U: Computational modeling of biochemical networks using COPASI. Methods in Molecular Biology. 2009, 500: 17-59.
Machné R, Finney A, Müller S, Lu J, Widder S, Flamm C: The SBML ODE Solver Library: a native API for symbolic and fast numerical analysis of reaction networks. Bioinformatics. 2006, 22: 1406-1407. 10.1093/bioinformatics/btl086
Bergmann FT, Sauro HM: SBW - a modular framework for systems biology. WSC '06: Proceedings of the 38th conference on Winter simulation, Winter Simulation Conference. 2006, 1637-1645.
Le Novère N, Finney A, Hucka M, Bhalla US, Campagne F, Collado-Vides J, Crampin EJ, Halstead M, Klipp E, Mendes P, Nielsen P, Sauro H, Shapiro B, Snoep JL, Spence HD, Wanner BL: Minimum information requested in the annotation of biochemical models (MIRIAM). Nature Biotechnology. 2005, 23 (12): 1509-1515. 10.1038/nbt1156
Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, Alcántara R, Darsow M, Guedj M, Ashburner M: ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Research. 2008, 36: D344-D350. 10.1093/nar/gkm791
Fleischmann A, Darsow M, Degtyarenko K, Fleischmann W, Boyce S, Axelsen KB, Bairoch A, Schomburg D, Tipton KF, Apweiler R: IntEnz, the integrated relational enzyme database. Nucleic Acids Research. 2004, 32: D434-D437.
Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M: The KEGG resources for deciphering the genome. Nucleic Acids Research. 2004, 32: D277-D280. 10.1093/nar/gkh063
Le Novère N, Courtot M, Laibe C: Adding semantics in kinetics models of biochemical pathways. Proceedings of the 2nd International Symposium on experimental standard conditions of enzyme characterizations. 2007, http://www.beilstein-institut.de/index.php?id=196/
Laibe C, Le Novère N: MIRIAM Resources: tools to generate and resolve robust cross-references in Systems Biology. BMC Systems Biology. 2007, 1: 58- 10.1186/1752-0509-1-58
Krause F, Liebermeister W: A simple clustering of the BioModels database using semanticSBML. BioModels meeting 2009. 2009,
Krause F, Uhlendorf J, Lubitz T, Schulz M, Klipp E, Liebermeister W: Annotation and merging of SBML models with semanticSBML. Bioinformatics. 2009,
Ermentrout B: Simulating, Analyzing, and Animating Dynamical Systems: A Guide to XPPAUT for Researchers and Students. 2002, Society for Industrial Mathematics,
Dräger A, Planatscher H, Wouamba DM, Schröder A, Hucka M, Endler L, Golebiewski M, Müller W, Zell A: SBML2LaTeX: conversion of SBML files into human-readable reports. Bioinformatics. 2009, 25 (11): 1455-1456. 10.1093/bioinformatics/btp170
Le Novère N, Hucka M, Mi H, Moodie S, Schreiber F, Sorokin A, Demir E, Wegner K, Aladjem MI, Wimalaratne SM, Bergman FT, Gauges R, Ghazal P, Kawaji H, Li L, Matsuoka Y, Villéger A, Boyd SE, Calzone L, Courtot M, Dogrusoz U, Freeman TC, Funahashi A, Ghosh S, Jouraku A, Kim S, Kolpakov F, Luna A, Sahle S, Schmidt E, Watterson S, Wu G, Goryanin I, Kell DB, Sander C, Sauro H, Snoep JL, Kohn K, Kitano H: The Systems Biology Graphical Notation. Nature Biotechnology. 2009, 27 (8): 735-741. 10.1038/nbt.1558
Olivier BG, Snoep JL: Web-based kinetic modelling using JWS Online. Bioinformatics. 2004, 20 (13): 2143-2144. 10.1093/bioinformatics/bth200
BioModels Database: model of the month. http://www.ebi.ac.uk/biomodels-main/modelmonth
Li C, Courtot M, Le Novère N, Laibe C: BioModels.net Web Services, a free and integrated toolkit for computational modelling software. Briefings in Bioinformatics. 2009,
Funahashi A, Morohashi M, Kitano H, Tanimura N: CellDesigner: a process diagram editor for gene-regulatory and biochemical networks. BIOSILICO. 2003, 1 (5): 159-162. 10.1016/S1478-5382(03)02370-9
Web Services Description Language (WSDL). http://www.w3.org/TR/wsdl
SOAP Messaging Framework. http://www.w3.org/TR/soap/
Hypertext Transfer Protocol - HTTP/1.1. http://tools.ietf.org/html/rfc2616
BioModels Database: Web Services. http://www.ebi.ac.uk/biomodels/webservices.html
JWS Online. http://jjj.biochem.sun.ac.za/
Sivakumaran S, Hariharaputran S, Mishra J, Bhalla US: The Database of Quantitative Cellular Signaling: management and analysis of chemical kinetic models of signaling networks. Bioinformatics. 2003, 19: 408-415. 10.1093/bioinformatics/btf860
DOQCS: Database Of Quantitative Cellular Signaling. http://doqcs.ncbs.res.in/
Hucka M, Bergmann F, Hoops S, Keating SM, Sahle S, Wilkinson DJ: The Systems Biology Markup Language (SBML): Language Specification for Level 3 Version 1 Core (Release 1 Candidate). Available from Nature Precedings. 2010, http://dx.doi.org/10.1038/npre.2010.4123.1
Lister AL, Pocock M, Taschuk M, Wipat A: SAINT: A Lightweight Integration Environment for Model Annotation. Bioinformatics. 2009, 25: 3026-3027. 10.1093/bioinformatics/btp523
Swainston N, Mendes P: libAnnotationSBML: a library for exploiting SBML annotations. Bioinformatics. 2009, 25 (17): 2292-2293. 10.1093/bioinformatics/btp392
Köhn D, Le Novère N: SED-ML - An XML Format for the Implementation of the MIASE Guidelines. 6th conference on Computational Methods in Systems Biology, Lecture Notes in Bioinformatics. Edited by: Heiner M, Uhrmacher AM. 2008, 5307: 176-190.
GNU General Public License. http://www.gnu.org/copyleft/gpl.html
BioModels Database project on SourceForge.net. http://sourceforge.net/projects/biomodels/
BioModels Database is being developed by the Computational Systems Neurobiology group (EMBL-European Bioinformatics Institute, United Kingdom). Collaborators include the SBML Team (California Institute of Technology, USA), the Database Of Quantitative Cellular Signaling (National Centre for Biological Sciences, India), the Virtual Cell (University of Connecticut Health Center, USA), JWS Online (Stellenbosch University, South Africa) and the CellML team (Auckland Bioengineering Institute, New Zealand).
The development of BioModels Database is funded by the European Molecular Biology Laboratory (Computational Systems Neurobiology group), the Biotechnology and Biological Sciences Research Council (Computational Systems Neurobiology group, grant BB/F010516/1), and the National Institute of General Medical Sciences (SBML Team and Computational Systems Neurobiology group, grant R01 GM070923). BioModels Database also benefited from DARPA funds (Herbert Sauro, University of Washington, Seattle, USA).
The authors would like to thank the members of the BioModels Database Scientific Advisory Board (SAB): Upinder Bhalla, MH, Pedro Mendes, Ion Moraru, Herbert Sauro and JLS. They also thank all the contributors of the Models of the Month: VC, Ranjita Dutta Roy, LE, EH, Noriko Hiroi, Nick Juty, Christian Knüpfer, NLN, LL, Michele Mattioni, Antonia Mayer, Anika Oellrich, Renaud Schiappa, MIS, Dominic P. Tolle and Judith Zaugg. The authors also thank Nick Juty, who read and corrected this manuscript thoroughly.
The BioModels Database team would also like to express their gratitude to all the people who have given BioModels Database the opportunity to keep improving with their continuous support, including the contribution of models, software tools and constructive comments and criticisms.
The work presented here was carried out collaboratively by the authors. CLi and MD were the original developers of BioModels Database; LL and NR developed the converters and exports; HD, LE, VC and EH created, curated and annotated models; MIS coordinated the Model of the Month; AH developed the SBML to BioPAX converter; JLS developed and maintained JWS Online; MH provided coordination, SBML knowledge and grant support; NLN instigated and coordinated the project and contributed to curation; CLaibe developed features and is the current project coordinator. All authors have read and approved the final manuscript.