The most comprehensive database of metabolic reactions is KEGG [14, 15], which is why KEGG was used as the skeleton. As a consequence, reactions from other sources, for example METACYC [16], have to be entered manually. The KEGG data are represented as an expandable tree because this is an efficient way to display the complex relationships between pathways, reactions, enzymes and compounds. Reactions are child nodes of enzymes and vice versa because one reaction may be catalyzed by several enzymes and, conversely, one enzyme may catalyze different reactions.
Reactions taken over from the KEGG database into the user-defined network carry the KEGG reaction identifier so that any alterations of reactions in the KEGG database can be easily detected and, if appropriate, transferred to the corresponding reactions in the user-defined data. For reactions not stored in KEGG the user defines an alphanumeric identifier. Transport processes are not contained in the KEGG database. They have to be entered manually using the same notation as for chemical reactions, with the difference that substrates and products have different compartment attributes. The user can type compounds into the text-field of the biochemical equation using common names like "L-Citrulline", "Citrulline" or "Citrullin" and can insert compounds found on a comprehensive list of compounds. Compound names may be converted into the respective KEGG ID, for example "C00327" for L-Citrulline, if the compound is contained in KEGG. All transport processes and all reactions without a corresponding entry in KEGG are bundled under the tree branches "transporters" and "orphan" reactions, respectively. METANNOGEN allows the user to specify more than one sub-cellular compartment for a single biochemical reaction, which results in the generation of one dataset for each compartment. Labeling of all datasets with the user name and protection against unintended modification allow METANNOGEN to be used by a team of investigators.
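The conversion of common names into KEGG IDs and the generation of one dataset per compartment can be illustrated with a minimal sketch. It is not METANNOGEN's implementation: the function names, the dictionary layout, the placeholder reaction "R_example" and the equation syntax with "<=>" are assumptions made for illustration, and only the mapping of the citrulline names to C00327 is taken from the text.

```python
# Hypothetical synonym table; only the C00327 mapping comes from the text.
KEGG_IDS = {
    "l-citrulline": "C00327",
    "citrulline": "C00327",
    "citrullin": "C00327",
}


def to_kegg_id(name):
    """Return the KEGG compound ID for a common name, or the name unchanged."""
    cleaned = name.strip()
    return KEGG_IDS.get(cleaned.lower(), cleaned)


def parse_side(side):
    """Split one side of an equation such as '2 A + B' into (coefficient, compound) pairs."""
    terms = []
    for term in side.split(" + "):
        parts = term.strip().split(" ", 1)
        if len(parts) == 2 and parts[0].isdigit():
            terms.append((int(parts[0]), to_kegg_id(parts[1])))
        else:
            terms.append((1, to_kegg_id(term)))
    return terms


def expand_compartments(reaction_id, equation, compartments, user):
    """Create one dataset per compartment, labeled with the user name."""
    substrates, products = equation.split(" <=> ")
    datasets = []
    for comp in compartments:
        datasets.append({
            "id": reaction_id,
            "compartment": comp,
            "user": user,
            "substrates": [(n, f"{c}_{comp}") for n, c in parse_side(substrates)],
            "products": [(n, f"{c}_{comp}") for n, c in parse_side(products)],
        })
    return datasets


# Placeholder reaction with placeholder compounds A and B.
for dataset in expand_compartments("R_example", "2 A + Citrullin <=> B", ["cyto", "mitoMx"], "alice"):
    print(dataset)
```

Names that are not found in the table are passed through unchanged, which mirrors the statement that conversion only applies if the compound is contained in KEGG.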
Working with METANNOGEN – a case study
In the following, the usage of METANNOGEN is explained using the synthesis of carbamoylphosphate as an example. Figure 3 shows the expandable tree with the tree node for carbamoylphosphate synthesis I (CPS I) and the datasets for CPS I and CPS II as well as the citrulline transporter. The synthesis of carbamoylphosphate is the first step in pyrimidine biosynthesis, arginine biosynthesis, and the urea cycle. Formation of the mitochondrial carbamoylphosphate used in the urea cycle is catalyzed by the enzyme carbamoylphosphate synthetase I (CPS I). The notation of the reaction in KEGG reads
2ATP + NH3 + CO2 + H2O ⇄ 2ADP + Orthophosphate + Carbamoylphosphate (1)
To add this reaction to the model, the respective tree node with the reaction identifier R00149 and the EC number 6.3.4.16 needs to be marked in METANNOGEN. Selecting "new dataset" from the dataset menu creates a new dataset for the carbamoylphosphate synthetase reaction in the user-defined database (Figure 3). The input mask for the dataset appears on the right side of the screen. A toggle button with traffic light symbols allows datasets to be excluded from the stoichiometric matrix without deleting them. This is a useful option if a reaction and its catalyzing enzymes are not reported to exist in the species considered by the user (e.g. human hepatocytes) whereas such evidence is available for a related species (e.g. rat hepatocytes).
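The effect of the exclusion toggle on the stoichiometric matrix can be sketched as follows. This is a simplified illustration under stated assumptions, not METANNOGEN's code: the dictionary layout, the key "included" and the reaction "R_likely" with compounds X and Y are hypothetical, while the coefficients of R00149 follow reaction (1).

```python
def stoichiometric_matrix(datasets):
    """Return (compounds, reaction_ids, matrix) for all datasets that are switched on.

    Rows are compounds, columns are reactions; substrates carry negative
    coefficients and products positive ones.
    """
    active = [d for d in datasets if d["included"]]
    compounds = sorted({c for d in active for c in d["stoichiometry"]})
    matrix = [[d["stoichiometry"].get(c, 0) for d in active] for c in compounds]
    return compounds, [d["id"] for d in active], matrix


# Reaction (1) in the mitochondrial matrix.
cps1 = {
    "id": "R00149",
    "included": True,
    "stoichiometry": {
        "ATP_mitoMx": -2, "NH3_mitoMx": -1, "CO2_mitoMx": -1, "H2O_mitoMx": -1,
        "ADP_mitoMx": 2, "Orthophosphate_mitoMx": 1, "Carbamoylphosphate_mitoMx": 1,
    },
}
# Hypothetical "likely" reaction that stays in the database but is switched off.
likely = {"id": "R_likely", "included": False, "stoichiometry": {"X_cyto": -1, "Y_cyto": 1}}

compounds, reaction_ids, matrix = stoichiometric_matrix([cps1, likely])
print(reaction_ids)
for compound, row in zip(compounds, matrix):
    print(compound, row)
```

Setting `likely["included"] = True` and rebuilding the matrix adds a second column, so both variants of the network can be checked for consistency without deleting any dataset.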
Checking the consistency of the stoichiometric matrices generated with such "likely" reactions switched on or off may provide valuable heuristics for further experimental work. For the model of liver metabolism the reaction of carbamoylphosphate synthetase is activated since the reaction takes place in human liver without doubt. The first two reactions of the urea cycle, the formation of carbamoylphosphate and the ligation with ornithine, take place in the mitochondrial matrix. This knowledge is usually obtained from databases such as Brenda [10], UM-BBD [9] and SABIO-RK, or from scientific articles. For convenience, some databases are linked and the respective pages are opened in the Web browser when the links are clicked.
This set of databases is customizable. For the reaction of the mitochondrial carbamoylphosphate synthetase considered here, the user selects the compartment "MitoMx" (mitochondrial matrix) in the selection box for sub-cellular compartments. The newly created dataset can hold any kind of additional information on the reaction and the catalyzing enzyme. To keep track of the source of knowledge, notes and remarks taken from the literature can be entered as free text. Pubmed abstracts can be referred to simply by their ID. The abstracts are automatically downloaded and shown with important keywords highlighted. If, for example, the reference PMID:7915141 for an article on carbamoylphosphate synthetase is contained in the text-field and the mouse is moved over it, the abstract is automatically shown and important search terms like "human" or "liver" are highlighted. In general, a link to a database is formed by a database ID followed by a colon, in this case PMID:, and an entry ID, in this case 7915141. An alternative syntax with curly brackets allows references to terms containing characters other than digits and letters, for example BRENDAec{6.3.4.16} or GOOGLE{carbamoylphosphate mitochondrial} (Figure 3). The URLs of the databases can be edited in the customize menu. In KEGG, most reactions are displayed graphically in so-called KEGG-maps, and METANNOGEN can display information within these graphical pathway views. To quickly locate a reaction such as CPS I in the context of a pathway, here the "urea cycle", it can be marked in the object tree and is then highlighted by a red frame in the pathway view. If "any" is chosen in the compartment selector of the KEGG-map of a pathway, all reactions of the user-defined network are highlighted by filled colored rectangles irrespective of their compartments.
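The two link forms described above, a database ID followed by a colon and an entry ID, or a database ID followed by a term in curly brackets, could be recognized in free text roughly as sketched below. The regular expression, the function name and the URL templates on example.org are assumptions for illustration; the actual URLs used by METANNOGEN are those configured in the customize menu.

```python
import re
from urllib.parse import quote

# Hypothetical URL templates standing in for the user-editable database URLs.
URL_TEMPLATES = {
    "PMID": "https://example.org/pubmed/{id}",
    "BRENDAec": "https://example.org/brenda?ec={id}",
    "GOOGLE": "https://example.org/search?q={id}",
}

# Either DB:entry (letters and digits only) or DB{arbitrary term}.
LINK = re.compile(r"\b([A-Za-z]+)(?::([A-Za-z0-9]+)|\{([^{}]*)\})")


def find_links(text):
    """Return (database, entry, url) for every recognized reference in the text."""
    links = []
    for match in LINK.finditer(text):
        database = match.group(1)
        entry = match.group(2) if match.group(2) is not None else match.group(3)
        if database in URL_TEMPLATES:
            url = URL_TEMPLATES[database].format(id=quote(entry))
            links.append((database, entry, url))
    return links


note = "CPS I, see PMID:7915141 and BRENDAec{6.3.4.16} or GOOGLE{carbamoylphosphate mitochondrial}"
for link in find_links(note):
    print(link)
```

The curly-bracket form is needed in this sketch for the same reason as in the text: the colon form stops at the first character that is not a letter or digit, so an EC number or a phrase with spaces would be cut short.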
Depending on the status of the exclusion toggle mentioned before, a filled green box indicates that a reaction is included in the model, whereas a red box indicates that it is currently excluded. If a specific compartment is selected for the KEGG-map, a yellow box points out that the reaction does not exist in this compartment but in another one. For our example this view reveals that the compounds citrulline_mitoMx and citrulline_cyto cannot be balanced because each of them occurs in only one reaction (ornithine transcarbamylase in the mitochondrial matrix and argininosuccinate synthetase in the cytosol, respectively). In such cases the user needs to search for a possible exchange process of the corresponding metabolite across the membrane separating the two compartments. For citrulline, transport across the inner mitochondrial membrane is well described in the literature.
The corresponding notation in METANNOGEN reads
citrulline_cyto ⇄ citrulline_mitoMx (2)
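The balancing problem and its resolution by transport process (2) can be reproduced with a small check that reports compounds taking part in only one reaction. This is a sketch under simplifying assumptions: the function name and data layout are hypothetical, and only the citrulline species of each reaction are listed, so the other reactants of the two enzymes are deliberately left out.

```python
from collections import Counter


def dead_end_compounds(reactions):
    """Return compounds that occur in only one reaction and thus cannot be balanced."""
    counts = Counter(c for compounds in reactions.values() for c in set(compounds))
    return sorted(c for c, n in counts.items() if n == 1)


# Only the citrulline species are listed for each reaction.
network = {
    "ornithine transcarbamylase": ["citrulline_mitoMx"],
    "argininosuccinate synthetase": ["citrulline_cyto"],
}
print(dead_end_compounds(network))

# Adding transport process (2) connects the two compartments.
network["citrulline transporter"] = ["citrulline_cyto", "citrulline_mitoMx"]
print(dead_end_compounds(network))
```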
To find out whether genes, transcripts and proteins have been identified for a particular enzymatic activity in humans, the ENSEMBL and SWISSPROT branches of the tree are helpful. They can be expanded from the reaction nodes. CPS I has the SWISSPROT entry CPSM_HUMAN and the ENSEMBL gene entry ENSG00000021826. A different carbamoylphosphate synthetase, CPS II, catalyzes the first step of pyrimidine synthesis. Its reaction mechanism differs from that of CPS I in that the nitrogen does not originate from ammonia but from glutamine. Consequently, the reaction ID and the enzyme code are different. Because CPS II is a cytosolic enzyme, "cytosol" must be selected as the compartment. The SBML output for the two reactions is shown in Figure 4.
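To give an impression of how a compartment-specific dataset maps onto SBML, the following sketch writes reaction (1) as a minimal SBML document. It does not reproduce Figure 4 or METANNOGEN's SBML writer: the choice of SBML Level 2 Version 1, the function name, the model ID "example" and the species naming scheme are assumptions.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2"  # assumed level; the text does not state one


def reaction_to_sbml(reaction_id, compartment, substrates, products):
    """Serialize one reversible reaction in one compartment as an SBML string."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "2", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": "example"})
    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": compartment})
    species = ET.SubElement(model, "listOfSpecies")
    for name in list(substrates) + list(products):
        ET.SubElement(species, "species",
                      {"id": f"{name}_{compartment}", "compartment": compartment})
    reactions = ET.SubElement(model, "listOfReactions")
    reaction = ET.SubElement(reactions, "reaction",
                             {"id": reaction_id, "reversible": "true"})
    for tag, side in (("listOfReactants", substrates), ("listOfProducts", products)):
        references = ET.SubElement(reaction, tag)
        for name, coefficient in side.items():
            ET.SubElement(references, "speciesReference",
                          {"species": f"{name}_{compartment}",
                           "stoichiometry": str(coefficient)})
    return ET.tostring(sbml, encoding="unicode")


print(reaction_to_sbml(
    "R00149", "MitoMx",
    {"ATP": 2, "NH3": 1, "CO2": 1, "H2O": 1},
    {"ADP": 2, "Orthophosphate": 1, "Carbamoylphosphate": 1},
))
```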
Customization of METANNOGEN for specific projects
Depending on the type of mathematical model that has to be developed for a metabolic network, different types of information on the kinetics and thermodynamics of reactions and transport processes are needed. In principle, the text-area for notes can be used for such information. However, to store all information in a unified and structured manner, advanced users may also create specific GUI elements such as pull-down menus, toggle buttons, text-fields and check-boxes in the Java source code. For this purpose the customize menu offers the possibility to generate a copy of the GUI Java file, which then replaces the default GUI Java class in the running application. The user can edit this copy. The source text contains example code for three additional GUI elements, which can be activated by removing the "/*" and "*/" tags. The lines enclosed in "/* ... */" become active as soon as the modified source code is saved, and the additional GUI elements appear immediately. In this example the additional GUI elements correspond to data fields named "FIELD1", "FIELD2" and "FIELD3". Meaningful names can be given instead. By default, additional data fields are not exported as SBML. Nevertheless, the SBML output can be adapted using the mechanism mentioned above [13].
Again, this involves direct manipulation of source code and instant testing of the modified SBML writer at runtime. This is not critical because the user modifies a copy of the original file. Deleting this copy immediately reverts the program to the original state. In the Java code, the text content of a field can be requested by invoking the instance method of dataset objects String metannogenDataset#getField(String data_field_name).
Searches with Pubmed and Google
One major obstacle to finding relevant information with search engines is that enzymes may be named in many different ways. METANNOGEN meets this difficulty by combining synonymous names for the enzymes of interest with a logical "or" to form a sensitive search query for Google or Pubmed. Such searches typically result in a larger number of hits. For Pubmed abstracts additional aids are available. When the mouse-pointer is moved over a Pubmed ID, the abstract is downloaded into the local cache and displayed. All user-defined keywords are highlighted in the abstract so that the relevance of the publication can be assessed quickly. For large numbers of publications this approach is much more efficient than opening the abstracts in the Web browser. Search queries can be entered as hyper-references into the remark-field using the syntax PUBMED{carbamoylphosphate [TI] AND liver AND human}.
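Combining synonyms with a logical "or" and wrapping the result in the hyper-reference syntax can be sketched as follows. The function names and the list of synonyms are illustrative assumptions, and the way METANNOGEN itself assembles its queries may differ.

```python
def or_query(synonyms, extra_terms=()):
    """Join synonyms with OR and append further restricting terms with AND."""
    quoted = ['"%s"' % name for name in synonyms]
    query = "(" + " OR ".join(quoted) + ")"
    for term in extra_terms:
        query += " AND " + term
    return query


def as_reference(database, query):
    """Wrap a query in the curly-bracket syntax, e.g. PUBMED{...}."""
    return "%s{%s}" % (database, query)


# Illustrative synonyms for the enzyme of the case study.
names = ["carbamoylphosphate synthetase", "carbamoyl-phosphate synthase"]
print(as_reference("PUBMED", or_query(names, ["liver", "human"])))
```

Quoting each synonym keeps multi-word enzyme names together as phrases, while the OR group widens the search and the AND terms narrow it again to the tissue and species of interest.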