Functional network of glycan-related molecules: Glyco-Net in Glycoconjugate Data Bank
© Hashimoto et al. 2010
Received: 9 November 2009
Accepted: 29 June 2010
Published: 29 June 2010
Glycans are involved in a wide range of biological processes, and they play an essential role in functions such as cell differentiation, cell adhesion, pathogen-host recognition, toxin-receptor interactions, signal transduction, cancer metastasis, and immune responses. Elucidating pathways related to post-translational modifications (PTMs) such as glycosylation is of growing importance in post-genome science and technology. Graphical networks describing the relationships among glycan-related molecules, including genes, proteins, lipids and various biological events, are considered extremely valuable and convenient tools for the systematic investigation of PTMs. However, no existing database dynamically draws functional networks related to glycans.
We have created a database called Glyco-Net (http://www.glycoconjugate.jp/functions/) that contains many binary relationships among glycan-related molecules. Using search results, we can dynamically draw figures of the functional relationships among these components with nodes and arrows. A molecule or event corresponds to a node in the network figures, and the relationship between a molecule and an event is indicated by an arrow. Since all components are treated equally, an arrow is also a node.
In this paper, we describe our new database, Glyco-Net, which is the first database to dynamically show networks of the functional profiles of glycan-related molecules. The graphical networks will assist in understanding the roles of PTMs. In addition, since various kinds of bio-objects such as genes, proteins, and inhibitors are treated equally in Glyco-Net, a large amount of information on PTMs can be obtained.
Glycans are involved in a wide range of biological processes, and they play an essential role in functions such as cell-cell interaction, pathogen-host recognition, toxin-receptor interaction, and signal transduction [1–5]. One of their roles is to modulate the functions of many proteins and lipids through post-translational modifications (PTMs). Glycomics is the study of the structural and functional aspects of various glycoconjugates, such as glycoproteins, glycolipids, and proteoglycans produced during PTMs in cells and organisms. The field of glycomics has lagged behind genomics and proteomics, mainly because of the inherent difficulties in analyzing glycan structure and function. However, glycomics is now an emerging field owing to exceptional progress in modern experimental techniques and equipment, including mass spectrometry (MS), high-performance liquid chromatography (HPLC), nuclear magnetic resonance (NMR) and knockout mice [8–15]. It is expected that a large quantity of information concerning glycan structure and function will be accumulated. Bioinformatics of glycans, which suffered from a lack of data in early studies, is now becoming a practical field in the biological sciences related to PTMs. Therefore, a new class of glycan database that relates structures to their functions, together with related tools, is strongly needed in the biological, pharmaceutical and medical fields.
Several groups are actively developing public and commercial glycan databases. Public databases include KEGG [16–18], SWEET-DB in GLYCOSCIENCES.de, the United States Consortium for Functional Glycomics (CFG), and GlycoSuiteDB in the Expert Protein Analysis System (ExPASy) Proteomics Server. GlycoMinds (http://www.glycominds.com) is a commercial database. The Complex Carbohydrate Structure Database (CCSD) [23, 24] was the first database of glycan structures. The CCSD was developed in the 1980s and 1990s by the CarbBank Project and was discontinued in 1999 owing to a lack of funding. The CCSD data are currently included in the public glycan databases mentioned above. Although the web service of GLYCOSCIENCES.de is currently not available, its developers are trying to organize a new base for the database. The Carbohydrate-Active Enzyme (CAZy) database is a database of glycan-related enzymes and proteins, such as glycosyltransferases and lectins. All of these databases, with the exception of CAZy, focus on glycan structures. SWEET-DB mainly develops tools for handling glycan structures and geometry [26–29]. The CFG is constructing carbohydrate chips to investigate the interaction between carbohydrates and proteins for therapy, as well as databases for functional glycomics, such as an annotated database of mass spectrometry. The KEGG GLYCAN database holds over 10,000 glycan structures; in addition, manually drawn graphical pathways for various bio-molecules are included in KEGG PATHWAY. ExPASy (http://www.expasy.org/), which includes a protein sequence database, also holds many graphical figures of biochemical pathways. Krambeck and Betenbaugh and Liu et al. have developed systems that dynamically construct structural networks of N- and O-linked glycans, respectively.
Recently, emerging analytical techniques have enabled us to obtain a great deal of information about the relationships not only between glycan structures and functions, but also among glycans, disease phenotypes and the expression of glycan-related genes. In this situation, graphical networks describing the relationships among glycan-related molecules, including genes, proteins, lipids and biological events, are considered potential tools for accelerating the integrated study of PTMs. Although KEGG PATHWAY and Biochemical Pathways in ExPASy (http://us.expasy.org/cgi-bin/show_thumbnails.pl) have graphical network figures, these are all manually selected and organized. Since glycomics and glycoproteomics data are expected to increase substantially, network figures generated from glycan structures should be drawn from the latest available data in order to give the most current overview of glycan functions.
For several years, we have endeavoured to dynamically draw figures of functional networks among glycans, genes, inhibitors, lipids, glycosphingolipids, various biological events, diseases and carbohydrate-binding proteins such as glycosyltransferases and lectins (hereafter, these are denoted as "bio-objects"). Dynamic generation of network figures among bio-objects goes a step beyond biosynthesis networks presented as static pictures, such as those in KEGG PATHWAY and ExPASy. Glyco-Net was constructed as a part of the Glycoconjugate Data Bank (GDB) (http://www.glycoconjugate.jp/). Because Glyco-Net focuses on collecting functional relationships among bio-objects from research articles, each bio-object in Glyco-Net is linked to other databases where more detailed information can be obtained. In this paper, we describe the concept and status of Glyco-Net.
List of linkages from Glyco-Net
URL and comments
Only whole sugar structures are linked; linkage to partial structures cannot be defined.
This linkage is defined through the EC number of the protein.
Glyco-Net is expected to be used as an interface between the various biological databases and the functional network of glycan-related bio-objects. The current notation of carbohydrate structures in Glyco-Net is ad hoc. Because various structural databases already exist, such as KEGG GLYCAN, GLYCOSCIENCES.de, and CFG, it is only necessary to provide linkage from our glycan data into these databases. At the moment, Glyco-Net holds limited linkage to carbohydrate structure databases. We will replace the nomenclature of the carbohydrate structures with a more standard one, such as GLYDE, so that other databases can be accessed with the carbohydrate structure as a key.
The database was implemented with a MySQL database system in a Linux environment. The web interface was written in JavaServer Pages (JSP). The search engine and the drawing method were written in the Java programming language.
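As an illustration of the node-and-arrow model described above, the following is a minimal Java sketch, not the authors' implementation. It assumes each curated entry is a binary relation (subject, function, target) and turns every function into a node of its own, so that an arrow is also a node. The class name, field names and the example bio-object names are hypothetical placeholders, not the Glyco-Net schema or data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names and fields are illustrative, not the Glyco-Net schema.
final class RelationGraphSketch {

    // One binary relationship: subject --function--> target.
    static final class BinaryRelation {
        final String subject;
        final String function;
        final String target;

        BinaryRelation(String subject, String function, String target) {
            this.subject = subject;
            this.function = function;
            this.target = target;
        }
    }

    // Builds an adjacency list in which every function is itself a node,
    // so each relation becomes: subject -> function node -> target.
    static Map<String, Set<String>> build(List<BinaryRelation> relations) {
        Map<String, Set<String>> adjacency = new LinkedHashMap<>();
        int id = 0;
        for (BinaryRelation r : relations) {
            // A running index keeps two functions with the same label apart.
            String functionNode = "F" + (id++) + ":" + r.function;
            link(adjacency, r.subject, functionNode);
            link(adjacency, functionNode, r.target);
            // Make sure the target appears as a node even without outgoing arrows.
            adjacency.computeIfAbsent(r.target, k -> new LinkedHashSet<>());
        }
        return adjacency;
    }

    private static void link(Map<String, Set<String>> adjacency, String from, String to) {
        adjacency.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
    }

    public static void main(String[] args) {
        // Placeholder relations, invented for the example.
        List<BinaryRelation> relations = new ArrayList<>();
        relations.add(new BinaryRelation("gene A", "encodes", "glycosyltransferase A"));
        relations.add(new BinaryRelation("glycosyltransferase A", "synthesizes", "glycan X"));
        build(relations).forEach((node, next) -> System.out.println(node + " -> " + next));
    }
}
```

Treating the function as a node means that assay information or literature references could be attached to it in the same way as to any other bio-object; whether Glyco-Net stores them this way is not stated in the text.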
Glyco-Net aims to collect binary relations that can be extracted from scientific articles such as research papers, i.e. evidence of functions obtained by specific assays. The current data were manually curated from the "Handbook of Glycosyltransferases and Related Genes." Functions observed under different experimental conditions in the assay are all recognized as different functions and exist in the network figure at the same time. To enable quantitative discussion, it is necessary to classify the functions with an ontology according to the experimental conditions and/or the environment in which the bio-objects are found.
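The treatment of assay conditions described above can be sketched as follows. This is a hedged illustration only: it assumes a function record carries a free-text assay condition, keeps records that differ only in that condition as separate functions, and groups them by condition as a crude stand-in for the ontology-based classification the text calls for. All class, field and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: the record layout is an assumption, not the Glyco-Net schema.
final class AssayFunctionSketch {

    static final class FunctionRecord {
        final String subject;
        final String relation;
        final String target;
        final String assayCondition;

        FunctionRecord(String subject, String relation, String target, String assayCondition) {
            this.subject = subject;
            this.relation = relation;
            this.target = target;
            this.assayCondition = assayCondition;
        }

        // The assay condition is part of the identity, so the same relation
        // measured under two conditions counts as two functions.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof FunctionRecord)) return false;
            FunctionRecord f = (FunctionRecord) o;
            return subject.equals(f.subject) && relation.equals(f.relation)
                    && target.equals(f.target) && assayCondition.equals(f.assayCondition);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, relation, target, assayCondition);
        }
    }

    // Removes exact duplicates only; differing conditions are preserved.
    static List<FunctionRecord> distinctFunctions(List<FunctionRecord> curated) {
        return new ArrayList<>(new LinkedHashSet<>(curated));
    }

    // Groups functions by assay condition, in first-seen order.
    static Map<String, List<FunctionRecord>> byCondition(List<FunctionRecord> records) {
        Map<String, List<FunctionRecord>> groups = new LinkedHashMap<>();
        for (FunctionRecord r : records) {
            groups.computeIfAbsent(r.assayCondition, k -> new ArrayList<>()).add(r);
        }
        return groups;
    }
}
```

A real classification would map conditions onto ontology terms rather than compare raw strings; the grouping key here is only the simplest possible placeholder.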
Annotation and data in Glyco-Net
Attribute of data
Number of data
Relationships between two objects.
Description of the object which constructs the functions, such as carbohydrates, related genes, glycosyltransferases, lipids, glycolipids, diseases, biological events, etc.
Experimental information which elucidates the functions.
Currently, Glyco-Net has 3,724 objects (1,149 objects for glycosyltransferases, 2,480 objects for genes, and 95 pieces of data concerning diseases caused by or related to carbohydrate abnormalities), 2,302 pieces of function data, and 1,201 pieces of data concerning the assays that verify the functions of the glycoconjugates. A further 1,332 records are contained in the "article" category. Data from articles published after Reference 34 will be added in the future. Furthermore, we have been developing an ontology for Glyco-Net.
The main page of the Glycoconjugate Data Bank (http://www.glycoconjugate.jp) provides links to three databases: 1) "Resources", which is a database of carbohydrate-related compounds, 2) "Structure", which is a 3D structure database of glycans extracted from the Protein Data Bank, and 3) "Glyco-Net", which shows the functional network of carbohydrate-related molecules. Users can browse function lists and network figures by clicking a bio-object type or typing a keyword, and then view the details of the functions.
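One way to picture how a network figure might grow from a single search hit is a hop-limited breadth-first expansion around the matched bio-object. The sketch below is an assumption made for illustration, not a description of the Glyco-Net drawing routine: it takes a directed adjacency list and returns every node reachable within a given number of hops. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a hop-limited neighbourhood search (breadth-first).
final class HopLimitedNetworkSketch {

    // Returns the start node plus every node reachable in at most maxHops steps,
    // in the order in which breadth-first search discovers them.
    static Set<String> withinHops(Map<String, List<String>> adjacency, String start, int maxHops) {
        Map<String, Integer> distance = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            int d = distance.get(node);
            if (d == maxHops) {
                continue; // do not expand beyond the hop limit
            }
            for (String next : adjacency.getOrDefault(node, Collections.<String>emptyList())) {
                if (!distance.containsKey(next)) {
                    distance.put(next, d + 1);
                    queue.add(next);
                }
            }
        }
        return new LinkedHashSet<>(distance.keySet());
    }
}
```

Because the number of reachable nodes can grow quickly with each additional hop, a small hop limit keeps the resulting figure readable.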
In this paper, we describe our new database, Glyco-Net, which shows graphical networks of glycan-related bio-objects such as genes, proteins, glycoproteins, lipids, glycolipids, and glycans. Each bio-object can easily be linked to available databases such as GenBank, ExPASy, KEGG GLYCAN, GLYCOSCIENCES.de, CFG, and PubMed, though at present the linkage is available only from the bio-object tables. Dynamic generation of the functional network figures among bio-objects is expected to have great advantages over KEGG PATHWAY and ExPASy, which hold static figures for biosynthesis. Since various kinds of bio-objects such as genes, proteins and inhibitors are treated equally in Glyco-Net, a large amount of information on PTMs can be obtained. Although these characteristics are novel among existing glycan databases, figures made by Glyco-Net are still too complicated to extend to a larger number of hops at this stage. In addition, the total quantity of data in Glyco-Net remains small. Therefore, we are now constructing an ontology for partly automatic curation from web articles. Automatic curation with an ontology will become a powerful tool, even though the collected data should still be verified carefully by scientists. We will also develop a routine to draw the functional network figures more clearly. Furthermore, the nomenclature of the glycan structures should be standardized so that glycans can be searched in other structure-based carbohydrate databases without ambiguity. Use of the GLYDE notation appears quite feasible, since our database only indicates the relationships among biological objects relating to glycans. As a result, the details of the objects have to be found in other databases, and we will have to increase the linkages from our objects to other databases.
Thus, establishing collaborations with researchers in bioinformatics and other biosciences to improve this new type of database will be a significant asset for the further development of Glyco-Net.
The URL of Glyco-Net is http://www.glycoconjugate.jp/functions/.
This work was supported in part by the Program of Founding Research Centers for Emerging and Reemerging Infectious Diseases and the National Project on Functional Glycoconjugates Research for New Industry, MEXT Japan, and by a grant for "Development of System and Technology for Advanced Measurement and Analysis (SENTAN)" from the Japan Science and Technology Agency (JST). This study was also supported in part by Grants-in-Aid for the Regional R&D Proposal-Based Program from the Northern Advancement Center for Science & Technology of Hokkaido, Japan. The authors thank Ms. Chikage Chikaoka and Dr. Yasuko Tanaka for their dedicated help in the curation of the data, and Mr. Daisuke Murayama for technical support in the implementation and correction of the database. NM especially thanks Ms. Kana Tosho for her dedicated help in the preparation of the manuscript.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.