Volume 6 Supplement 3
A vertex similarity-based framework to discover and rank orphan disease-related genes
© Zhu et al.; licensee BioMed Central Ltd. 2012
Published: 17 December 2012
A rare or orphan disease (OD) is any disease that affects a small percentage of the population. While opportunities now exist to accelerate progress toward understanding the basis for many more ODs, the prioritization of candidate genes is still a critical step for disease-gene identification. Several network-based frameworks have been developed to address this problem with varied results.
We have developed a novel vertex similarity (VS) based parameter-free prioritizing framework to identify and rank orphan disease candidate genes. We validate our approach by using 1598 known orphan disease-causing genes (ODGs) representing 172 orphan diseases (ODs). We compare our approach with a state-of-the-art parameter-based approach (PageRank with Priors or PRP) and with another parameter-free method (Interconnectedness or ICN). Our results show that the VS-based approach outperforms ICN and is comparable to PRP. We further apply VS-based ranking to identify and rank potential novel candidate genes for several ODs.
We demonstrate that the VS-based parameter-free ranking approach can be successfully used for disease candidate gene prioritization and can complement other network-based methods for candidate disease gene ranking. Importantly, our VS-ranked top candidate genes for the ODs match the known literature, suggesting several novel causal relationships for further investigation.
In the USA, a rare or orphan disease (OD) is defined as a disease that affects fewer than 200,000 inhabitants. By one estimate, there are as many as 8000 ODs, many of which are known to be of genetic origin, affect children at a very early age and are life-threatening and/or chronically debilitating [2, 3]. Although the advent of next-generation sequencing technologies accelerates the disease gene discovery pipeline, the prioritization of candidate genes is still a critical step for disease-gene identification. We, and several other earlier studies [6–9], have shown that genes associated with phenotypically close disorders tend to share molecular signatures, which include similar expression profiles, participation in the same biological processes or pathways, protein interactions or complexes, and literature co-citation. We have recently completed a global analysis of all ODs that have at least one known mutant gene associated (data from the Orphanet and OMIM databases) and showed that the relationship between ODs cannot be fully captured by the gene-based network alone. Integrating diverse biomedical and genomic data types can facilitate the synthesis of hypotheses about disease-causing mutant genes. Additionally, it can help in addressing an important question, namely, are there any candidate genes related to known causal genes for a disease? A useful way to approach this question is to rank the genes in a test set based on their similarity to a reference or 'seed' set. Such a "guilt by association" ranking approach has become an important way to prioritize candidate disease genes, such as the candidates found in genome-wide association or linkage studies. The genes within a locus shown to be linked to a particular disease, for example, can be prioritized based on their similarities to a reference set of known genes for that disease. We and others have developed several computational approaches which perform this task automatically [4, 13–23].
Network-based analyses have been equally successful in the identification and prioritization of disease candidate genes [6, 7, 24–31], especially where the genes are relatively less annotated. Network-based candidate gene ranking approaches can be broadly grouped into two categories: parameter-based and parameter-free methods. The parameter-based methods, such as PageRank with Priors (PRP), Random Walk (RW) and PRIoritizatioN and Complex Elucidation (PRINCE), usually require auxiliary parameters that need to be trained by using available data sets. PRP, for instance, needs a parameter β to control the probability of jumping back to the initial node, and the PRINCE algorithm uses a parameter to describe the relative importance of prior information. Since selecting optimal parameters can be a challenge, parameter-free approaches are preferred and considered more user-friendly. Additionally, most parameter-based approaches take into account the global information in the entire network, which often requires extensive computation. For example, in PRP, the scores of all the nodes need to be updated iteratively until they converge. This process typically becomes extremely slow and inefficient, especially when the network is large. The parameter-free methods (e.g. Interconnectedness or ICN), on the other hand, measure the closeness of each candidate gene to known disease genes by taking into account the direct links and the shared neighbors between two genes, and tend to be less computationally intensive. The performance of parameter-free methods, however, is usually not comparable to that of parameter-based ranking approaches. Here, we report a novel network-based parameter-free framework for discovering and prioritizing candidate orphan disease genes. We specifically focus on two aspects: a) enhancing prioritizing performance compared to current parameter-free methods and b) achieving a performance comparable to the parameter-based ones.
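To illustrate why parameter-based methods such as PRP carry both a tunable parameter and an iterative cost, the update described above can be sketched as a simple power iteration. This is a minimal sketch under stated assumptions, not the JUNG-based implementation used in this study: the function name `pagerank_with_priors`, the uniform prior over seed genes, the even split of a node's score among its neighbors, and the convergence tolerance are all illustrative choices.

```python
def pagerank_with_priors(adj, seeds, beta=0.3, tol=1e-10, max_iter=1000):
    """Illustrative PageRank-with-priors on an undirected network.

    adj   : dict mapping each node to the set of its neighbours
    seeds : known disease genes; the prior is uniform over them (assumption)
    beta  : back probability of jumping to the seed set
    """
    seeds = set(seeds) & set(adj)
    if not seeds:
        raise ValueError("at least one seed must be present in the network")
    prior = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adj}
    score = dict(prior)
    for _ in range(max_iter):
        new_score = {}
        for n in adj:
            # Each neighbour m passes on an equal share of its current score.
            inflow = sum(score[m] / len(adj[m]) for m in adj[n])
            new_score[n] = (1.0 - beta) * inflow + beta * prior[n]
        change = sum(abs(new_score[n] - score[n]) for n in adj)
        score = new_score
        if change < tol:  # every node is revisited until scores stabilise
            break
    return score


# Toy path network a - b - c with a single seed gene "a".
toy_adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
toy_scores = pagerank_with_priors(toy_adj, ["a"], beta=0.3)
```

Every iteration touches every node, which is the source of the computational cost noted above, and the ranking can change with the choice of `beta`.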
We test, in a leave-one-out cross-validation setting, the utility of our approach in prioritizing genes for 172 ODs with at least five known causal genes (from the Orphanet database). We compare the performance of our method to two approaches, one each from the parameter-based and parameter-free categories. To demonstrate the utility of our approach, we rank the immediate neighbors of known OD genes as potential novel candidate genes. The immediate neighboring gene sets were compiled using (a) protein interactions; (b) functional linkage network [32, 33]; and (c) literature co-citations.
Results and discussion
Vertex similarity (VS) based candidate gene ranking
Hypothesizing that genes that are connected to one or more known disease genes ("seed genes") are also probably implicated in the same disease, our goal is to find such novel candidate genes with "strong" associations to the seed genes. Our proposed VS-based candidate gene ranking approach is based on the guilt-by-association principle. Two nodes or vertices are considered similar if their immediate neighbors in the network are themselves similar (common biological process, pathway, etc.). This principle is used to build a self-consistent matrix formulation of functional similarity that can be evaluated iteratively using only knowledge of the adjacency matrix of the network (based on functional annotations of genes). To this effect, we consider the similarity between two vertices (genes) as a measure of their association strength in a network. Thus, two vertices with a high similarity are likely to be strongly related. In order to find the similarities between the seed and the candidate or test set genes, we introduce a vertex similarity measurement in our algorithm. Vertex similarity, which defines the similarity of two vertices based on the structure of the network, has been used for information retrieval in the World Wide Web and in social network analysis. Similarity measurements, such as cosine similarity, have been successfully applied for computing the similarity between documents which are described as vectors of keywords. However, to the best of our knowledge, there have been no reports of using it as a measure to compute the similarity between two genes in a functional network and of using it to rank candidate disease genes.
where Γ_A and Γ_B represent the degree (number of connections or edges the node has to other nodes) of nodes A and B respectively, and σ_shared = |Γ_A ∩ Γ_B| represents the number of neighbor nodes shared by both A and B.
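To make the quantities above concrete, the following sketch computes a neighbor-based similarity from the two degrees and the shared-neighbor count. It deliberately uses the cosine (Salton) index, shared neighbors divided by the square root of the product of the degrees, as a stand-in; this is an assumption chosen for illustration and should not be read as the exact VS formula. The function name `neighbor_similarity` and the toy network are likewise hypothetical.

```python
import math


def neighbor_similarity(adj, a, b):
    """Cosine-style similarity between nodes a and b (illustrative stand-in).

    adj maps each node to the set of its neighbours, so len(adj[a]) plays
    the role of the degree of A and len(adj[a] & adj[b]) the role of
    sigma_shared.
    """
    degree_a = len(adj[a])
    degree_b = len(adj[b])
    if degree_a == 0 or degree_b == 0:
        return 0.0  # an isolated node shares no neighbours with anything
    shared = len(adj[a] & adj[b])
    return shared / math.sqrt(degree_a * degree_b)


# Toy network: A and B are not linked directly but share both neighbours.
toy_adj = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"A", "B"},
    "D": {"A", "B"},
}
similarity_ab = neighbor_similarity(toy_adj, "A", "B")  # 2 / sqrt(2 * 2)
```

Normalizing by the degrees keeps highly connected hub genes from appearing similar to everything simply because they have many neighbors.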
where C_k is a node on the shortest path between A and B, and r is the discovery range that controls the maximum degree of separation (at most r hops). In other words, if the shortest path length between nodes A and B is more than r hops, or if there is no path between them, Sim(A, B) equals 0.
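The cut-off rule for the discovery range can be sketched with a breadth-first search that stops expanding once r hops have been used. Only the cut-off is illustrated here; how scores are combined along the shortest path is not reproduced. The function name `hops_within_range` and the toy network are assumptions made for this example.

```python
from collections import deque


def hops_within_range(adj, source, target, r):
    """Return the shortest-path length in hops if it is at most r, else None.

    adj maps each node to the set of its neighbours. A result of None
    corresponds to the case in which Sim(source, target) is set to 0.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == r:
            continue  # never expand beyond the discovery range
        for neighbour in adj[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Toy path network a - b - c - d.
toy_adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
two_hops = hops_within_range(toy_adj, "a", "c", r=2)      # within range
out_of_range = hops_within_range(toy_adj, "a", "d", r=2)  # None: three hops away
```

Because the search never goes deeper than r hops, the work per gene pair stays local to the neighborhood of the source node rather than covering the whole network.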
where Sim(i, j) is the connection score between genes i and j. All candidate genes are then ranked based on these scores.
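The ranking step can be sketched as follows. This sketch assumes that a candidate's overall score is the sum of its pairwise scores Sim(i, j) over all seed genes j; the aggregation rule, the function name `rank_candidates`, and the toy score table are assumptions made for illustration only.

```python
def rank_candidates(candidates, seeds, sim):
    """Rank candidate genes by their summed similarity to the seed genes.

    sim is any callable sim(i, j) -> float. Summing over the seeds is an
    assumption of this sketch. Ties are broken alphabetically so that the
    output is deterministic.
    """
    scores = {
        gene: sum(sim(gene, seed) for seed in seeds if seed != gene)
        for gene in candidates
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical pairwise scores between candidates (G*) and seeds (S*).
toy_scores = {
    ("G1", "S1"): 0.5,
    ("G1", "S2"): 0.25,
    ("G2", "S1"): 0.1,
    ("G3", "S2"): 0.9,
}


def toy_sim(i, j):
    return toy_scores.get((i, j), 0.0)


ranking = rank_candidates(["G1", "G2", "G3"], ["S1", "S2"], toy_sim)
```

Since the scoring function is passed in as an argument, the same ranking routine can be reused with any pairwise similarity measure.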
Comparison with other network-based prioritization algorithms
To evaluate the performance of our VS-based approach in candidate disease gene ranking, we selected two methods for comparison, one each from the parameter-based and parameter-free categories: PageRank with Priors (PRP) and Interconnectedness (ICN). Parts of the PRP implementation use the JUNG (Java Universal Network/Graph; jung.sourceforge.net) framework, as described earlier. We used a leave-one-out cross-validation procedure. In each cross-validation trial, we removed a single OD causal gene ("target gene") from the data, and each of the three algorithms was evaluated by the rank it assigned to the target gene (see Methods for additional details).
When we increased the rank cut-off (k), the VS-based approach performed as well as PRP with a back probability of 0.3. Additionally, our VS-based approach performed better than ICN, another parameter-free method. We also note that VS outperformed PRP when the back probability was set to 0.05.
However, since biological networks tend to have low diameters, we believe that low values of the steps/hops are preferable. Interestingly, a previous study provided examples of two real data applications where the number of hops or steps between disease causal genes (m) was set to two and reported that m = 2 was preferable over m = 1. Since the edge information between two genes may be noisy or incomplete, we believe that our VS-based approach for novel candidate disease gene ranking is desirable as it takes into account alternative measures of pairwise interconnectedness and is not limited to direct interactions or to having a shared neighbor node.
Identifying and ranking novel OD candidate genes with VS-based approach
Examples of orphan diseases and VS-ranked top 5 candidate genes
| Orphan disease | No. of known causal genes | VS-ranked top 5 candidate genes |
| Cone rod dystrophy | | CRB1, RDH5, USH1C, EFEMP1, CABP4 |
| Severe combined immunodeficiency | | CD3G, JAK1, ZAP70, IL2RB, IL4 |
| Fanconi anemia | | HES1, SAMD3, CYP19A1, XRCC3, USP1 |
| | | PEX7, PHEX, ABCD2, ABCD1, ABCD3 |
| Autosomal dominant Charcot-Marie-Tooth disease, type 2 | | STAT4, FAIM, MARCH5, STAT6, CRYGC |
| Gonadal dysgenesis | | ZFY, ZFX, PTCH2, SOX9, AMH |
| Hereditary nonpolyposis colon cancer | | MRC1, MSH3, CARKD, TRIT1, EXO1 |
| Papillary or follicular thyroid carcinoma | | CORO2A, ZBTB33, KIF11, AAAS, SEH1L |
| | | KCNE3, MINK1, KCNJ3, ALG10B, KCNJ9 |
| MODY syndrome | | GCKR, IDDM7, MAFA, ST6GAL1, INSRL |
Among other examples, HES1, the top-ranked gene for Fanconi anemia, is a novel interacting protein of the Fanconi anemia core complex, and cells depleted of HES1 exhibit a Fanconi anemia-like phenotype. The two top-ranked genes for gonadal dysgenesis, ZFX and ZFY, are known to function in sex differentiation, and Zfx mutant mice are reported to have fewer germ cells than wild-type mice. Likewise, maturity-onset diabetes of the young (MODY syndrome) is linked to kinetic alterations and regulation of glucokinase activity [46, 47], and in our ranking the glucokinase regulator (GCKR) is the top-ranked gene for MODY syndrome. Interestingly, a recent study in Japanese families proposes GCKR as a susceptibility gene for familial diabetes. While our ranking provides further support for the involvement of the top-ranked genes in the investigated ODs, it also suggests that the top-scoring candidates that are not previously associated with these ODs could be potential candidates for further research.
The vertex similarity (VS) method is a parameter-free approach for prioritizing candidate disease genes: it calculates the similarity between nodes rather than training and updating parameters at every step. Through cross-validation experiments we show that VS outperforms ICN, another parameter-free method, and that it is comparable to parameter-based methods such as PRP. We demonstrate the utility of the VS-based parameter-free approach in ranking OD candidate genes. Importantly, the top-ranked candidate genes for the ODs match the known literature, suggesting several novel causal relationships for further investigation.
Our approach, however, has some limitations. First, as with any candidate gene ranking approach that depends on a training set, we assume that the OD causal genes we have yet to discover will be consistent with what is already known about an OD and/or its genetic basis, which may not always be the case. This also means that our approach currently cannot be used to rank novel candidate OD genes if an OD lacks known causal genes. Similarly, even if an OD has known causal genes, VS cannot be used if no protein interactome data are available for them. An alternative in such cases would be to consider other types of networks (coexpression or functional networks). Second, it is important to note that the prioritization by our approach can only be as accurate as the current protein interactome data are. Third, if a seed gene has only one known interaction then that interactant will be ranked higher.
The OD and causal gene information was downloaded from Orphanet. We merged some of the OD subtypes of a single disease based on their given disorder names as described previously [5, 8]. From this, we selected 172 ODs that have at least five causal genes. The total number of genes across the 172 selected diseases was 1598. The human protein interactome used in this study was compiled from several resources [49–54], with both redundant interactions and self-loops removed.
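The clean-up of the merged interactome (removing redundant interactions and self-loops) can be sketched as follows. This is a minimal illustration that assumes the interactions are available as pairs of gene identifiers; the function names `build_network` and `count_interactions` and the toy edge list are hypothetical.

```python
def build_network(edges):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs.

    Self-loops are skipped, and storing neighbours in sets collapses
    redundant pairs such as (A, B) and (B, A) into a single interaction.
    """
    adj = {}
    for gene_a, gene_b in edges:
        if gene_a == gene_b:
            continue  # drop self-loops
        adj.setdefault(gene_a, set()).add(gene_b)
        adj.setdefault(gene_b, set()).add(gene_a)
    return adj


def count_interactions(adj):
    """Each undirected interaction appears in two neighbour sets."""
    return sum(len(neighbours) for neighbours in adj.values()) // 2


# Toy edge list with one duplicate (in reverse order) and one self-loop.
toy_edges = [("A", "B"), ("B", "A"), ("A", "A"), ("B", "C")]
toy_network = build_network(toy_edges)
n_interactions = count_interactions(toy_network)
```

In this sketch a protein whose only recorded interaction is a self-loop does not appear in the resulting network.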
We performed a leave-one-out cross-validation using the 172 ODs and the 1312 OD-causing genes that are present in the PPI network. We used the human protein interaction network as the global network to evaluate the prioritizing performance of VS and the other two methods. The human protein interactome used in our study contains protein-protein interactions from large-scale yeast two-hybrid experiments [49, 50], computational predictions, and curation of the literature [52–54], with both redundant interactions and self-loops removed. The assembled PPI network consists of 11,765 proteins and 69,167 interactions. In each validation trial, one seed gene ("target gene") from one of the 172 selected ODs was picked out and mixed with 99 random genes from the PPI network to form a test set of 100 candidate genes. The remaining seed genes of that OD were used as the training set. The test set genes were then prioritized using the three approaches: PRP (with back probabilities of 0.3 and 0.05), ICN, and the VS-based approach. During each run, the rank of the target gene was noted. We evaluated the performance of each algorithm in terms of the success rate versus the rank cut-off (k). If the target gene is ranked among the top k in a particular validation run, the run is considered a 'success'. The validation runs are repeated until all the seed genes have been used as the target gene and their ranks have been obtained. The "success rate" is defined as the ratio of successful validation runs to the total number of validation runs for all the OD genes from the 172 ODs. The same strategy was followed for all three algorithms. In the case of PRP, which is a parameter-based method, we selected a back probability of 0.3 since we have shown previously that the performance of PRP in ranking candidate disease genes was best at p = 0.3.
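The validation protocol above can be sketched in two small steps: assembling a 100-gene test set around a target gene, and computing the success rate at a rank cut-off k from the recorded target ranks. The function names `build_test_set` and `success_rate`, the exclusion of the remaining seed genes from the random draw, and the toy ranks are assumptions made for this illustration.

```python
import random


def build_test_set(target, network_genes, training_seeds, n_random=99, rng=None):
    """Mix the target gene with n_random genes drawn from the network.

    Excluding the training seeds from the random draw is an assumption of
    this sketch; the pool is sorted so that a seeded rng is reproducible.
    """
    rng = rng or random.Random()
    excluded = set(training_seeds) | {target}
    pool = sorted(gene for gene in network_genes if gene not in excluded)
    return [target] + rng.sample(pool, n_random)


def success_rate(target_ranks, k):
    """Fraction of validation runs in which the target is in the top k."""
    if not target_ranks:
        raise ValueError("no validation runs were recorded")
    successes = sum(1 for rank in target_ranks if rank <= k)
    return successes / len(target_ranks)


# Hypothetical ranks of the target gene in four validation runs.
toy_ranks = [1, 3, 12, 50]
rate_at_10 = success_rate(toy_ranks, k=10)  # two of four runs succeed: 0.5

toy_genes = ["gene%d" % i for i in range(500)]
toy_test_set = build_test_set("gene0", toy_genes, ["gene1", "gene2"],
                              rng=random.Random(0))
```

Evaluating `success_rate` over a range of k values gives the success rate versus rank cut-off curve used to compare the three methods.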
Test set genes for identifying and ranking OD candidate genes
For identifying and ranking novel OD candidate genes, we used the immediate neighbors of known OD genes as the test set. The immediate neighboring genes of the selected ODs' causal genes were compiled based on (a) protein interactions; (b) functional linkage network [32, 33]; and (c) literature co-citations. The protein interactome data, as described earlier, were compiled from several resources [49–54]. The functional linkage network-based candidate gene sets were derived from two resources: (i) HumanNet, a probabilistic functional gene network of Homo sapiens, and (ii) a functional protein interaction network built upon expert-curated pathways. The test set genes based on literature co-citations were compiled using the OMIM database. Briefly, for the selected ODs, we identified the corresponding OMIM records, which summarize results from publications about gene-disease relationships. For the OD-mapped OMIM records, we first extracted the cited literature (links to PubMed records for the references cited in an OMIM entry). Using these OD-related PubMed records, we extracted the related genes from the 'gene2pubmed' file from NCBI. For a given OD with known causal genes, we pooled all neighboring genes (immediate neighbors or direct interactants) of the causal genes from the different sources and used them as a test set for ranking in the global protein interactome using the VS-based approach.
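The literature co-citation step can be sketched as a lookup in the 'gene2pubmed' file. The sketch assumes the tab-delimited layout of that file (a header line starting with '#', followed by tax_id, GeneID and PubMed_ID columns) and restricts the lookup to human genes (taxonomy ID 9606); the function name `genes_for_pubmed_ids` and the example rows are made up for illustration.

```python
def genes_for_pubmed_ids(lines, pubmed_ids, tax_id="9606"):
    """Collect GeneIDs linked to any of the given PubMed IDs.

    lines      : iterable of text lines, e.g. an open gene2pubmed file
    pubmed_ids : PubMed IDs cited in the OD-mapped OMIM records
    tax_id     : taxonomy ID to keep (9606 = Homo sapiens)
    """
    wanted = {str(pmid) for pmid in pubmed_ids}
    genes = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip the header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # ignore malformed rows
        row_tax, gene_id, pmid = fields[0], fields[1], fields[2]
        if row_tax == tax_id and pmid in wanted:
            genes.add(gene_id)
    return genes


# Made-up rows in the assumed layout (not real gene-publication links).
toy_lines = [
    "#tax_id\tGeneID\tPubMed_ID\n",
    "9606\t1001\t111\n",
    "9606\t1002\t222\n",
    "10090\t2001\t111\n",
]
toy_genes = genes_for_pubmed_ids(toy_lines, [111])
```

With a real file, the same function can be applied to an open file handle, for example `with open("gene2pubmed") as handle: genes_for_pubmed_ids(handle, pmids)`, where the local file name is again an assumption.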
This work was supported in part by Cincinnati Digestive Health Sciences Center (Public Health Service Grant P30 DK078392) and Cincinnati Children's Hospital Medical Center.
This article has been published as part of BMC Systems Biology Volume 6 Supplement 3, 2012: Proceedings of The International Conference on Intelligent Biology and Medicine (ICIBM) - Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/6/S3.
1. Dear JW, Lilitkarntakul P, Webb DJ: Are rare diseases still orphans or happily adopted? The challenges of developing and using orphan medicinal products. Br J Clin Pharmacol. 2006, 62 (3): 264-271. 10.1111/j.1365-2125.2006.02654.x.
2. Schieppati A, Henter JI, Daina E, Aperia A: Why rare diseases are an important medical and social issue. Lancet. 2008, 371 (9629): 2039-2041. 10.1016/S0140-6736(08)60872-7.
3. Stolk P, Willemen MJ, Leufkens HG: Rare essentials: drugs for rare diseases as essential medicines. Bull World Health Organ. 2006, 84 (9): 745-751. 10.2471/BLT.06.031518.
4. Piro RM, Di Cunto F: Computational approaches to disease-gene prediction: rationale, classification and successes. FEBS J. 2012.
5. Zhang M, Zhu C, Jacomy A, Lu LJ, Jegga AG: The orphan disease networks. Am J Hum Genet. 2011, 88 (6): 755-766. 10.1016/j.ajhg.2011.05.006.
6. Wu X, Jiang R, Zhang MQ, Li S: Network-based global inference of human disease genes. Mol Syst Biol. 2008, 4: 189.
7. Vanunu O, Magger O, Ruppin E, Shlomi T, Sharan R: Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol. 2010, 6 (1): e1000641. 10.1371/journal.pcbi.1000641.
8. Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabasi AL: The human disease network. Proc Natl Acad Sci USA. 2007, 104 (21): 8685-8690. 10.1073/pnas.0701361104.
9. Feldman I, Rzhetsky A, Vitkup D: Network properties of genes harboring inherited disease mutations. Proc Natl Acad Sci USA. 2008, 105 (11): 4323-4328. 10.1073/pnas.0701722105.
10. Ayme S: [Orphanet, an information site on rare diseases]. Soins. 2003, 672: 46-47.
11. Hamosh A, Scott AF, Amberger J, Valle D, McKusick VA: Online Mendelian Inheritance in Man (OMIM). Hum Mutat. 2000, 15 (1): 57-61. 10.1002/(SICI)1098-1004(200001)15:1<57::AID-HUMU12>3.0.CO;2-G.
12. Hardy J, Singleton A: Genomewide association studies and human disease. N Engl J Med. 2009, 360 (17): 1759-1768. 10.1056/NEJMra0808700.
13. Adie EA, Adams RR, Evans KL, Porteous DJ, Pickard BS: Speeding disease gene discovery by sequence based candidate prioritization. BMC Bioinformatics. 2005, 6: 55. 10.1186/1471-2105-6-55.
14. Adie EA, Adams RR, Evans KL, Porteous DJ, Pickard BS: SUSPECTS: enabling fast and effective prioritization of positional candidates. Bioinformatics. 2006, 22 (6): 773-774. 10.1093/bioinformatics/btk031.
15. Aerts S, Lambrechts D, Maity S, Van Loo P, Coessens B, De Smet F, Tranchevent LC, De Moor B, Marynen P, Hassan B, Carmeliet P, Moreau Y: Gene prioritization through genomic data fusion. Nat Biotechnol. 2006, 24 (5): 537-544. 10.1038/nbt1203.
16. Chen J, Bardes EE, Aronow BJ, Jegga AG: ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res. 2009, 37 (Web Server issue): W305-311.
17. Chen J, Xu H, Aronow BJ, Jegga AG: Improved human disease candidate gene prioritization using mouse phenotype. BMC Bioinformatics. 2007, 8: 392. 10.1186/1471-2105-8-392.
18. Freudenberg J, Propping P: A similarity-based method for genome-wide prediction of disease-relevant human genes. Bioinformatics. 2002, 18 (Suppl 2): S110-115. 10.1093/bioinformatics/18.suppl_2.S110.
19. Thornblad TA, Elliott KS, Jowett J, Visscher PM: Prioritization of positional candidate genes using multiple web-based software tools. Twin Res Hum Genet. 2007, 10 (6): 861-870. 10.1375/twin.10.6.861.
20. Tiffin N, Adie E, Turner F, Brunner HG, van Driel MA, Oti M, Lopez-Bigas N, Ouzounis C, Perez-Iratxeta C, Andrade-Navarro MA, Adeyemo A, Patti ME, Semple CA, Hide W: Computational disease gene identification: a concert of methods prioritizes type 2 diabetes and obesity candidate genes. Nucleic Acids Res. 2006, 34 (10): 3067-3081. 10.1093/nar/gkl381.
21. Tiffin N, Kelso JF, Powell AR, Pan H, Bajic VB, Hide WA: Integration of text- and data-mining using ontologies successfully selects disease gene candidates. Nucleic Acids Res. 2005, 33 (5): 1544-1552. 10.1093/nar/gki296.
22. Turner FS, Clutterbuck DR, Semple CA: POCUS: mining genomic sequence annotation to predict disease genes. Genome Biol. 2003, 4 (11): R75. 10.1186/gb-2003-4-11-r75.
23. Zhu M, Zhao S: Candidate gene identification approach: progress and challenges. Int J Biol Sci. 2007, 3 (7): 420-427.
24. Sam L, Liu Y, Li J, Friedman C, Lussier YA: Discovery of protein interaction networks shared by diseases. Pacific Symposium on Biocomputing. 2007, 76-87.
25. Goehler H, Lalowski M, Stelzl U, Waelter S, Stroedicke M, Worm U, Droege A, Lindenberg KS, Knoblich M, Haenig C, Herbst M, Suopanki J, Scherzinger E, Abraham C, Bauer B, Hasenbank R, Fritzsche A, Ludewig AH, Bussow K, Coleman SH, Gutekunst CA, Landwehrmeyer BG, Lehrach H, Wanker EE: A protein interaction network links GIT1, an enhancer of huntingtin aggregation, to Huntington's disease. Mol Cell. 2004, 15 (6): 853-865. 10.1016/j.molcel.2004.09.016.
26. Lage K, Karlberg EO, Storling ZM, Olason PI, Pedersen AG, Rigina O, Hinsby AM, Tumer Z, Pociot F, Tommerup N, et al: A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol. 2007, 25 (3): 309-316. 10.1038/nbt1295.
27. Kohler S, Bauer S, Horn D, Robinson PN: Walking the interactome for prioritization of candidate disease genes. Am J Hum Genet. 2008, 82 (4): 949-958. 10.1016/j.ajhg.2008.02.013.
28. Chen J, Aronow BJ, Jegga AG: Disease candidate gene identification and prioritization using protein interaction networks. BMC Bioinformatics. 2009, 10: 73.
29. Hsu C, Huang Y, Hsu C, Yang U: Prioritizing disease candidate genes by a gene interconnectedness-based approach. BMC Genomics. 2011, 12 (Suppl 3): S25. 10.1186/1471-2164-12-S3-S25.
30. Chen X, Yan GY, Liao XP: A novel candidate disease genes prioritization method based on module partition and rank fusion. OMICS. 2010, 14 (4): 337-356. 10.1089/omi.2009.0143.
31. Sun PG, Gao L, Han S: Prediction of human disease-related gene clusters by clustering analysis. Int J Biol Sci. 2010, 7 (1): 61-73.
32. Wu G, Feng X, Stein L: A human functional protein interaction network and its application to cancer data analysis. Genome Biol. 2010, 11 (5): R53. 10.1186/gb-2010-11-5-r53.
33. Lee I, Blom UM, Wang PI, Shim JE, Marcotte EM: Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Res. 2011, 21 (7): 1109-1121. 10.1101/gr.118992.110.
34. Kleinberg J: Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM). 1999, 46 (5): 604-632.
35. Leicht EA, Holme P, Newman MEJ: Vertex similarity in networks. Physical Review E. 2006, 73 (2): 026120.
36. Banerjee A, Dhillon IS, Ghosh J, Sra S: Clustering on the unit hypersphere using von Mises-Fisher distributions. Journal of Machine Learning Research. 2005, 6: 1345-1382.
37. O'Madadhain J, Fisher D, Smyth P, White S, Boey Y: Analysis and visualization of network data using JUNG. Journal of Statistical Software. 2005, 10 (2): 1-35.
38. Gillis J, Pavlidis P: The role of indirect connections in gene networks in predicting function. Bioinformatics. 2011, 27 (13): 1860-1866. 10.1093/bioinformatics/btr288.
39. Newman M: The structure and function of complex networks. SIAM Review. 2003, 45 (2): 167-256. 10.1137/S003614450342480.
40. Yip AM, Horvath S: Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinformatics. 2007, 8: 22. 10.1186/1471-2105-8-22.
41. Littink KW, Koenekoop RK, van den Born LI, Collin RW, Moruz L, Veltman JA, Roosing S, Zonneveld MN, Omar A, Darvish M, Lopez I, Kroes HY, van Genderen MM, Hoyng CB, Rohrschneider K, van Schooneveld MJ, Cremers FP, den Hollander AI: Homozygosity mapping in patients with cone-rod dystrophy: novel mutations and clinical characterizations. Invest Ophthalmol Vis Sci. 2010, 51 (11): 5943-5951. 10.1167/iovs.10-5797.
42. Littink KW, van Genderen MM, Collin RW, Roosing S, de Brouwer AP, Riemslag FC, Venselaar H, Thiadens AA, Hoyng CB, Rohrschneider K, den Hollander AI, Cremers FP, van den Born LI: A novel homozygous nonsense mutation in CABP4 causes congenital cone-rod synaptic disorder. Invest Ophthalmol Vis Sci. 2009, 50 (5): 2344-2350. 10.1167/iovs.08-2553.
43. Pellikka M, Tanentzapf G, Pinto M, Smith C, McGlade CJ, Ready DF, Tepass U: Crumbs, the Drosophila homologue of human CRB1/RP12, is essential for photoreceptor morphogenesis. Nature. 2002, 416 (6877): 143-149. 10.1038/nature721.
44. Tremblay CS, Huang FF, Habi O, Huard CC, Godin C, Levesque G, Carreau M: HES1 is a novel interactor of the Fanconi anemia core complex. Blood. 2008, 112 (5): 2062-2070. 10.1182/blood-2008-04-152710.
45. Luoh SW, Bain PA, Polakiewicz RD, Goodheart ML, Gardner H, Jaenisch R, Page DC: Zfx mutation results in small animal size and reduced germ cell number in male and female mice. Development. 1997, 124 (11): 2275-2284.
46. Garcia-Herrero CM, Galan M, Vincent O, Flandez B, Gargallo M, Delgado-Alvarez E, Blazquez E, Navas MA: Functional analysis of human glucokinase gene mutations causing MODY2: exploring the regulatory mechanisms of glucokinase activity. Diabetologia. 2007, 50 (2): 325-333. 10.1007/s00125-006-0542-7.
47. Galan M, Vincent O, Roncero I, Azriel S, Boix-Pallares P, Delgado-Alvarez E, Diaz-Cadorniga F, Blazquez E, Navas MA: Effects of novel maturity-onset diabetes of the young (MODY)-associated mutations on glucokinase activity and protein stability. Biochem J. 2006, 393 (Pt 1): 389-396.
48. Tanaka D, Nagashima K, Sasaki M, Yamada C, Funakoshi S, Akitomo K, Takenaka K, Harada K, Koizumi A, Inagaki N: GCKR mutations in Japanese families with clustered type 2 diabetes. Mol Genet Metab. 2011, 102 (4): 453-460. 10.1016/j.ymgme.2010.12.009.
49. Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksoz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker EE: A human protein-protein interaction network: a resource for annotating the proteome. Cell. 2005, 122 (6): 957-968. 10.1016/j.cell.2005.08.029.
50. Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, Smolyar A, Bosak S, Sequerra R, Doucette-Stamm L, Cusick ME, Hill DE, Roth FP, Vidal M: Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005, 437 (7062): 1173-1178. 10.1038/nature04209.
51. Ramani AK, Bunescu RC, Mooney RJ, Marcotte EM: Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome. Genome Biol. 2005, 6 (5): R40. 10.1186/gb-2005-6-5-r40.
52. Prasad TS, Kandasamy K, Pandey A: Human Protein Reference Database and Human Proteinpedia as discovery tools for systems biology. Methods Mol Biol. 2009, 577: 67-79. 10.1007/978-1-60761-232-2_6.
53. Joshi-Tope G, Gillespie M, Vastrik I, D'Eustachio P, Schmidt E, de Bono B, Jassal B, Gopinath GR, Wu GR, Matthews L, Lewis S, Birney E, Stein L: Reactome: a knowledgebase of biological pathways. Nucleic Acids Res. 2005, 33 (Database issue): D428-432.
54. Alfarano C, Andrade CE, Anthony K, Bahroos N, Bajec M, Bantoft K, Betel D, Bobechko B, Boutilier K, Burgess E, Buzadzija K, Cavero R, D'Abreo C, Donaldson I, Dorairajoo D, Dumontier MJ, Dumontier MR, Earles V, Farrall R, Feldman H, Garderman E, Gong Y, Gonzaga R, Grytsan V, Gryz E, Gu V, Haldorsen E, Halupa A, Haw R, Hrvojic A, Hurrell L, Isserlin R, Jack F, Juma F, Khan A, Kon T, Konopinsky S, Le V, Lee E, Ling S, Magidin M, Moniakis J, Montojo J, Moore S, Muskat B, Ng I, Paraiso JP, Parker B, Pintilie G, Pirone R, Salama JJ, Sgro S, Shan T, Shu Y, Siew J, Skinner D, Snyder K, Stasiuk R, Strumpf D, Tuekam B, Tao S, Wang Z, White M, Willis R, Wolting C, Wong S, Wrong A, Xin C, Yao R, Yates B, Zhang S, Zheng K, Pawson T, Ouellette BF, Hogue CW: The Biomolecular Interaction Network Database and related tools 2005 update. Nucleic Acids Res. 2005, 33 (Database issue): D418-424.
55. Maglott D, Ostell J, Pruitt KD, Tatusova T: Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 2011, 39 (Database issue): D52-57.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.