Systems biological approach on neurological disorders: a novel molecular connectivity to aging and psychiatric diseases
© Ahmed et al; licensee BioMed Central Ltd. 2011
Received: 23 June 2010
Accepted: 12 January 2011
Published: 12 January 2011
Systems-biology approaches based on molecular connectivity maps have attracted great interest as a way to understand functional similarities between the genes of different diseases. In this study, we developed a computational framework that builds molecular connectivity maps by integrating the mutated and differentially expressed genes of neurological and psychiatric diseases, in order to determine their relationship with aging.
A systematic large-scale analysis of 124 human diseases produced three classes of molecular connectivity maps. First, the molecular interaction network of disease proteins comprises 3632 proteins with 6172 interactions and identifies the genes/proteins shared between diseases. Second, the disease-disease network includes 4845 positively scored disease-disease relationships. Comparison of these disease pairs with the Medical Subject Headings (MeSH) classification tree suggests that 25% of the pairs fall in the same disease area; the remainder may represent novel disease-disease relationships based on gene/protein similarity. Inclusion of an aging gene set showed that 79 neurological and 20 psychiatric diseases have a strong association with aging. Third, a curated disease-biomarker network was created by relating proteins/genes to specific disease contexts; this analysis yielded 73 markers for 24 diseases. The overall quality of the results was assessed with a series of statistical methods to exclude insignificant data from the biological networks.
This study improves the understanding of the complex interactions between neurological and psychiatric diseases and aging, which in turn helps to identify diagnostic markers. The disease-disease association results could also help to determine symptom relationships between neurological and psychiatric diseases. Together, our study presents many research opportunities in post-genomic biomarker development.
Systems biology is an indispensable approach for studying the complex mechanisms of any disease or disorder. In the post-genomic era, genomics and proteomics data have accumulated rapidly. However, the detailed molecular mechanisms of several neurological disorders are still poorly understood [1, 2]. As a result, the molecular diagnosis of most neurological disorders remains difficult, and diagnosis is mostly carried out by neurological examination. Current molecular connectivity approaches in systems biology focus mainly on building large protein networks without probing the interaction mechanisms specific to a disorder or disease condition [4, 5]. Hence, finding successful biomarkers through a systems biology approach is difficult. To gain a better understanding of molecular mechanisms, disease relationships and biomarkers, the focus needs to be on genes implicated within similar disorders.
Systems biological models of disease interaction have usually been built by collecting signature genes of genetically heterogeneous hereditary diseases and investigating how different mutations in the same gene (allelic heterogeneity) give rise to different disorders. Similar approaches have been followed for differentially regulated genes, linking them to various diseases. Here, we took an integrated approach that combines mutated and differentially regulated genes and explores the diseasome network corresponding to neurological and psychiatric diseases. Such an integrative approach improves the confidence of finding disease-specific markers. We chose an integrative approach to neurological disorders for two reasons. First, neurological disorders are considerably less well understood, because brain tissue is difficult to obtain in many cases. Second, their prevalence is increasing [8, 9], and molecular diagnosis is lacking for most neurological disorders [10, 11].
In this study, we propose an integrative, network-based model of the mutated and differentially regulated genes of 100 neurological and 24 psychiatric diseases (see Additional File 1 for the disease categories) that identifies the relationships between neurological and psychiatric diseases and their association with aging. Furthermore, this network model helps to understand the mechanisms shared between diseases through a common pathway network (CPN). Overall, our findings highlight the importance of integrating the gene/protein data of neurological diseases into future molecular biomarker and drug target discovery.
Results and Discussion
APLP2 (4), NEU2 (1.4), PCDH11X (0.5), SUMF1 (0.5), TOMM40 (0.5)
Amyotrophic Lateral Sclerosis
ALS2CR8 (0.5), DERL1 (0.5), FUS (1), HOPX (0.5), KIF1A (0.5), MOBKL2B (0.5), NIF3L1 (0.5), SCN7A (0.5), SEMA6A (1.4), SLC39A11 (0.5), STRADB (0.5), SUSD1 (0.5), UNC13A (0.5), ZFP64 (0.5)
ARID4A (1.4), ARID4B (0.5), MKRN3 (1), NDNL2 (0.5), NIPA2 (0.5), PHLDA2 (0.5), SLC9A6 (0.5), TSPAN32 (0.5), TSSC4 (0.5)
DDX10 (1), HEPACAM (1.4), TCL1A (0.5)
ARFGEF1 (0.5), CCPG1 (0.5), PDGFC (0.5)
Fatal Familial Insomnia
AKAP8L (0.5), ARFIP2 (0.5)
Lambert-Eaton Myasthenic Syndrome
RAPSN (0.5), SOX1 (0.5)
Major Depressive Disorder
FKBP5 (1), PCLO (2)
Manic Depressive Psychosis
CSMD1 (0.5), FCGR1A (2), FCGR1B (2), FCGR2A (3.5), FCGR2B (5), FCGR2C (4.1), FCGR3A (4.1), FCGR3B (0.5), IFNB1 (0.5)
Multiple System Atrophy
AGTPBP1 (0.5), EXTL3 (0.5)
CALB1 (5), CSF1R (5), MT-CYB (0.5), CHAC1 (0.5), NPFF (0.5)
Restless Legs Syndrome
AP3B2 (2), HMBS (0.5), SETD2 (1.4), ST6GAL2 (3.5)
NINJ2 (0.5), PROZ (0.5)
von Hippel-Lindau Disease
Cross-validation of network
To validate our computational approach, the results of this study were compared with those of the approaches of Goni et al and Goh et al [19, 4]. Our results agree with those of Goni et al, which show an interaction between Alzheimer's disease and multiple sclerosis. Several other studies also confirm the molecular relationship between Alzheimer's disease and multiple sclerosis [20–22]. However, this interaction was not found with the approach of Goh et al. This is because the molecular connectivity of Goh et al was built on mutated genes only, whereas our approach uses both differentially expressed and mutated disease genes to generate the DDN. This comparison supports the effectiveness of integrating differentially expressed and mutated genes for obtaining reliable disease-disease relationships. In addition, the biomarkers proposed in our study were cross-validated against the genetic association database (GWAS) to confirm their disease specificity in the context of neurological or psychiatric diseases. Of the 73 biomarkers we identified, only 27 had disease association information, while no information was available in the GWAS database for the remaining 46. This is because the genetic associations of some diseases have not been included in the GWAS database. The precision (positive predictive value, PPV) was therefore calculated on these 27 biomarkers only. All 27 biomarkers were confirmed to be specific to their diseases within the set of analyzed disorders, so the PPV was 100%.
Although our approach determines disease-disease interactions and biomarkers with good accuracy, it has a limitation with respect to biomarker detection. In medicine, biomarkers are molecules that are specific to a pathological condition. Since our study is focused on neurological and psychiatric diseases, the biomarkers obtained are specific to their diseases only within the set of neurological and psychiatric disorders. It is therefore possible that some of these 73 biomarkers are associated with disorders outside the neurological and psychiatric disease areas. This limitation could be avoided by including all disorders in the network and applying our biomarker strategy to it. Nevertheless, using the available information for the 27 biomarkers, we validated them across the GWAS database. The results confirm that 15 biomarkers are specific to their disease and have no association with any other disorder (Table 1).
In conclusion, disease-disease relationships are of great interest because such knowledge not only enhances our understanding of disease mechanisms but also accelerates many aspects of biomarker and drug target discovery. These results may be of interest to neurologists, and our method can be generalized to other disease areas for systems biological investigation. We believe that our approach to understanding the mechanisms involved in neurological disease has given valuable insight into its relationship with aging and psychiatric illness. Moreover, these combined efforts resulted in the identification of biomarkers that may improve the diagnosis of neurological and psychiatric diseases.
Initial collection of disease related genes
Enriched protein network
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was used to collect protein interaction data to construct the disease-protein network (DPN) from 1209 seed genes. The STRING database contains experimental and predicted protein interaction data for 630 organisms, covering both eukaryotes and prokaryotes. This study includes both experimental and predicted interactions of human proteins in the disease-protein network, given the success of predicted interactions in several disease interaction studies [5, 28]. To build the disease-protein network, we extracted the proteins that interact with the seed genes/proteins with confidence scores ranging from 0.5 to 1.0. This expanded set of the initial seed proteins is denoted the enriched protein set, and the network formed by the seed and enriched sets of each disease is called a protein sub-network (PSN). The aging gene set was included in the network without enrichment so that only strong correlations with neurological and psychiatric diseases would be captured. All genes were mapped to their official gene symbols using the HUGO Gene Nomenclature Committee (HGNC) database to avoid false interactions arising from different names for the same gene/protein, and data curation was carried out using Microsoft Excel and Microsoft Access.
From these non-redundant interaction data, the disease-protein network (DPN), disease-disease network (DDN), common pathway network (CPN) and disease-biomarker network (DBN) were created and visualized using Cytoscape version 2.7.0 and NAViGaTOR version 2.1 software. In the DPN, each node represents a disease protein, and the proteins of two diseases are connected if the same proteins are associated with both diseases. In the DDN, each node represents a disease, and two diseases are connected if they share at least one protein. The CPN was created from the genes/proteins shared by each disease pair, and the DBN was created by extracting the disease-specific seed proteins from the DPN.
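The construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline (which used STRING exports curated in Excel/Access): the function names `build_psn` and `build_ddn`, the `(protein_a, protein_b, score)` tuple layout, and the placeholder protein names `P1`..`P5` are all hypothetical.

```python
from itertools import combinations

def build_psn(seeds, interactions, min_conf=0.5):
    """Protein sub-network for one disease: the seeds plus every protein
    that interacts with a seed at confidence >= min_conf."""
    seeds = set(seeds)
    edges = {(a, b) for a, b, score in interactions
             if score >= min_conf and (a in seeds or b in seeds)}
    proteins = set(seeds)
    for a, b in edges:
        proteins.update((a, b))
    return proteins, edges

def build_ddn(psn_proteins):
    """Disease-disease network: two diseases are linked when their
    sub-networks share at least one protein. Maps pair -> shared proteins."""
    ddn = {}
    for d1, d2 in combinations(sorted(psn_proteins), 2):
        shared = psn_proteins[d1] & psn_proteins[d2]
        if shared:
            ddn[(d1, d2)] = shared
    return ddn

# Toy data with placeholder names (not real interaction scores).
interactions = [("P1", "P2", 0.9), ("P2", "P3", 0.7),
                ("P3", "P4", 0.8), ("P1", "P5", 0.3)]
seeds = {"disease_A": {"P1"}, "disease_B": {"P3"}}

psn = {d: build_psn(s, interactions)[0] for d, s in seeds.items()}
ddn = build_ddn(psn)
print(ddn)
```

In this toy case the low-confidence edge to `P5` is dropped, and the two diseases are linked only through the shared enriched protein `P2`.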
Statistical significance of network
Randomly select the same number of seed proteins as in the given PSN from the Brain Gene Expression Map database.
Extract the enriched set for the randomly selected seed proteins from the STRING database.
Compute the index of aggregation.
Repeat the above steps 1000 times to generate a distribution of the index of aggregation.
Compare the index of aggregation of the protein sub-network with this distribution to calculate a p-value.
Repeat the above steps for each remaining PSN.
Finally, compute the geometric mean of the p-values obtained for the 124 PSNs.
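The resampling procedure above can be sketched as follows. Several points are assumptions rather than details taken from the text: the index of aggregation is taken here to be the fraction of proteins in the largest connected component (the text does not define it), the p-value uses a +1 correction so that it is never zero, and `background` stands in for the Brain Gene Expression Map gene list. All function names are hypothetical.

```python
import math
import random

def index_of_aggregation(nodes, edges):
    """Assumed definition: share of nodes in the largest connected component."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first walk over one component
            node = stack.pop()
            size += 1
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        best = max(best, size)
    return best / len(nodes)

def enrich(seeds, interactions):
    """Seeds plus their direct interaction partners, with the connecting edges."""
    seeds = set(seeds)
    edges = [(a, b) for a, b in interactions if a in seeds or b in seeds]
    return seeds.union(*edges), edges

def empirical_p_value(observed, n_seeds, background, interactions,
                      n_iter=1000, rng=None):
    """Fraction of random seed sets whose index is at least the observed one."""
    rng = rng or random.Random(0)
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(background, n_seeds)  # background must be a list
        nodes, edges = enrich(sample, interactions)
        if index_of_aggregation(nodes, edges) >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)  # +1 correction is an assumption

def geometric_mean(p_values):
    """Geometric mean of strictly positive p-values."""
    return math.exp(sum(math.log(p) for p in p_values) / len(p_values))
```

A per-disease p-value would be obtained by calling `empirical_p_value` once per PSN and passing the resulting list to `geometric_mean`.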
Disease-disease interaction score
Here, Pi and Pj are the total numbers of proteins for diseases i and j, respectively. Pij is the total number of proteins common to the two diseases. N is the total number of proteins in the disease-protein network. Z is a constant (Z = 1) introduced to avoid out-of-bound errors if Pi = Pj = Pij = 0. The value of Φdij is expected to be positive when the disease pair is over-represented and negative when it is under-represented.
MeSH based disease interaction mapping
Medical Subject Headings (MeSH) is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. We downloaded the tree file from MeSH, which contains 16 categories, including the disease category and the chemicals and drugs category. The neurological disease category (C10) is classified into 15 major clusters and the psychiatric disorder category (F03) into 16 major clusters. Each positively scored disease pair (Φdij) was mapped to the neurological and psychiatric disease categories to determine the reliability of the disease connectivity. For instance, a disease pair in which both diseases fall within a single major cluster is considered to have strong connectivity.
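One way to check whether a disease pair falls in the same major cluster is to compare prefixes of MeSH tree numbers. The sketch below assumes that a "major cluster" corresponds to the second level of the tree (for example `C10.228`) and that each disease may carry several tree numbers; the function names and the example tree numbers are illustrative placeholders, not values taken from the study.

```python
def major_clusters(tree_numbers, depth=2):
    """Truncate each MeSH tree number to its first `depth` components."""
    clusters = set()
    for number in tree_numbers:
        parts = number.split(".")
        if len(parts) >= depth:
            clusters.add(".".join(parts[:depth]))
    return clusters

def same_major_cluster(trees_a, trees_b, depth=2):
    """True when the two diseases share at least one major cluster."""
    return bool(major_clusters(trees_a, depth) & major_clusters(trees_b, depth))

def fraction_in_same_cluster(pairs, disease_trees):
    """Share of disease pairs whose members fall in a common major cluster."""
    if not pairs:
        return 0.0
    hits = sum(same_major_cluster(disease_trees[a], disease_trees[b])
               for a, b in pairs)
    return hits / len(pairs)

# Placeholder tree numbers for three hypothetical diseases.
disease_trees = {
    "disease_A": ["C10.228.140"],
    "disease_B": ["C10.228.662", "F03.615"],
    "disease_C": ["C10.574.100"],
}
pairs = [("disease_A", "disease_B"), ("disease_A", "disease_C")]
print(fraction_in_same_cluster(pairs, disease_trees))
```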
Common Pathway network
To understand the molecular mechanisms shared between diseases, the proteins/genes shared by each disease pair in the disease-disease network were mapped to the NCI-Nature Pathway Interaction Database (PID). PID is a manually curated human pathway database that contains 116 human pathways with 6180 interactions. PID provides a p-value based on the probability of occurrence of the proteins in a defined pathway. The lower the p-value, the stronger the association of the proteins with the given pathway. Hence, we filtered the common pathways between diseases using a p-value cutoff of 0.05.
The DPN was analyzed to determine the biomarkers for each disease involved in this study. Biomarkers were identified by finding the disease-specific seed proteins in the DPN. This was done by comparing each seed protein of one PSN with the other PSNs. If a seed protein was unique to its PSN, it was considered a biomarker (pi) for its disease.
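As an illustration of how such a pathway p-value can arise and how the 0.05 cutoff is applied, the sketch below uses a hypergeometric tail probability. This is an assumption made for illustration: the text only states that PID reports a p-value, not how it is computed. The function names, the inclusive cutoff, and the pathway labels are hypothetical.

```python
from math import comb

def hypergeom_tail(k, pathway_size, n_drawn, population):
    """P(X >= k): chance of seeing at least k pathway members among
    n_drawn proteins sampled from `population` proteins."""
    upper = min(pathway_size, n_drawn)
    total = comb(population, n_drawn)
    favourable = sum(
        comb(pathway_size, i) * comb(population - pathway_size, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total

def significant_pathways(p_values, cutoff=0.05):
    """Keep pathways whose p-value is at or below the cutoff (assumed inclusive)."""
    return {name: p for name, p in p_values.items() if p <= cutoff}

# Hypothetical example: 4 of 10 shared proteins hit a 50-protein pathway
# in a background of 3632 proteins (the DPN size reported in the abstract).
p_values = {
    "pathway_X": hypergeom_tail(4, 50, 10, 3632),
    "pathway_Y": hypergeom_tail(0, 50, 10, 3632),  # always 1.0, never kept
}
print(significant_pathways(p_values))
```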
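The uniqueness test can be expressed as a set difference. The sketch below reads "unique to its PSN" as "absent from every other disease's sub-network, seeds and enriched proteins alike"; that reading is an assumption, and the function name and placeholder proteins are hypothetical.

```python
def find_biomarkers(seeds, psn_proteins):
    """For each disease, keep the seed proteins that occur in no other PSN.

    seeds:        disease -> set of seed proteins
    psn_proteins: disease -> set of all proteins in that disease's PSN
    """
    markers = {}
    for disease, disease_seeds in seeds.items():
        # Pool every protein seen in the other diseases' sub-networks.
        others = set().union(
            *(proteins for d, proteins in psn_proteins.items() if d != disease)
        )
        markers[disease] = set(disease_seeds) - others
    return markers

# Placeholder data: P2 is a seed of disease_A but also appears in disease_B's PSN.
seeds = {"disease_A": {"P1", "P2"}, "disease_B": {"P3"}}
psn_proteins = {
    "disease_A": {"P1", "P2", "P4"},
    "disease_B": {"P3", "P2", "P5"},
}
print(find_biomarkers(seeds, psn_proteins))
```

Here `P2` is rejected as a marker for `disease_A` because it also occurs in the sub-network of `disease_B`.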
Significant enrichment biomarkers score
The functional enrichment score of the biomarkers for each disease was computed on the basis of gene ontology. The scores were calculated using the Biological Network Gene Ontology (BiNGO) plug-in for Cytoscape. BiNGO provides p-value statistics based on the probability of occurrence of the genes/proteins in the defined ontological categories. Here, the p-values for the biomarkers of each disease were calculated across all ontological categories, namely molecular function, biological process and cellular localization. The geometric mean of the p-values of each disease was then calculated and its negative logarithm was taken. The relationship between the biomarkers and their disease was considered significant if the score exceeded a threshold of 1.3.
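The score described above reduces to a short calculation. The sketch below assumes a base-10 logarithm, which is not stated in the text but is consistent with the threshold of 1.3, since -log10(0.05) is approximately 1.3. The function names are hypothetical.

```python
import math

def enrichment_score(p_values):
    """Negative log10 of the geometric mean of the p-values.

    The log of a geometric mean equals the arithmetic mean of the logs,
    so the score is simply the mean of -log10(p).
    """
    return -sum(math.log10(p) for p in p_values) / len(p_values)

def is_significant(p_values, threshold=1.3):
    """True when the enrichment score exceeds the threshold."""
    return enrichment_score(p_values) > threshold

# Hypothetical p-values for the biomarkers of one disease.
print(enrichment_score([0.01, 0.0001]))  # geometric mean is 0.001
print(is_significant([0.2, 0.5]))
```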
Biomarker scoring for diagnosis from biofluid
The urine proteome data were obtained from a published resource, and the plasma proteome data were obtained from the Human Proteome Organization database. The CSF proteome and housekeeping gene data were obtained from previous studies [35, 36]. In the scoring formula (Ψpi score), μi is scored 1.0 if the protein (pi) is encoded by a housekeeping gene and 0.5 otherwise; αi = 0.3 if the protein circulates in CSF; βi = 0.5 if it circulates in plasma; and γi = 0.7 if it circulates in urine. Absence of the protein from a biofluid is indicated by setting the corresponding term (αi, βi or γi) to 0.
Cross validation of network
TP: Number of True Positive
FP: Number of False Positive
GWAS contains disease-associated gene/protein information in the form of gene expression, proteomic expression and mutation data. Cross-validating the identified biomarkers against the GWAS database is valuable for establishing measurable thresholds for using our biomarkers in diagnosis.
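For reference, the positive predictive value is conventionally defined as TP / (TP + FP). The short sketch below applies that standard definition to the counts reported in the Results (27 biomarkers with database information, all confirmed as disease-specific); the function name is hypothetical.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP); raises if no predictions were evaluated."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("PPV is undefined when TP + FP is zero")
    return true_positives / total

# Counts from the cross-validation: 27 confirmed, 0 refuted.
ppv = positive_predictive_value(27, 0)
print(f"PPV = {ppv:.0%}")
```

Note that the 46 biomarkers without database information contribute to neither TP nor FP, so the PPV describes only the 27 that could be checked.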
We thank Drs. Mohamed Ali and Selva Murugan for their intellectual input. This work was supported by SRM University, Tamil Nadu, India.
- Hasegawa T, Mikoda N, Kitazawa M, LaFerla FM: Treatment of Alzheimer's disease with anti-homocysteic acid antibody in 3xTg-AD male mice. PLoS One. 2010, 5: e8593. 10.1371/journal.pone.0008593
- Banno H, Katsuno M, Suzuki K, Iguchi Y, Adachi H, Tanaka F, Sobue G: Molecular-targeted therapy for motor neuron disease. Brain Nerve. 2009, 61: 891-900.
- Miriam D, Sarah H, Bruce AR: Neuroscience Biomarkers and Biosignatures. Institute of Medicine of the National Academies, The National Academies Press 2008
- Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabasi AL: The human disease network. Proc Natl Acad Sci USA. 2007, 104: 8685-8690. 10.1073/pnas.0701361104
- Li J, Zhu X, Chen JY: Building Disease-Specific Drug-Protein Connectivity Maps from Molecular Interaction Networks and PubMed Abstracts. PLoS Comput Biol. 2009, 5: e1000450. 10.1371/journal.pcbi.1000450
- Vogelstein B, Lane D, Levine AJ: Surfing the p53 network. Nature. 2000, 408: 307-310. 10.1038/35042675
- Hu G, Agarwal P: Human Disease-Drug Network Based on Genomic Expression Profiles. PLoS ONE. 2009, 4: e6536. 10.1371/journal.pone.0006536
- Patel V, Simbine AP, Soares IC, Weiss HA, Wheeler E: Prevalence of severe mental and neurological disorders in Mozambique: a population-based survey. Lancet. 2007, 370: 1055-1060. 10.1016/S0140-6736(07)61479-2
- Hirtz D, Thurman DJ, Gwinn-Hardy K, Mohamed M, Chaudhuri AR, Zalutsky R: How common are the "common" neurologic disorders?. Neurology. 2007, 68: 326-337. 10.1212/01.wnl.0000252807.38124.a3
- Diagnosis Wrong. http://www.wrongdiagnosis.com/intro/difficult.htm
- Ahmed SS, Santosh W, Kumar S, Christlet HT: Neural network algorithm for the early detection of Parkinson's disease from blood plasma by FTIR micro-spectroscopy. Vib Spectrosc. 2010, 53: 181-188. 10.1016/j.vibspec.2010.01.019
- Boeve BF, Silber MH, Ferman TJ: REM sleep behavior disorder in Parkinson's disease and dementia with Lewy bodies. J Geriatr Psychiatry Neurol. 2004, 17: 146-57. 10.1177/0891988704267465
- Trenkwalder C: Sleep dysfunction in Parkinson's disease. Clin Neurosci. 1998, 5: 107-114.
- Comella CL, Nardine TM, Diederich NJ, Stebbins GT: Sleep-related violence, injury, and REM sleep behavior disorder in Parkinson's disease. Neurology. 1998, 51: 526-529.
- Carl FS, Kira A, Shiva K, Jeffrey B, Matthew D, Timo H, Buetow Kenneth H: PID: The Pathway Interaction Database. Nucleic Acids Res. 2009, 37: D674-679. 10.1093/nar/gkn653
- Filippo C, Agata C, Ferdinando N, Filippo D: Depression and Alzheimer's disease: Neurobiological links and common pharmacological targets. Eur J Pharmacol. 2010, 626: 64-71. 10.1016/j.ejphar.2009.10.022
- Ahmed SS, Santosh W, Kumar S, Christlet HT: Metabolic profiling of Parkinson's disease: evidence of biomarker from gene expression analysis and rapid neural network detection. J Biomed Sci. 2009, 16: 63. 10.1186/1423-0127-16-63
- Decramer S, Peredo A, Breuil B, Mischak H, Monsarrat B, Bascands JL, Schanstra JP: Urine in Clinical Proteomics. Mol Cell Proteomics. 2008, 7: 1850-1862. 10.1074/mcp.R800001-MCP200
- Goñi J, Esteban FJ, Mendizábal NV, Sepulcre J, Trevijano SA, Agirrezabal SA, Villoslada P: A computational analysis of protein-protein interaction networks in neurodegenerative diseases. BMC Syst Biol. 2008, 2: 52.
- Witte ME, Bol JG, Gerritsen WH, van der Valk P, Drukarch B, van Horssen J, Wilhelmus MM: Parkinson's disease-associated parkin colocalizes with Alzheimer's disease and multiple sclerosis brain lesions. Neurobiol Dis. 2009, 36: 445-452. 10.1016/j.nbd.2009.08.009
- Evangelou N, Jackson M, Beeson D, Palace J: Association of the APOE ε4 allele with disease activity in multiple sclerosis. J Neurol Neurosurg Psychiatry. 1999, 67: 203-205. 10.1136/jnnp.67.2.203
- Dal B, Bradl M, Frischer J, Kutzelnigg A, Jellinger K, Lassmann H: Multiple sclerosis and Alzheimer's disease. Ann Neurol. 2008, 63: 174-183. 10.1002/ana.21240
- Becker KG, Barnes KC, Bright TJ, Wang SA: The Genetic Association Database. Nature Genetics. 2004, 36: 431-432. 10.1038/ng0504-431
- Medical Subject Headings (MeSH) database. http://www.nlm.nih.gov/mesh/
- Online Mendelian Inheritance in Man (OMIM) database. http://www.ncbi.nlm.nih.gov/omim
- The GenAge database. http://genomics.senescence.info/genes/
- STRING database. http://string-db.org/
- Chen JY, Shen C, Sivachenko AY: Mining Alzheimer Disease Relevant Proteins From Integrated Protein Interactome Data. Pac Symp Biocomput. 2006, 367-378.
- Bruford EA, Lush MJ, Wright MW, Sneddon TP, Povey S, Birney E: The HGNC Database in 2008: a resource for the human genome. Nucleic Acids Res. 2008, 36: 445-448. 10.1093/nar/gkm881
- Brain Gene Expression Map. http://www.stjudebgem.org
- Maere S, Heymans K, Kuiper M: BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics. 2005, 21: 3448-3449. 10.1093/bioinformatics/bti551
- Zhu W, Yang L, Du Z: Layered functional network analysis of gene expression in human heart failure. PLoS ONE. 2009, 4: e6288. 10.1371/journal.pone.0006288
- Human Urinary Proteome Database. http://mosaiques-diagnostics.com/
- HUPO plasma proteome project database. http://www.bioinformatics.med.umich.edu/hupo/ppp
- Ferrari LD, Aitken S: Mining housekeeping genes with a Naive Bayes classifier. BMC Genomics. 2006, 7: 277. 10.1186/1471-2164-7-277
- Pan S, Zhu D, Quinn JF, Peskind ER, Montine TJ, Lin B, Goodlett DR, Taylor G, Eng J, Zhang J: A combined dataset of human cerebrospinal fluid proteins identified by multi-dimensional chromatography and tandem mass spectrometry. Proteomics. 2007, 7: 469-473. 10.1002/pmic.200600756