KF-finder: identification of key factors from host-microbial networks in cervical cancer
https://doi.org/10.1186/s12918-018-0566-x
© The Author(s) 2018
- Published: 24 April 2018
Abstract
Background
The human body is colonized by a vast number of microbes. Microbiota can benefit many normal life processes, but can also cause disease by interfering with regular metabolism and the immune system. Recent studies have demonstrated that the microbial community is closely associated with various types of cell carcinoma. The search for key factors, also referred to as cancer-causing agents, can provide important clues for understanding the regulatory mechanism of microbiota in cancer of the uterine cervix.
Results
In this paper, we investigated microbiota composition and gene expression data for 58 squamous and adenosquamous cell carcinoma samples. A host-microbial covariance network, consisting of 259 abundant microbes and 738 differentially expressed genes (DEGs), was constructed from the 16s rRNA and gene expression data of the samples. To search for risk factors in host-microbial networks, bi-partite betweenness centrality (BpBC) was used to measure the risk that a given node poses to a certain biological process in the host. A web-based tool, KF-finder, was developed to efficiently query and visualize the knowledge of microbiota and DEGs in the network.
Conclusions
Our results suggest that prevotellaceae, tissierellaceae and fusobacteriaceae are the most abundant microbes in cervical carcinoma, and that the microbial community in cervical cancer is less diverse than that of other body sites in healthy subjects. A set of key risk factors, namely anaerococcus, hydrogenophilaceae, eubacterium, PSMB10, KCNIP1 and KRT13, has been identified; these are thought to be involved in the regulation of viral response, cell cycle and epithelial cell differentiation in cervical cancer. These findings suggest that permanent changes in microbiota composition could be a major force for chromosomal instability, which subsequently enables the effect of key risk factors in cancer. All results described in this paper can be freely accessed from our website at http://www.nwpu-bioinformatics.com/KF-finder/.
Keywords
- 16s rRNA
- Host-microbial network
- Cervical carcinoma
Background
Cervical cancer is the second most common cancer in women [1]. Over 500,000 women worldwide die of cervical cancer each year [2]. It is known that persistent human papillomavirus (HPV) infection is one of the major causes of cervical carcinoma. HPV-16 or HPV-18 is found in more than 70% of cases [3–5]. These oncogenic HPVs are also common risk factors in other cancers, such as head and neck cancers [6]. However, gaps remain in our knowledge of cervical cancer, in particular why HPV infection is necessary but not sufficient to cause cell carcinoma [1, 7].
Thanks to the advent of high-throughput technologies, researchers are able to analyze cervical carcinogenesis at the genomic level using sequencing data [8]. Genome-wide association studies and subsequent meta-analyses showed that differentially expressed genes (DEGs) in cervical cancer are more likely to be located in regions of frequent chromosomal aberration [9–12]. This indicates that cancer may be strongly associated with chromosomal instability [13]. A recent study suggests that microbiota might play important roles in the development of cervical cancer [14]. Microbial diversity differs significantly between HPV-negative women with no cervical lesion (NCL) and those with cervical cancer. Further, compared with the microbial community in NCL HPV-negative women, that in cervical cancer samples shows higher within-group variation. All these findings imply that the cervical microbiota is an important clue in research on cervical cancer pathology. To understand how the microbial community interacts with host genes and causes cell carcinoma at the molecular level, more and more research groups are working to identify key factors, also known as cancer-causing agents, which can drive the progress of cervical carcinogenesis.
Microbiota is a possible cause of the frequent chromosomal gains and losses. Microbes are abundant in the cervix. They are involved in many of the host’s normal life processes, but can also disrupt the host’s normal gene regulatory network through gene transfer, which may activate oncogene expression and lead to cancer [15]. Therefore, many researchers study how the human microbiota causes structural variation in the human genome and alters the immune and metabolic systems in ways that support the development of cervical pathogenesis [16]. Permanent changes in microbiota may be a major cause of chromosomal instability, which may subsequently disable the tumor suppressor gene retinoblastoma (RB) and tumor protein TP53. Association measures can be used to build a covariance network of microbes and host genes [17]. Host-microbial networks provide a systematic way to study the regulatory system between microbiota and host genes [18]. However, the role of the host response to changes in the microbiome in cervical cancer is still unknown, and only a few public tools are specifically designed for analyzing host-microbial networks [19–21]. Therefore, there is a pressing demand for fast and efficient computational tools to examine how microbiota regulate gene expression, chromosomal instability and cell carcinoma.
To address these limitations, we propose a new computational framework to identify key risk factors using 16s rRNA and gene expression data from 58 squamous and adenosquamous cell carcinoma samples of the uterine cervix. A series of meta-analyses was performed, including error correction, Spearman rank correlation, differential expression analysis, and bi-partite betweenness centrality. A web-based tool, KF-finder, was developed to give users a fast and easy way to query and visualize the knowledge of microbiota and genes in cervical cancer. Further, a set of novel risk factors was identified, which may offer helpful suggestions to researchers focusing on drug design and pharmacology.
Methods
In order to investigate gene expression and microbiome composition in cervical cancer, we collected 133 squamous and adenosquamous cell carcinoma samples, 58 out of which were used for microbial DNA library preparation. The 16s rRNA sequencing was performed using Illumina MiSeq. Human gene expression was quantified using WG-6 BeadArray.
OTU assignment
Each 16s sequence was assigned to an operational taxonomic unit (OTU). To count the number of reads for each OTU (microbe), 16s sequences obtained from MiSeq were aligned to the reference Greengenes OTU builds. The Qiime script assign_taxonomy.py (see http://qiime.org/scripts/assign_taxonomy.html) was used for this step. Reference sequences are pre-assigned to OTUs as described in the id_to_taxonomy file. Several assignment methods, such as uclust, SortMeRNA, blast, RDP and Mothur, can be called by the assign_taxonomy script to align the 16s sequences against the reference sequences. For example, by default the script assigns taxonomy with the uclust consensus taxonomy assigner, using the following command: assign_taxonomy.py -i repr_set_seqs.fasta -r ref_seq_set.fna -t id_to_taxonomy.txt. The OTU redundancy matrix was normalized by the number of sequences in each sample. Since less abundant microbes are unlikely to be a destructive force on the host immune system, we selected the 259 most abundant OTUs for further study.
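The normalization and selection step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the read counts are held as a nested dictionary (sample to OTU to reads), that normalization means dividing by the total reads of each sample, and that OTUs are ranked by mean relative abundance across samples. The function names and the toy counts are hypothetical.

```python
from typing import Dict, List


def normalize_counts(counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Convert raw read counts (sample -> OTU -> reads) to relative abundances."""
    normalized = {}
    for sample, otu_counts in counts.items():
        total = sum(otu_counts.values())
        # Divide by the number of sequences in the sample; guard against empty samples.
        normalized[sample] = {
            otu: (n / total if total else 0.0) for otu, n in otu_counts.items()
        }
    return normalized


def top_abundant_otus(rel_abund: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Rank OTUs by mean relative abundance across samples and keep the top k.

    Ranking by the mean is an assumption made for this sketch.
    """
    n_samples = len(rel_abund)
    totals: Dict[str, float] = {}
    for otu_fracs in rel_abund.values():
        for otu, frac in otu_fracs.items():
            totals[otu] = totals.get(otu, 0.0) + frac
    # Sort by descending mean abundance, breaking ties by OTU name for determinism.
    ranked = sorted(totals, key=lambda otu: (-totals[otu] / n_samples, otu))
    return ranked[:k]


# Toy data (hypothetical, not taken from the study).
toy_counts = {
    "sample_1": {"otu_a": 60, "otu_b": 30, "otu_c": 10},
    "sample_2": {"otu_a": 20, "otu_b": 70, "otu_c": 10},
}
rel = normalize_counts(toy_counts)
top2 = top_abundant_otus(rel, k=2)
print(top2)
```

In the setting of this paper, `k` would be 259; the toy example uses `k=2` only to keep the output readable.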
Comparison with the controls
To study the differences in microbiota between cancer cases and controls, we compared our raw 16s data with data from 300 healthy human subjects released by the Human Microbiome Project (HMP) [22] (http://www.hmpdacc.org). To map OTUs in our data to OTUs in the healthy data, the commonly used alignment tool blastn was run to compare their representative sequences. Pairs with evalue < 1e-5 and pident > 80% were used to establish the map. OTUs that matched the same HMP OTU were collapsed into one OTU. Qiime scripts were used to analyze the raw 16s data [23].
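The filtering and collapsing rule above can be illustrated with a short sketch. It assumes the blastn hits are available in the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). Keeping only the highest-bitscore hit per query is an assumption of this sketch; the text does not say how multiple passing hits were resolved. All identifiers in the toy input are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def map_otus_to_reference(
    blast_lines: Iterable[str],
    max_evalue: float = 1e-5,
    min_pident: float = 80.0,
) -> Dict[str, str]:
    """Map each query OTU to one reference OTU using thresholded BLAST hits."""
    best = {}  # query id -> (bitscore, subject id)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        # Thresholds from the text: evalue < 1e-5 and pident > 80%.
        if evalue < max_evalue and pident > min_pident:
            # Assumption: among passing hits, keep the one with the best bitscore.
            if query not in best or bitscore > best[query][0]:
                best[query] = (bitscore, subject)
    return {query: subject for query, (_, subject) in best.items()}


def collapse_by_reference(mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Group query OTUs that matched the same reference OTU."""
    groups = defaultdict(list)
    for query, subject in mapping.items():
        groups[subject].append(query)
    return {subject: sorted(queries) for subject, queries in groups.items()}


# Toy hits (hypothetical identifiers and scores).
toy_hits = [
    "otu_1\thmp_A\t97.5\t250\t6\t0\t1\t250\t1\t250\t1e-120\t450",
    "otu_2\thmp_A\t91.0\t250\t22\t0\t1\t250\t1\t250\t1e-90\t380",
    "otu_3\thmp_B\t78.0\t250\t55\t0\t1\t250\t1\t250\t1e-20\t150",  # fails pident
    "otu_1\thmp_C\t85.0\t250\t37\t0\t1\t250\t1\t250\t1e-50\t300",  # weaker hit
]
mapping = map_otus_to_reference(toy_hits)
collapsed = collapse_by_reference(mapping)
print(collapsed)
```

Here `otu_1` and `otu_2` both map to `hmp_A` and would therefore be collapsed into a single OTU, while `otu_3` is left unmapped because its identity is below the threshold.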
Calculation of correlation
Abundant microbes and DEGs were selected for reconstructing the host-microbial network. DEGs in cervical cancer were collected from published data [9] and had been verified in five cohorts of tumor and normal samples. Hence, these DEGs are more reliable than those obtained from a single cohort. The Spearman rank correlation method was employed to calculate the correlation between each pair of nodes. Note that the gene expression data and 16s rRNA data were measured on the same samples, so the Spearman correlations in the network are computed over matched observations. In contrast to Pearson correlation, the Spearman correlation coefficient is less sensitive to environmental noise and experimental errors arising from non-uniform samples.
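As an illustration of the correlation step, the sketch below computes the Spearman coefficient as the Pearson correlation of average ranks, using only the standard library. The two vectors stand for one microbe's abundance and one gene's expression measured on the same five samples; the values are invented for the example, and returning NaN for a constant vector is a choice made here, not something stated in the paper.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; NaN if either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation between two equally long vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements on the same five samples.
microbe_abundance = [0.10, 0.25, 0.05, 0.40, 0.20]
gene_expression = [5.1, 7.9, 4.0, 30.0, 6.5]  # one sample is an outlier

rho = spearman(microbe_abundance, gene_expression)
r = pearson(microbe_abundance, gene_expression)
print(rho, r)
```

Because the two toy vectors are perfectly monotonically related, `rho` is 1.0 even though the outlying expression value pulls the Pearson coefficient `r` below 1, which is the kind of robustness the paragraph above refers to.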
Error correction
A workflow of the reconstruction of the host-microbial network. Through the comparison between 16s rRNA and HMP data, each sequence was mapped to an operational taxonomic unit (OTU). Error correction was performed for false positive and false negative nodes, which were detected according to the coherence of regulation and correlation
Bi-partite betweenness centrality
To search for risk factors in the host-microbial network, bi-partite betweenness centrality (BpBC) [24], adapted from betweenness centrality, was used to quantify the risk of a given node v, written as \(g(v)\). It is defined as \(g(v)=\sum _{s,t}\delta _{st}(v)/\delta _{st}\). Here, s and t are two nodes from two separate sub-networks, \(\delta _{st}\) is the number of shortest paths from s to t, and \(\delta _{st}(v)\) is the number of those shortest paths that pass through node v. For a given node v, g(v) reflects how likely a shortest path from one sub-network to the other is to pass through v.
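To make the definition concrete, the following sketch computes g(v) on a small unweighted, undirected toy network using breadth-first search. It is an illustration under stated assumptions rather than the KF-finder implementation: edges are treated as unweighted, the endpoints s and t are not counted as intermediate nodes, and the node names and edges are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_counts(adj: Dict[str, Set[str]], source: str):
    """Return BFS distances and shortest-path counts from source."""
    dist = {source: 0}
    n_paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                n_paths[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to w.
                n_paths[w] += n_paths[u]
    return dist, n_paths


def bpbc(adj: Dict[str, Set[str]], group_s: List[str], group_t: List[str]) -> Dict[str, float]:
    """g(v) = sum over s in group_s, t in group_t of delta_st(v) / delta_st."""
    info = {v: bfs_counts(adj, v) for v in adj}
    score = {v: 0.0 for v in adj}
    for s in group_s:
        if s not in info:
            continue
        dist_s, paths_s = info[s]
        for t in group_t:
            if t == s or t not in dist_s:
                continue  # skip identical or disconnected pairs
            for v in adj:
                if v == s or v == t:
                    continue  # assumption: endpoints are not intermediates
                dist_v, paths_v = info[v]
                # v lies on a shortest s-t path iff d(s,v) + d(v,t) == d(s,t).
                if v in dist_s and t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    score[v] += paths_s[v] * paths_v[t] / paths_s[t]
    return score


# Toy network (hypothetical): two microbes, two process genes, two connectors.
toy_edges = [
    ("microbe_1", "hub"), ("microbe_2", "hub"),
    ("hub", "gene_x"), ("hub", "gene_y"),
    ("microbe_2", "alt"), ("alt", "gene_y"),
]
adj = build_adjacency(toy_edges)
scores = bpbc(adj, ["microbe_1", "microbe_2"], ["gene_x", "gene_y"])
print(scores["hub"], scores["alt"])  # 3.5 0.5
```

In the toy network, `hub` lies on every shortest path for three of the four microbe-gene pairs and on half of the paths for the fourth, so it scores 3.5, while `alt` carries only half of one pair and scores 0.5. Ranking nodes by this score is what singles out candidate key factors.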
Results and discussion
Composition of the microbiota
The microbial community in cervical carcinoma. Each 16s rRNA sequence was assigned to an operational taxonomic unit (OTU), and all these sequences were grouped into different categories based on their family-level OTU labels
Principal Coordinates Analysis (PCoA) plot of microbial community for samples from cervical carcinoma, skin, mouth and vagina. The red, green, orange and blue dots represent samples from cervical carcinoma, skin, mouth and vagina, respectively
Reconstruction of host-microbial network
An illustration of the host-microbial network. Nodes represent differentially expressed genes (DEGs) or abundant microbes, and edges represent the regulation relationships between DEGs and microbes. Nodes in pink are up-regulated, and those in cyan are down-regulated. Edges in grey indicate positive correlation, and those in green indicate negative correlation
Risk factors in cervical cancer
Risk factors in cancer may activate oncogene expression and cause a series of functional disorders in the metabolic and immune systems. In the development of cancer, the most remarkable differences between tumor and normal samples are: 1) the up-regulation of viral responses; 2) the acceleration of cell cycle progression; 3) the inhibition of epithelial cell differentiation. To study how microbiota regulate the viral response, cell cycle and epithelial cell differentiation, we searched for key risk factors using BpBC. These key factors are thought to be cancer-causing agents that can drive the progress of cervical carcinogenesis. Nodes that mediate communication between two cancer-related groups are more likely to be key factors. Since BpBC evaluates the importance of a node in the network topology in exactly this sense, we chose the nodes with the highest BpBC values as candidate key factors. Key factors with high BpBC values may play crucial roles in the communication between two different sub-networks.
Risk factors in host-microbial network in cervical cancer. The BpBC value of each node was calculated for three pairs of different sub-networks, including microbe-antivirus, microbe-cell cycle and microbe-epithelial cell differentiation
Query and visualization
A graphic view of the induced sub-network of CYP2A7. The sub-network includes the interactions between CYP2A7 and its neighbors, as well as the interactions among its neighbors
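The induced sub-network described in this caption can be expressed as a small query over an edge list. The sketch below is only an illustration of that definition; it is not the KF-finder code, and the edges around CYP2A7 are invented placeholders rather than edges from the actual network.

```python
from typing import Iterable, List, Set, Tuple


def induced_neighborhood(
    edges: Iterable[Tuple[str, str]], center: str
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Return the center node, its direct neighbors, and all edges among them."""
    edges = list(edges)
    neighbors: Set[str] = set()
    for u, v in edges:
        if u == center:
            neighbors.add(v)
        elif v == center:
            neighbors.add(u)
    nodes = neighbors | {center}
    # Keep every edge whose two endpoints both fall inside the node set.
    sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return nodes, sub_edges


# Hypothetical edges, used only to show the shape of the query.
toy_edges = [
    ("CYP2A7", "otu_1"),
    ("CYP2A7", "otu_2"),
    ("otu_1", "otu_2"),
    ("otu_2", "otu_3"),
    ("otu_3", "gene_z"),
]
nodes, sub_edges = induced_neighborhood(toy_edges, "CYP2A7")
print(sorted(nodes), sub_edges)
```

The edge between `otu_1` and `otu_2` is kept because both are neighbors of the queried node, whereas `otu_3` and `gene_z` fall outside the neighborhood and are dropped.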
A case study of PSMB10 in cervical cancer
Most vertebrates express immunoproteasomes (IPs) that possess three IFN-γ-inducible homologues: PSMB8, PSMB9 and PSMB10. Many studies show that the expression of IP genes, including PSMB10, is up-regulated in most cancer types [26]. IP genes can be expressed by non-immune cells, and differential cleavage of transcription factors by IPs has pleiotropic effects on cell function. Indeed, IPs modulate the abundance of transcription factors that regulate signaling pathways with prominent roles in cell differentiation, inflammation and neoplastic transformation (e.g., NF-kB, IFNs, STATs and Wnt) [27]. This evidence supports PSMB10 as a risk factor involved in the antiviral response of cervical cancer.
A case study of KRT13 in cervical cancer
KRT13 (keratin 13), also known as K13 and CK13, is located on human chromosome 17q21.2. It is down-regulated in cervical carcinoma and is a risk factor involved in the progress of uncontrolled epithelial cell differentiation. Previous work suggests that the loss of K13 or low K13 mRNA expression is associated with invasive oral squamous cell carcinoma (OSCC) [28, 29]. Epigenetic alteration of K13 is one major cause of K13 inhibition in OSCC. In addition, K13 has been reported to play a directive role in bone, brain and soft tissue metastases of prostate cancer [30]. More than 1000 single nucleotide polymorphisms of K13 are recorded in the dbSNP database. In total, 51 variations in ClinVar mention K13, seven of which are pathogenic. All this evidence suggests that KRT13 is very likely to be a key risk factor in cervical cancer.
Conclusions
In this paper, we examined the microbiota composition and gene expression in 58 squamous and adenosquamous cell carcinoma samples. A host-microbial network was reconstructed from the 16s rRNA and gene expression data. The main contributions of this paper can be summarized in three aspects: (1) the microbial community in cervical carcinoma is less diverse than that of other body sites; (2) a web-based tool, KF-finder, was developed, which enables users to query and visualize host-microbial networks, microbes and differentially expressed genes in a fast and easy way; (3) a set of key risk factors has been identified, which have been associated with cancers in several previous publications. Our results show that six groups of OTUs are abundant in cervical cancer samples: prevotellaceae, tissierellaceae, fusobacteriaceae, porphyromonadaceae, planococcaceae and bacteroidaceae. Besides these six groups of OTUs, we found that three differentially expressed genes and three microbes may be key risk factors and play crucial roles in the pathology of cervical carcinoma. All of these results suggest that permanent changes in microbiota composition might be the key driving force in the pathology of cervical carcinoma, resulting in the abnormality of epithelial cell differentiation, cell cycle and viral response.
Declarations
Acknowledgements
Not applicable.
Funding
This project has been funded by the National Natural Science Foundation of China (Grant No. 61332014 and 61702420); the China Postdoctoral Science Foundation (Grant No. 2017M613203); the Natural Science Foundation of Shaanxi Province (Grant No. 2017JQ6037); the Fundamental Research Funds for the Central Universities (Grant No. 3102015QD013). The publication charges come from the National Natural Science Foundation of China (Grant No. 61702420).
Availability of data and materials
All the results can be found at http://www.nwpu-bioinformatics.com/KF-finder/.
About this supplement
This article has been published as part of BMC Systems Biology Volume 12 Supplement 4, 2018: Selected papers from the 11th International Conference on Systems Biology (ISB 2017). The full contents of the supplement are available online at https://bmcsystbiol.biomedcentral.com/articles/supplements/volume-12-supplement-4.
Authors’ contributions
JH designed the computational framework, performed all the analyses of the data and wrote the manuscript; YG and YZ developed the web-based tool KF-finder to query and visualize the host-microbial network; XS is the major coordinator, who contributed a lot of time and efforts in the discussion of this project. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
Not applicable.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
References
- Roden R, Wu TC. How will HPV vaccines affect cervical cancer? Nat Rev Cancer. 2006; 6(10):753–63.
- Waggoner SE. What is cervical cancer. Lancet. 2003; 361(9376):2217–25.
- Castle PE, Stoler MH, Jr WT, Sharma A, Wright TL, Behrens CM. Performance of carcinogenic human papillomavirus (HPV) testing and HPV16 or HPV18 genotyping for cervical cancer screening of women aged 25 years and older: a subanalysis of the ATHENA study. Lancet Oncol. 2011; 12(9):880.
- Munoz B, Herrero C. Epidemiologic classification of human papillomavirus types associated with cervical cancer. N Engl J Med. 2003; 348(6):518–27.
- Shulzhenko N, Lyng H, Sanson GF, Morgun A. Ménage à trois: an evolutionary interplay between human papillomavirus, a tumor, and a woman. Trends Microbiol. 2014; 22(6):345–53.
- Marur S, D’Souza G, Westra WH, Forastiere AA. HPV-associated head and neck cancer: a virus-related cancer epidemic - a review of epidemiology, biology, virus detection and issues in management. Lancet Oncol. 2010; 11(8):781–9.
- Clayton J. Clinical approval: Trials of an anticancer jab. Nature. 2012; 488(7413):4.
- Chansaenroj J, Theamboonlers A, Junyangdikul P, Swangvaree S, Karalak A, Poovorawan Y. Whole genome analysis of human papillomavirus type 16 multiple infection in cervical cancer patients. Asian Pac J Cancer Prev. 2012; 13(2):599.
- Mine KL, Shulzhenko N, Yambartsev A, Rochman M, Sanson GFO, Lando M, Varma S, Skinner J, Volfovsky N, Deng T. Gene network reconstruction reveals cell cycle and antiviral genes as major drivers of cervical cancer. Nat Commun. 2013; 4(5):1806.
- Peng J, Wang Y, Chen J, Shang X, Shao Y, Xue H. A novel method to measure the semantic similarity of HPO terms. Int J Data Min Bioinforma. 2017; 17(2):173.
- Peng J, Wang H, Lu J, Hui W, Wang Y, Shang X. Identifying term relations cross different gene ontology categories. BMC Bioinformatics. 2017; 18(16):67–74. https://doi.org/10.1186/s12859-017-1959-3.
- Hu J, Shang X. Detection of network motif based on a novel graph canonization algorithm from transcriptional regulation networks. Molecules. 2017; 22(12):2194. https://doi.org/10.3390/molecules22122194.
- Schvartzman JM, Sotillo R, Benezra R. Mitotic chromosomal instability and cancer: mouse modelling of the human disease. Nat Rev Cancer. 2010; 10(2):102–15.
- Audiracchalifour A, Torrespoveda K, Bahenaromán M, Téllezsosa J, Martínezbarnetche J, Cortinaceballos B, Lópezestrada G, Delgadoromero K, Burguetegarcía AI, Cantú D. Cervical microbiome and cytokine profile at various stages of cervical cancer: a pilot study. PLoS ONE. 2016; 11(4):0153274.
- Wein AJ. Re: The microbiome of the urinary tract-a role beyond infection. J Urol. 2015; 194(6):1643–5.
- Kyrgiou M, Mitra A, Moscicki AB. Does the vaginal microbiota play a role in the development of cervical cancer? Transl Res J Lab Clin Med. 2017; 179:168.
- Liu Z-P. Quantifying gene regulatory relationships with association measures: a comparative study. Front Genet. 2017; 8:96. https://doi.org/10.3389/fgene.2017.00096.
- WaltherAntonio MRS, Chen J, Multinu F, Hokenstad A, Distad TJ, Cheek EH, Keeney GL, Creedon DJ, Nelson H, Mariani A. Potential contribution of the uterine microbiome in the development of endometrial cancer. Genome Med. 2016; 8(1):122.
- Molyneaux PL, Willisowen SAG, Cox MJ, James P, Cowman S, Loebinger M, et al. Host-microbial interactions in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med. 2017; 195(12):1640.
- Li Z, Wright AG, Yang Y, Si H, Li G. Unique bacteria community composition and co-occurrence in the milk of different ruminants. Sci Rep. 2017; 7:40950.
- Liu ZP, Wu C, Miao H, Wu H. Regnetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse. Database J Biol Databases Curation. 2015; 2015(224):095.
- Gevers D, Knight R, Petrosino JF, Huang K, Mcguire AL, Birren BW, Nelson KE, White O, Methè BA, Huttenhower C. The human microbiome project: a community resource for the healthy human microbiome. PLoS Biol. 2012; 10(8):1001377.
- Caporaso JG, et al. Qiime allows analysis of high-throughput community sequencing data. Nat Methods. 2010; 7(5):335–41.
- Dong X, Yambartsev A, Ramsey SA, Thomas LD, Shulzhenko N, Morgun A. Reverse engeneering of regulatory networks from big data: a roadmap for biologists. Eprint Arxiv. 2014; 9(9):61–74.
- Desantis TZ, Hugenholtz PL. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol. 2006; 72(7):5069–72.
- Rouette A, Trofimov A, Haberl D, Boucher G, Lavallé VP, D’Angelo G, Hébert J, Sauvageau G, Lemieux S, Perreault C. Expression of immunoproteasome genes is regulated by cell-intrinsic and -extrinsic factors in human cancers. Sci Rep. 2016; 6:34019.
- de Verteuil DA, Rouette A, Hardy MP, Lavallée S, Trofimov A, Gaucher E, Perreault C. Immunoproteasomes shape the transcriptome and regulate the function of dendritic cells. J Immunol. 2014; 193(3):1121–32.
- Farrukh S, Syed S, Pervez S. Differential expression of cytokeratin 13 in non-neoplastic, dysplastic and neoplastic oral mucosa in a high risk Pakistani population. Asian Pac J Cancer Prev. 2015; 16(13):5489–92.
- Hartanto FK, Karen-Ng LP, Vincent-Chong VK, Ismail SM, Mustafa WM, Abraham MT, Tay KK, Zain RB. KRT13, FAIM2 and CYP2W1 mRNA expression in oral squamous cell carcinoma patients with risk habits. Asian Pac J Cancer Prev. 2015; 16(3):953–8.
- Li Q, Yin L, Jones LW, Chu GC, Wu JB, Huang JM, Li Q, You S, Kim J, Lu YT. Keratin 13 expression reprograms bone and brain metastases of human prostate cancer cells. Oncotarget. 2016; 7(51).