Computational prediction of CRISPR cassettes in gut metagenome samples from Chinese type-2 diabetic patients and healthy controls
© Mangericao et al. 2015
Published: 11 January 2016
CRISPR has been becoming a hot topic as a powerful technique for genome editing for human and other higher organisms. The original CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats coupled with CRISPR-associated proteins) is an important adaptive defence system for prokaryotes that provides resistance against invading elements such as viruses and plasmids. A CRISPR cassette contains short nucleotide sequences called spacers. These unique regions retain a history of the interactions between prokaryotes and their invaders in individual strains and ecosystems. One important ecosystem in the human body is the human gut, a rich habitat populated by a great diversity of microorganisms. Gut microbiomes are important for human physiology and health. Metagenome sequencing has been widely applied for studying the gut microbiomes. Most efforts in metagenome study has been focused on profiling taxa compositions and gene catalogues and identifying their associations with human health. Less attention has been paid to the analysis of the ecosystems of microbiomes themselves especially their CRISPR composition.
We conducted a preliminary analysis of CRISPR sequences in a human gut metagenomic data set of Chinese individuals of type-2 diabetes patients and healthy controls. Applying an available CRISPR-identification algorithm, PILER-CR, we identified 3169 CRISPR cassettes in the data, from which we constructed a set of 1302 unique repeat sequences and 36,709 spacers. A more extensive analysis was made for the CRISPR repeats: these repeats were submitted to a more comprehensive clustering and classification using the web server tool CRISPRmap. All repeats were compared with known CRISPRs in the database CRISPRdb. A total of 784 repeats had matches in the database, and the remaining 518 repeats from our set are potentially novel ones.
The computational analysis of CRISPR composition based contigs of metagenome sequencing data is feasible. It provides an efficient approach for finding potential novel CRISPR arrays and for analysing the ecosystem and history of human microbiomes.
The human body is host to this complex community of symbiotic, pathogenic and commensal microorganisms (microbiome), whose abundance is estimated to exceed the number of human cells by at least an order of magnitude . It has been estimated that the microbes in our bodies collectively make up to 100 trillion cells, tenfold the number of human cells . To understand and exploit the impact of the microbes on human health and wellbeing, it is necessary to interpret the content, diversity and function of the microbial community. Metagenomic sequencing has been proven to be a powerful tool for analysing complex microbial communities. And in this context many projects have been developed all over the world to study human microbiomes of multiple body sites. For example, the US-based Human Microbiome Project (HMP)  and the EU-based MetaHIT project  have generated resources that can enable the comprehensive characterization of the human microbiome and analysis of its role in human health and disease.
Among all body sites, the diversity of microorganisms in the human gut is known to be among the highest . This community has been discovered to be associated with human physiology through processes related to development, nutrition, immunity, and resistance to pathogens –. While bacteria are responsible for these functions and most research has focused on the interaction of bacteria with the host, bacteriophages, in turn, influence the composition and abundances of bacteria in the human gut , .
Bacteria and Archaea have evolved defence and regulatory mechanisms to cope with various environmental stressors especially virus attacks. The understanding on this arsenal has been expanded by the discovery of the versatile CRISPR-Cas system. Bacteria can remember their viral invaders by sampling short DNA sequences, known as protospacers, from the genetic materials of viruses or phage. These sequences become integrated into the bacterium’s own DNA, specifically into an array of repeat sequences called clustered regularly interspaced short palindromic repeats (CRISPR). The integrated sequences are called spacers . When these sequences are transcribed and processed into small RNAs, they guide a multifunctional protein complex (Cas proteins – CRISPR associated proteins) to recognize and cleave incoming foreign genetic material . The diversity of Cas genes suggests that multiple pathways have been developed to use the basic information contained in the CRISPR cassettes in diverse defence mechanisms . This adaptive immunity system was first observed in Escherichia coli in 1987, although its significance was not straightaway apparent. Since then, CRISPR arrays have been identified in approximately 40 % of Bacteria and 90 % of Archaea .
CRISPR cassettes were already characterized across human body sites in different individuals in independent projects – and as a part of the Human Microbiome Project (HMP)  with particular focus in the gut metagenome , , . So far there have been two main approaches for the study of CRISPR in metagenomic samples: one focuses more on the analysis of spacers in raw reads, which are then used to search for CRISPR cassettes ; the other is based on the reconstruction of CRISPR arrays, where direct repeat consensus sequences from known CRISPR types identified in reference genomes are used to recruit reads from the dataset and then be assembled into CRISPR loci . For the present work, and since we didn’t have access to raw reads, we decided to follow a third approach, which is based on the identification of CRISPR cassettes in assembled contigs/scaffolds . This approach differentiates from the previous ones because it allows finding novel CRISPR cassettes, as de novo prediction of CRISPRs relies on sequence features of CRISPRs that don’t exist in short reads. Partially following the strategy of Gogleva et al , we analysed the CRISPR composition of the metagenomic data of a set of Chinese individuals . The findings were compared with those on the 3 different datasets used in Gogleva et al .
Materials and methods
We used the gut metagenome data of a set of Chinese individuals published in . The data contain 145 individuals, with their gut microbiota sequenced using whole genome sequencing . The samples include 71 type-2 diabetic individuals and 74 non-diabetic individuals used as controls. Individual metagenomes were assembled in contigs with the average size of 10,687 bp. The the total length of contigs comprised 15.96 Gb. More information about the samples are provided in Additional file 1.
Identification and analysis of CRISPR cassettes
PILER-CR algorithm parameters
Minimum length of a repeat.
Maximum length of a repeat.
Minimum spacer length.
Minimum number of repeats in an array/cassette.
Maximum spacer length.
Repeat length ratio
Spacer length ratio
A run of minarray repeats with at least mincons similarity is required for an array to be reported. The value is specified as fractional identity 0.0… 1.0, with 1.0 meaning identical sequences.
Repeat clustering and the CRISPRmap tool
Repeat consensus sequences were submitted to the standard BLASTClust procedure to form repeat clusters (parameters: -L 0.5 -S 50 -e F -p F -W 15; L is length coverage threshold, S is bit score coverage, default is 1.75) . BLASTClust clusters nucleotide sequences based on pairwise matches found using the MegaBLAST algorithm for DNA/nucleotides. A single MegaBLAST search is performed for all the sequences combined against a database created from the same sequences. BLASTClust finds pairs of sequences that have statistically significant matches and clusters them using single-linkage clustering . We used a first BLASTClust procedure as a filtering step to find identical repeats and construct a set of unique repeat sequences. And a second BLASTClust procedure was used to form clusters between unique repeat sequences previously found.
Another tool used for repeat clustering and analysis was CRISPRmap . This tool allows CRISPR repeats to be divided into types as part of a comprehensive independent clustering analysis to determine conserved sequence families, potential structure motifs for endoribonucleases, and evolutionary relationships . The repeats are divided among 6 superclasses, named A to F, that comprise 24 families and 18 structural motifs.
Repeat BLAST to CRISPR database
Unique repeat consensus sequences were matched against the CRISPR database CRISPRdb  (http://crispr.u-psud.fr/crispr/BLAST/CRISPRsBlast.php) using a standard BLAST with an e-value threshold of 0.01, in order to find hits for known repeats in our findings and to access the possibility of new repeat sequences. We added a step to remove the matches that have continuous indels of more than 5 nucleotides at either ends of the query or target sequences.
Results and discussions
Characteristics of CRISPR
We used data from the Chinese type-2 diabetes gut microbiota study  to search for CRISPR cassettes in gut metagenome sequences using PILER-CR with a similar pipeline as in Gogleva et al . In the original pipeline by Gogleva et. al. , they also applied both the software CRT (CRISPR Recognition Tool)  and CRISPR Finder  to the dataset in order to find CRISPR cassettes. However, there are some issues in using these software: CRISPR Finder is an online tool and it’s not available for download. Trying to upload large data to the website makes the processing time too long; for CRT, on the other hand, although it is available for offline usage, it also has some usage problems since it considers scaffolds of each individual as connected to each other as a single sequence, which is incorrect and can cause unreliable results. In further research, we should find a way for the program to consider each scaffold individually and make CRT results comparable to PILER-CR results. But for the current project, we chose only to use PILER-CR for this step.
Applying PILER-CR with default parameters, we found a total of 3169 candidate CRISPR cassettes. Given the said conditions, a filtering system was not applied, and all the CRISPR cassettes found were kept for further analysis. Among the total number of CRISPRs found, 1624 cassettes belong to the 74 individuals from the control group and the remaining 1545 cassettes belong to the 71 individuals with type-2 diabetes. CRISPR cassettes have been predicted in all single individuals. The full set of identified cassettes is shown and characterized in Additional file 2.
Summary of the results’ comparison between the set of CRISPR cassetes found for type-2 diabetic individuals and healthy individuals
Metagenomic dataset group
Type-2 diabetic individuals
Healthy (non-diabetic) individuals
Identified by PILER-CR
Average Length (in bp)
Average size (in bp)
Comparing our findings with the ones from Gogleva et al.  on gut samples of from 124 European individuals, downloaded as an assembly of 1,889,651 contigs with a total size of 3732 Mb (http://public.genomics.org.cn/BGI/gutmeta/UniSet/), the most similar dataset to the one we used in terms of data size, we found that the number of CRISPR cassettes predicted by PILER-CR in our work was larger. They initially predicted 121 CRISPR cassettes for 124 individuals. After the filtering process, their reliable set of CRISPR had a total of 78 cassettes including 378 spacers and 74 unique repeats. According to Gogleva et al.’s results from the 3 software they used, PILER-CR presented the least number of predicted CRISPR cassettes, which may be explained by its high specificity considering genomic repeats and low-complexity regions as a candidate CRISPR cassette, a process followed by several refinement steps , .
Summary of the results’ comparison between the CRISPR analysis in this present work and Gogleva et al
In Gogleva et al., 2014
The present work
Identified by PILER-CR
Repeat clustering with CRISPRmap
After collecting the CRISPR cassettes that resulted from the PILER-CR software, we applied a standard BLASTClust procedure and also submitted our data to the CRISPRmap web server. BLASTClust was used in the first phase to find and construct a set of unique repeat sequences. In this work, an ID number or the denomination of “RepeatCluster” was attributed to the repeats. The latter refers to the repeat sequences that are identical and shared by more than two CRISPR arrays. As stated above, the set of unique repeats was composed by a total of 1302 repeat sequences. The set of unique repeats is available in Additional file 3A.
In the second phase, we constructed repeat clusters based on sequence similarity. See Additional file 3B. We consider groups of two or more repeats as a cluster. In this way 121 repeat clusters were formed. The majority of the clusters (68 out of 121) are composed by two repeats. Another 37 clusters contained three or four repeats. The biggest cluster contained 12 repeats. Since BLASTClust can only cluster sequences of high similarity (few mismatches) we consider it can be used as a reliable tool to find identical repeats from an “ocean” of sequences.
CRISPR map clustering of the collection of unique repeat sequences from Chinese individuals’ CRISPR cassettes
Number of repeats
Sequence families associated
The superclasses were ordered according to generally decreasing conservation. Superclass A contains highly conserved CRISPRs repeats at the sequence level, but only a few small structure motifs. Superclasses B and C contain sequence families that roughly correspond to one structure motif each; the same is true for half of superclass D. The other half of superclass D and superclass E contain little sequence conservation, but many small conserved motifs. The bacterial repeats in superclass F are divergent . For the repeats to which a superclass was attributed, 78 of those was also attributed to a sequence family. It’s noted that families are associated with superclasses, and to no repeat was assigned a sequence family but not a superclass.
Repeat BLAST to CRISPRdb database
Our set of unique repeat sequences was then compared to the already known repeats from CRISPR cassettes collected in CRISPRdb  (e-value threshold of 0.01). CRISPRdb is a public database which is updated monthly from newly released genome sequences, to display CRISPRs and to generate dictionaries of spacers and repeats. This database is constructed using CRISPRFinder as a CRISPR identifying software. At the moment, they have in their database a total of 2612 bacterial genomes analysed (last updated 2015-08-05) where they found 1176 strains with convincing CRISPR(s) and 126 out of 150 Archae genomes with convincing CRISPR(s).
From our collected set of repeats, 932 out of the 1302 unique repeats had matches in the CRISPRdb. However, 148 of them have continuous indels at either ends of the query or target sequences. They may be overlapping repeats from different CRISPRs. We took them as unmatched repeats to keep higher confidence in the matched repeats. The repeat sequences that weren’t matched to CRISPRdb can possibly be of novel CRISPR cassettes that have not been reported before. The detailed results are given in Additional file 5. In total, a number of 190 different strains were found. The majority of the matches belonged to Megamonas hypermegale ART12/1, although the number of matches isn’t particularly relevant since it only ends up representing 3 % of the total matches. The second most matched strains, within our repeats, were Bacteroides helcogenes P36-108, Eubacterium rectale M104/1 and Faecalibacterium prausnitzii SL3/3, all strains commonly found in the human gut microbiota , . The last two strains are bacteria that are part of the colonic microbiomes responsible for the fermentation of hexose and pentose sugars . The results obtained may not be completely accurate, but they indicated a large diversity of CRISPR-containing strains present in the human gut microbiome.
Comparison of CRISPR repeats found in different data
We compared the findings from the diabetics patient group and the control group. The diabetics group has 3,777,094 contigs. A total of 1624 repeats were identified, with 790 unique repeats. The control group has 4,262,900 contigs. A total of 1545 repeats were found, with only 600 unique repeats. There are 88 unique repeats shared between the two groups. Additional file 3 shows unique repeats in two groups.
We also compared using BLASTclust the repeats we found with those found by Gogleva et. al.  on the three datasets they used: the European data of 124 individuals, the JPN data of 13 healthy Japanese individuals, and the DG data of 2 healthy adults in the Distal Gut metagenomic project. The latter two data were sequenced with different sequencing platforms. In the DG results, 4 out of 14 unique repeats were clustered into the same group with the unique repeats we found in the Chinese dataset. In the results on the European data, 7 out of 74 unique repeats were clustered with those in our data. In the results on the JPN data, 119 out of 296 are clustered with unique repeats we found in the Chinese data, which means 40 % of the repeats are highly similar with the unique repeats we’ve analysed. This may suggest that the two Asian gut metagenome datasets share higher similarity in terms of CRISPR compositions, but further investigation is needed for a systematic comparison.
We analysed CRISPR content in a human gut metagenomic dataset of Chinese individuals of healthy and type-2 diabetes groups. With some newly released tools for CRISPR identification and post-processing, we were able to profile CRISPR cassettes and their corresponding repeats and spacers in the gut microbiome of this sample set. Comparison with the existing database show that the majority of the identified CRISPRs have been reported in the literature, while there are a quarter of the identifications indicate newly discovered CRISPRs that have not been reported before.
The human gut microbiota is one of the most complicated microbial ecosystems in the human body and has important associations with human health. CRISPR is a major system that microbes use to deal with phage. Therefore analysing the CRISPR composition of a microbiome and of a group of microbiomes can be very informative for understanding the history and function of the microbiome. Also, the microbiota is composed of highly diverse bacteria species that cannot be cultured. Identifying new CRISPRs from metagenome data might also provide an efficient approach for finding possible novel CRISPRs that may be used for genome editing applications.
The collection of spacers and repeats constructed in this work constitutes a base for further studies. The preliminary comparison between CRISPRs found in different groups of data and between different studies showed interesting observations. As multiple gut metagenome datasets have been published in the projects like HMP, MetaHIT and other projects that focus on specific groups of people or specific human diseases, it’ll be interesting and promising to conduct systematic comparative study among different datasets, which may lead to better understanding of the forming, shaping, changing and function of gut microbiome populations in different people.
The current strategy for computational profiling of CRISPRs is still preliminary as there has been no stringent control on the possible false discoveries. Some filtering steps may be necessary to reduce false discoveries. For example, Gogleva et al.  suggested three steps to form a reliable set of CRISPR cassettes: (1) Cassettes predicted by more than two CRISPR predicting programs, should be kept. Using more than one algorithm is recommended as they have complementary strengths for precision and recall; (2) Cassettes predicted by less than two programs, i.e. candidate cassettes, which are adjacent to cas genes, are added to the set of cassettes constructed in the first step; (3) candidate cassettes whose repeat consensus sequence are part of a repeat cluster containing repeats from already established as reliable CRISPR cassettes, are added to the set. This strategy can be further calibrated and improved when more data and features are available.
More advanced studies could investigate more on the Cas proteins and their relationship with CRISPR array repeats, for better understanding of the defence mechanism. Methods like CRISPRmap are important for further studies regarding CRISPR-Cas proteins systems. We should also pay more attention to the spacer sequences and, for example, on discovering how this defence mechanism recognizes short sequence motifs, known to be adjacent to the spacer precursor in the invaders genomes.
We thank Hongfei Cui for her help in pre-processing of the data. This project is supported by the National Basic Research Program of China (Project 2012CB316504). The publication costs for this article were partly funded by the Basic Research Fund of Tsinghua National Laboratory for Information Science and Technology. Tatiana Mangericao is supported by the Student Exchange Program between Tsinghua University, China and Instituto Superior Técnico, Portugal.
This article has been published as part of BMC Systems Biology Volume 10 Supplement 1, 2016: Selected articles from the Fourteenth Asia Pacific Bioinformatics Conference (APBC 2016): Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/10/S1.
- Li K, Bihan M, Yooseph S, Methé BA: Analyses of the microbial diversity across the human microbiome. PLoS One. 2012, 7 (6): e32118-10.1371/journal.pone.0032118.View ArticlePubMedPubMed CentralGoogle Scholar
- Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, et al: A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010, 464: 59-65. 10.1038/nature08821.View ArticlePubMedPubMed CentralGoogle Scholar
- The Human Microbiome Project (HMP). http://hmpdacc.org/overview/about.php
- The MetaHit project. http://www.metahit.eu/index.php?id=453
- Conly JM, Stein K, Worobetz L, Rutledge-Harding S: The contribution of vitamin K2 (menaquinones) produced by the intestinal microflora to human nutritional requirements for vitamin K. Am J Gastroenterol. 1994, 89 (6): 915-923.PubMedGoogle Scholar
- Hill MJ: Intestinal flora and endogenous vitamin synthesis. Eur J Cancer Prev. 1997, 6 (2): S43-5. 10.1097/00008469-199703001-00009.View ArticlePubMedGoogle Scholar
- Cummings JH: Microbial digestion of complex carbohydrates in man. Proc Nutr Soc. 1984, 43 (01): 35-44. 10.1079/PNS19840025.View ArticlePubMedGoogle Scholar
- Black DD, Jianhui D, Heng W: Regulation of apolipoprotein secretion by long-chain polyunsaturated fatty acids in newborn swine intestinal epithelial cells. Pediatr Res. 1998, 43: 98-10.1203/00006450-199804001-00582.View ArticleGoogle Scholar
- Backhed F: Host-bacterial mutualism in the human intestine. Science. 2005, 307 (5717): 1915-920. 10.1126/science.1104816.View ArticlePubMedGoogle Scholar
- Paterson S, Vogwill T, Buckling A, Benmayor R, Spiers AJ, Thomson NR, et al: Antagonistic coevolution accelerates molecular evolution. Nature. 2010, 464 (7286): 275-278. 10.1038/nature08798.View ArticlePubMedPubMed CentralGoogle Scholar
- Riley PA: Bacteriophages in autoimmune disease and other inflammatory conditions. Med Hypotheses. 2004, 62 (4): 493-498. 10.1016/j.mehy.2003.12.016.View ArticlePubMedGoogle Scholar
- Yosef I, Qimron U: Microbiology: how bacteria get spacers from invaders. Nature. 2015, 519 (7542): 166-67. 10.1038/nature14204.View ArticlePubMedGoogle Scholar
- Bhaya D, Davison M, Barrangou R: CRISPR-Cas systems in bacteria and archaea: versatile small RNAs for adaptive defense and regulation. Annu Rev Genet. 2011, 45 (1): 273-97. 10.1146/annurev-genet-110410-132430.View ArticlePubMedGoogle Scholar
- Rho M, Wu Y-W, Tang H, Doak TG, Ye Y: Diverse CRISPRs evolving in human microbiomes. PLoS Genet. 2012, 8 (6): e1002441-10.1371/journal.pgen.1002441.View ArticlePubMedPubMed CentralGoogle Scholar
- Sorek R, Kunin V, Hugenholtz P: CRISPR–a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat Rev Microbiol. 2008, 6: 181-186. 10.1038/nrmicro1793.View ArticlePubMedGoogle Scholar
- Pride DT, Sun CL, Salzman J, Rao N, Loomer P, Armitage GC, et al: Analysis of streptococcal CRISPRs from human saliva reveals substantial sequence diversity within and between subjects over time. Genome Res. 2011, 21 (1): 126-136. 10.1101/gr.111732.110.View ArticlePubMedPubMed CentralGoogle Scholar
- Pride DT, Salzman J, Relman DA: Comparisons of clustered regularly interspaced short palindromic repeats and viromes in human saliva reveal bacterial adaptations to salivary viruses. Environ Microbiol. 2012, 14 (9): 2564-2576. 10.1111/j.1462-2920.2012.02775.x.View ArticlePubMedPubMed CentralGoogle Scholar
- Robles-Sikisaka R, Ly M, Boehm T, Naidu M, Salzman J, Pride DT: Association between living environment and human oral viral ecology. ISME J. 2013, 7 (9): 1710-1724. 10.1038/ismej.2013.63.View ArticlePubMedPubMed CentralGoogle Scholar
- Stern A, Mick E, Tirosh I, Sagy O, Sorek R: CRISPR targeting reveals a reservoir of common phages associated with the human gut microbiome. Genome Res. 2012, 22 (10): 1985-1994. 10.1101/gr.138297.112.View ArticlePubMedPubMed CentralGoogle Scholar
- Mick E, Stern A, Sorek R: Holding a grudge: persisting anti-phage CRISPR immunity in multiple human gut microbiomes. RNA Biol. 2013, 10 (5): 900-906. 10.4161/rna.23929.View ArticlePubMedPubMed CentralGoogle Scholar
- Gogleva AA, Gelfand MS, Artamonova II: Comparative analysis of CRISPR cassettes from the human gut metagenomic contigs. BMC Genomics. 2014, 15: 202-10.1186/1471-2164-15-202.View ArticlePubMedPubMed CentralGoogle Scholar
- Qin J, Li Y, Cai Z, Li S, Zhu J, Zhang F, et al: A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature. 2012, 490 (7418): 55-60. 10.1038/nature11450.View ArticlePubMedGoogle Scholar
- Edgar RC, Myers EW: PILER: Identification and classification of genomic repeats. Bioinformatics. 2005, 21 (Suppl. 1): 152-158. 10.1093/bioinformatics/bti1003.View ArticleGoogle Scholar
- BLASTClust tool. http://www.animalgenome.org/blast/doc/blastclust.html
- Lange SJ, Alkhnbashi OS, Rose D, Will S, Backofen R: CRISPRmap: an automated classification of repeat conservation in prokaryotic adaptive immune systems. Nucleic Acids Res. 2013, 41 (17): 8034-044. 10.1093/nar/gkt606.View ArticlePubMedPubMed CentralGoogle Scholar
- Grissa I, Vergnaud G, Pourcel C: The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats. BMC Bioinformatics. 2007, 8 (1): 172-10.1186/1471-2105-8-172.View ArticlePubMedPubMed CentralGoogle Scholar
- Bland C, Ramsey TL, Sabree F, Lowe M, Brown K, Kyrpides NC, et al: CRISPR recognition tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics. 2007, 8 (1): 209-10.1186/1471-2105-8-209.View ArticlePubMedPubMed CentralGoogle Scholar
- CRISPR Finder online tool. http://crispr.u-psud.fr/Server/CRISPRfinder.php
- Sun J, Chang EB: Exploring gut microbes in human health and disease: pushing the envelope. Genes Dis. 2014, 1 (2): 132-39. 10.1016/j.gendis.2014.08.001.View ArticlePubMedPubMed CentralGoogle Scholar
- The normal bacterial flora of humans. Online textbook. http://textbookofbacteriology.net/normalflora.html
- Russell WR, Hoyles L, Flint HJ, Dumas M-E: Colonic bacterial metabolites and human health. Curr Opin Microbiol. 2013, 16 (3): 246-54. 10.1016/j.mib.2013.07.002.View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.