- Open Access
A diVIsive Shuffling Approach (VIStA) for gene expression analysis to identify subtypes in Chronic Obstructive Pulmonary Disease
- Jörg Menche†1, 2, 3,
- Amitabh Sharma†1, 2,
- Michael H Cho4,
- Ruth J Mayer5,
- Stephen I Rennard6,
- Bartolome Celli7,
- Bruce E Miller5,
- Nick Locantore8,
- Ruth Tal-Singer5,
- Soumitra Ghosh5,
- Chris Larminie9,
- Glyn Bradley9,
- John H Riley9,
- Alvar Agusti10, 13,
- Edwin K Silverman4 and
- Albert-László Barabási1, 2, 3, 11, 12
© Menche et al; licensee BioMed Central Ltd. 2014
- Published: 13 March 2014
An important step toward understanding the biological mechanisms underlying a complex disease is a refined understanding of its clinical heterogeneity. Relating clinical and molecular differences may allow us to define more specific subtypes of patients that respond differently to therapeutic interventions.
We developed a novel unbiased method called the diVIsive Shuffling Approach (VIStA) that identifies subgroups of patients by maximizing the difference in their gene expression patterns. We tested our algorithm on 140 subjects with Chronic Obstructive Pulmonary Disease (COPD) and found four distinct, biologically and clinically meaningful combinations of clinical characteristics that are associated with large gene expression differences. The dominant characteristic in these combinations was the severity of airflow limitation. Other frequently identified measures included emphysema, fibrinogen levels, phlegm, BMI and age. A pathway analysis of the differentially expressed genes in the identified subtypes suggests that VIStA is capable of capturing specific molecular signatures within each group.
The introduced methodology allowed us to identify combinations of clinical characteristics that correspond to clear gene expression differences. The resulting subtypes for COPD contribute to a better understanding of its heterogeneity.
- Chronic Bronchitis
- gene expression analysis
Chronic obstructive pulmonary disease (COPD) is one of the most prevalent chronic diseases (the 4th leading cause of death globally), with increasing incidence worldwide. Understanding of the disease pathobiology is far from complete, and only a few novel therapeutic mechanisms of action have been identified. Tobacco smoking is the main risk factor for COPD, but only a fraction of all smokers develop the disease. This variable response to smoking, together with the observation that COPD aggregates in families, strongly suggests a genetic component to the disease [2–6]. Yet COPD is a very heterogeneous and complex disease, with varied pulmonary and extra-pulmonary clinical manifestations. Understanding and characterizing this biological and clinical heterogeneity could help identify subgroups of patients (subtypes) that may benefit from different therapeutic strategies.
To investigate the genomic and pathobiological basis of COPD subtypes with distinct clinical manifestations, we applied several novel and complementary computational strategies to differential gene expression analysis. We used expression data from induced sputum samples of former smokers with COPD and varying degrees of airflow limitation. The patients are a subset of the large ECLIPSE cohort, a multi-center, 3-year observational international study that collected clinical, genetic, proteomic and biomarker measures in a population of COPD patients. Specifically, in the current study we sought to: (i) compare the gene expression patterns of patient groups with different clinical characteristics; and (ii) conversely, assess the clinical characteristics of groups of patients with distinct gene expression patterns identified by a novel diVIsive Shuffling Approach (VIStA) developed specifically for this purpose (see below). Unexpectedly, we found that the reverse approach (ii) showed greater potential than the traditional approach (i) to identify specific pathways that may offer novel therapeutic targets.
Study design, participants and ethics
Summary of the characteristics of 140 subjects with sputum gene expression data from the ECLIPSE Cohort.
Demographics and clinical data
65 ± 5.5
Body mass index, kg/m²
26.8 ± 5.2
Smoking exposure, pack-yrs.
48.3 ± 29.1
Annual exacerbation rate, year⁻¹
0.98 ± 1.6
1.26 ± 0.45
FEV1, % revers.
9.5 ± 10.4
43.2 ± 11.5
Emphysema, -950HU %
19.2 ± 12.2
Emphysema, extent code
2.8 ± 1.8
8.24 ± 15.0
7.8 ± 36
9.3 ± 5.2
121.7 ± 46
Fibrinogen (mg/dL)
481.9 ± 107.6
103.2 ± 624
120.6 ± 78
Total cell count, × 10⁶
7.5 ± 1.78
64.8 ± 8.5
3.1 ± 2.04
25.4 ± 7.9
Selection of clinical measures
Summary of the clinical characteristics of COPD patients identified as most relevant by clinical experts.
Continuous Variable for Quantitative Analysis
Differentially expressed genes at FDR < 0.05
Cough with Phlegm for at least 3 mos/yr for at least 2 years
low extreme (Q1 = 64)
neither chronic cough nor chronic phlegm
high extreme (Q4 = 46)
both chronic cough and chronic phlegm
History of Exacerbations
Number of exacerbations per year
2 or more per year and less than 2 per year
low extreme (Q1 = 26)
0 - Never
high extreme (Q4 = 17)
3 - Always
Body Mass Index (kg/m²)
BMI < 21, 21-30, > 30
low extreme (Q1 = 18)
BMI < 21
high extreme (Q4 = 35)
BMI > 30
Airflow Limitation severity
FEV1 (% predicted)
low extreme (Q1 = 69)
< 2-GOLD stage
high extreme (Q4 = 13)
>4 GOLD stage
6 Minute Walk Distance
< 350 meters and > 350 meters
low extreme (Q1 = 38)
high extreme (Q4 = 101)
Emphysema severity category:
low extreme (Q1 = 40)
0-1.5 -No emphysema
Not affected (N): 0
high extreme (Q4 = 45)
4-5 - severe
Trivial (T): 1
Mild (M) 5-25%: 2
Moderate (O) 25-50%: 3
Severe (S) 50-75%: 4
Very Severe (V) > 75%: 5
Emphysema at -950 HU
Emphysema >10% (Yes/No)
low extreme (Q1 = 37)
Emphysema >10% = No
high extreme (Q4 = 95)
Emphysema >10% = yes
CT Airway Disease
Pi10 (Square root of wall area of 10 mm internal perimeter airways)
GOLD Stages 2-4 with Emphysema < 5% (Yes) or > 5% (No)
low extreme (Q1 = 63)
Trivial (< 5%)
high extreme (Q4 = 33)
Severe (50-75%) or very severe (> 75%)
Note that there were no controls with normal lung function among the subjects. Hence, we cannot compare COPD with normal lung function; we can only examine differences among COPD patients.
Relationship between clinical characteristics and gene expression
To investigate the relationships between differences in gene expression and clinical trait occurrence, we used two complementary analyses:
(i) For each of the clinical characteristics introduced above, we divided the patients into two groups based on clinically relevant cut-points (Table 2, column 5) and computed gene expression differences between the two groups. Gene expression analysis was performed using Significance Analysis of Microarrays (SAM) with a false discovery rate (FDR) of 5% as cutoff.
(ii) We used VIStA (see below) to identify groups of patients with maximized differential gene expression and then compared their clinical characteristics.
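The cut-point split used in analysis (i) can be illustrated with a minimal sketch. The function name `split_extremes` and the BMI values below are hypothetical and chosen only for illustration; the cut-points of 21 and 30 kg/m² are the BMI bins listed in Table 2.

```python
def split_extremes(values, low_cut, high_cut):
    """Return (low_ids, high_ids): indices of subjects below low_cut and above high_cut.

    Subjects between the two cut-points are left out of the comparison.
    """
    low = [i for i, v in enumerate(values) if v < low_cut]
    high = [i for i, v in enumerate(values) if v > high_cut]
    return low, high


# Hypothetical BMI values (kg/m^2) for eight subjects; not data from the study.
bmi = [19.5, 24.0, 31.2, 20.4, 27.8, 33.0, 22.1, 30.5]
low, high = split_extremes(bmi, low_cut=21, high_cut=30)
print(low, high)  # [0, 3] [2, 5, 7]
```

The two index lists would then be passed to the differential expression analysis as the two groups to compare.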
diVIsive Shuffling Approach (VIStA)
We developed a novel unbiased method called diVIsive Shuffling Approach (VIStA) to identify groups of patients with maximal difference in gene expression. The algorithm consists of the following steps:
(i) n subjects are randomly partitioned into three groups of comparable size (Figure 1A). A SAM analysis is performed and the number of genes differentially expressed between groups 1 and 2 is counted. Group 3 serves as a "reservoir" of individuals for the subsequent steps.
(ii) An individual from group 1 or 2 is randomly swapped with an individual from the reservoir group 3. We repeat the SAM analysis and count the new number of differentially expressed genes (Figure 1A). If this count increases, the swap is accepted; otherwise it is rejected.
(iii) Step (ii) is iterated until the number of differentially expressed genes reaches a plateau (Figure 1B), typically after approximately 1000 attempted swaps. The corresponding groups 1 and 2 represent a combination of patients with high differential gene expression.
Starting from different random initial configurations, we repeat the whole procedure (i) through (iii) 500 times, resulting in 500 end configurations, each characterized by a large number of differentially expressed genes. To explore the extent to which these 500 subdivisions are clinically relevant and distinct, we analyze them individually for statistically significant differences in clinical characteristics between the members of groups 1 and 2. For each subdivision, we identify the set of clinical characteristics (Table 2) that differ significantly between patients in group 1 and group 2, using a Mann-Whitney U test (significance threshold p-value ≤ 0.05) for all continuous characteristics (e.g. BMI) and Fisher's exact test for binary characteristics (e.g. gender) (Figure 1C). We find that, with the exception of two subdivisions, all of the remaining 498 subdivisions show a statistically significant difference in at least one clinical characteristic. This suggests that the shuffling algorithm does indeed identify biologically or clinically distinct divisions of patients in most cases. The frequency with which individual clinical characteristics appear as significantly different between the two groups can therefore be used to identify the combinations of clinical characteristics that co-determine gene expression differences.
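The per-subdivision comparison of clinical characteristics described above can be sketched with the two tests named in the text. This is an illustrative pure-Python version, not the authors' code: the Mann-Whitney p-value uses the normal approximation with a tie correction (an assumption; the source does not say whether an exact or approximate p-value was used), and all input values are hypothetical.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at most as likely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # mean of the tied ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / sqrt(var)
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical values for one subdivision (not data from the study).
bmi_group1 = [21.0, 22.5, 19.8, 23.1, 20.4, 24.0]
bmi_group2 = [29.5, 31.2, 27.8, 33.0, 30.1, 28.4]
_, p_bmi = mann_whitney_u(bmi_group1, bmi_group2)
p_sex = fisher_exact_two_sided(12, 3, 4, 11)  # rows: group 1/2, columns: male/female
significant = [name for name, p in (("BMI", p_bmi), ("SEX", p_sex)) if p <= 0.05]
print(significant)
```

Collecting the `significant` list for each of the 500 end configurations gives the frequencies of single characteristics and of their combinations.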
Note that the VIStA approach is fundamentally different from clustering techniques like hierarchical or k-means clustering. The latter attempt to identify cohesive groups based on similarity, while VIStA, on the contrary, is a divisive algorithm based on maximizing the differences between groups. Another important difference to standard clustering approaches is that by design VIStA is able to identify a large number of locally optimal divisions.
We use a relatively low confidence cut-off of FDR ≤ 0.1 for the SAM analysis in steps (i) and (ii) in order to facilitate the emergence of an initial "seed" grouping. The results were robust to variation in the exact choice of this cut-off. Within the SAM framework, the FDR is based on a comparison with random permutations, see  for details.
Note that instead of SAM one could also use other approaches to determine the number of differentially expressed genes at each iteration step, for example the p-values of simple t-tests or a minimal fold-change. As VIStA consists of repeated differential expression analyses, the same limitations as for conventional approaches apply regarding the minimal number of subjects and general data quality.
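The three shuffling steps described above can be sketched as a short greedy search. This is a minimal illustration under stated assumptions, not the authors' C implementation: the SAM analysis is replaced by a count of genes whose Welch t statistic exceeds a fixed threshold (`t_cut`, a hypothetical parameter), the number of swaps is fixed instead of being chosen by watching for a plateau, and the expression matrix is simulated toy data.

```python
import random
from statistics import mean, variance


def count_de_genes(expr, g1, g2, t_cut=3.0):
    """Count genes with |Welch t| > t_cut between g1 and g2 (stand-in for SAM).

    expr[gene][subject] holds the expression values.
    """
    count = 0
    for row in expr:
        a = [row[i] for i in g1]
        b = [row[i] for i in g2]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        if se > 0 and abs(mean(a) - mean(b)) / se > t_cut:
            count += 1
    return count


def vista_run(expr, n_subjects, reservoir_size, n_swaps, rng):
    """One run: random three-way partition, then greedy swaps with the reservoir."""
    ids = list(range(n_subjects))
    rng.shuffle(ids)
    half = (n_subjects - reservoir_size) // 2
    g1, g2, g3 = ids[:half], ids[half:2 * half], ids[2 * half:]
    best = count_de_genes(expr, g1, g2)
    for _ in range(n_swaps):
        group = rng.choice((g1, g2))
        i, j = rng.randrange(len(group)), rng.randrange(len(g3))
        group[i], g3[j] = g3[j], group[i]        # tentative swap
        score = count_de_genes(expr, g1, g2)
        if score > best:
            best = score                          # accept: the count increased
        else:
            group[i], g3[j] = g3[j], group[i]     # reject: undo the swap
    return g1, g2, best


# Toy data: 30 subjects, 40 genes; a hidden binary trait shifts the first 10 genes.
rng = random.Random(0)
hidden = [s % 2 for s in range(30)]
expr = [[rng.gauss(2.0 * hidden[s] if g < 10 else 0.0, 1.0) for s in range(30)]
        for g in range(40)]
g1, g2, n_de = vista_run(expr, n_subjects=30, reservoir_size=10, n_swaps=200, rng=rng)
print(len(g1), len(g2), n_de)
```

Because only swaps that strictly increase the count are accepted, each run ends in a local optimum that depends on the random start, which is why many independent runs are needed.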
We implemented a reservoir of 40 subjects (group 3) in order to resemble a gene expression analysis based on extremes, e.g. the 25% of subjects with the lowest BMI vs the 25% of subjects with the highest BMI. In principle, the third group is not strictly necessary, as shuffling can be performed between two groups. Increasing the size of the reservoir group could affect power through selection of more extreme subjects or by reducing the sample size for the differential expression analysis, so whether or not a reservoir is useful will depend on the concrete application.
As detailed below, we find that 500 independent runs of VIStA provided sufficient statistical power for a robust distinction between four different subgroups in this study. Generally, a higher number of independent runs could lead to the discovery of more subtle subgroups. It is important to note, however, that the predictive power of the approach is ultimately limited by the quality and size of the expression data, as well as the clinical characteristics.
The algorithm was implemented in the programming language C. A single run with 2,000 iterations takes around three hours on a standard PC. However, the vast majority of the computing time is used to perform the SAM analysis, so using a simpler technique for the differential gene expression analysis would drastically reduce the execution time if necessary.
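As a rough back-of-the-envelope check of the computational cost, the stated run time can be converted into time per iteration and into total time for the 500 independent runs. This assumes that every run used 2,000 iterations and that the runs were executed one after another; the text does not state either, and independent runs could be distributed over several machines.

```python
run_hours = 3.0      # stated: one run of 2,000 iterations takes about three hours
iterations = 2000
n_runs = 500         # stated number of independent runs

seconds_per_iteration = run_hours * 3600 / iterations
total_cpu_hours = run_hours * n_runs   # assumes every run uses 2,000 iterations

print(seconds_per_iteration)   # 5.4 seconds per SAM analysis
print(total_cpu_hours / 24)    # 62.5 days if the runs are executed serially
```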
Differential gene expression of single clinical characteristics
We first attempted to identify statistically significant gene expression differences between patient groups that differ in a single clinical characteristic. Specifically, we aimed to identify genes that were differentially expressed at FDR < 0.05 using bins of clinical characteristics as presented in Table 2, such as COPD severity, the history of exacerbation or BMI. As shown in Table 2, apart from the severity of airflow limitation as assessed by the GOLD stage, none of the clinical measures identified significant gene expression changes. This failure suggests that these clinical characteristics are not sufficiently discriminative to capture gene expression variation in COPD. We hypothesized that there are indeed potential molecular drivers of disease heterogeneity, but that a single clinical characteristic is unable to capture them. Therefore, we developed an inverse (divisive) clustering methodology to group the 140 COPD patients included in the study based on their gene expression patterns, and then explored the clinical characteristics of the obtained groups (Figure 1).
Combination of clinical traits from VIStA
Figure 2B illustrates how often combinations (pairs) of significant single clinical characteristics (or inflammatory biomarkers) co-occur in the different VIStA subtypes by the width of the links between them. The statistical significance of each co-occurrence (Figure 2C-E) was calculated using a binomial model that assumes independence of the individual characteristics or biomarker levels as the null hypothesis. In order to quantify the extent to which the VIStA outcomes could reflect spurious associations, we also generated 10,000 random divisions of the patients and analyzed how often the individual characteristics and their combinations appear as significant (Figure 2C-E). We find that the divisions obtained by VIStA show a much higher number of significant clinical characteristics than expected by chance, with the exceptions of the biomarkers CCL18, TNFA and SPD and the variables COUGH and SEX. Similarly, combinations of significant characteristics appear more frequently than in randomly assigned divisions.
We observed (Figure 2C) that the pairwise co-occurrences of clinical characteristics and inflammatory biomarkers were dominated by airflow limitation severity (GOLDCD). Other characteristics frequently observed in combinations include emphysema (EMPHETCD or FV950), fibrinogen levels, phlegm, BMI and age. Most pairs appear with the frequency expected under the null hypothesis of independent individual clinical characteristics (see the many non-significant p-values in Figure 2C-E), implying that their association is not significant (e.g. EMPHETCD and GOLDCD). A notable exception is EMPHETCD and FV950, whose statistical association is expected, given that the two variables are not independent but are different measures of the same clinical characteristic (emphysema).
Figure 2D, E shows the observed and expected co-occurrence of triplets and quartets of clinical characteristics and inflammatory biomarkers. The most frequent and significant triplet consists of severity of airflow limitation (GOLDCD) and the two emphysema measures EMPHETCD and FV950. GOLDCD and either one of the emphysema severity measures FV950 or EMPHETCD co-occurred in almost all triplets, which is again expected given their pathobiological relationship in patients with COPD. Figure 2E lists the most frequent combinations of four variables. We find that the most significant combinations are those that include the triplet GOLDCD, FV950 and EMPHETCD, together with one additional variable, the most significant being FIBRINOGEN, BMI, PHLEGM, DWALK and age. In the following, however, we have not considered fibrinogen as the basis for a subtype since it is a biomarker rather than a clinical characteristic.
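The binomial model for pairwise co-occurrence can be illustrated as follows. Under independence, the probability that two characteristics are both significant in a given run is the product of their individual frequencies, and the number of co-occurrences over all runs is binomially distributed. The sketch below uses a one-sided upper tail (an assumption; the text does not specify the tail), and the counts are hypothetical rather than values from Figure 2.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def cooccurrence_p_value(n_runs, n_a, n_b, n_ab):
    """Expected co-occurrence count and p-value under independence of A and B.

    n_a, n_b: runs in which A (resp. B) was significant; n_ab: runs with both.
    """
    p_null = (n_a / n_runs) * (n_b / n_runs)
    return p_null * n_runs, binomial_upper_tail(n_ab, n_runs, p_null)


# Hypothetical counts over 500 runs (not values from the study).
expected, p = cooccurrence_p_value(n_runs=500, n_a=200, n_b=150, n_ab=90)
print(expected, p < 0.001)
```

The same calculation extends to triplets and quartets by multiplying three or four individual frequencies to obtain the null probability.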
Summary of the clinical measures, biomarkers, and cell counts among the four groups of COPD patients identified from the results of Figure 2: each group combines GOLDCD, EMPHETCD and FV950, with either BMI (Group I), DWALK (Group II), AGE (Group III) or Phlegm (Group IV).
Group IA (n = 25)
Group IB (n = 23)
Group IIA (n = 21)
Group IIB (n = 32)
Group IIIA (n = 15)
Group IIIB (n = 28)
Group IVA (n = 20)
Group IVB (n = 26)
FEV1 reversibility (%)
Emphysema at -950 HU
Body Mass Index
Chronic Bronchitis (ATS_CB)
1 = no-CB
1 = 24%
1 = 30.4%
1 = 85.7%
1 = 62.5%
1 = 6.7%
1 = 37%
1 = 46.2%
1 = no chronic phlegm
1 = 56%
1 = 35%
1 = 62%
1 = 41%
1 = 66.6%
1 = 33%
1 = 0%
6 Minute Walk Distance
Exacerbations 0 = no-Exacerbations
0 = 68%
0 = 34.8%
0 = 71.4%
0 = 28.1%
0 = 60%
0 = 37%
0 = 70%
0 = 38.5%
% Fat (Tissue)
Neutrophils, % Neut_Blq
Eosinophils, % Eos blq
Lymphocytes, % lymhblq
To further characterize the subtypes suggested by VIStA, we subdivided the full set of 140 ECLIPSE patients according to the identified clinical characteristics, resulting in 8 groups of 15 to 28 patients. First, we explored a number of clinical, biomarker and cell count measures of the subjects in each group. We find, for example, that serum levels of the biomarkers IL6, IL8 and SPD are significantly higher in group IIIB than in IIIA, a difference that was not observed in other groups. Similarly, the proportions of neutrophils and lymphocytes in sputum were significantly higher in group IIIB than in IIIA (Table 3).
Specific genes & pathways in the groups from VIStA
The 10 most strongly enriched pathways in the set of genes common among all four groups described in Table 3, as well as in the individual gene sets of each group.
Top ten pathways among Common Genes
all pathway genes
Top ten pathways among Group I Genes
all pathway genes
Top ten pathways among Group II Genes
all pathway genes
REACTOME_REGULATION_OF_GENE_EXPRESSION_IN_BETA_CELLS
Top ten pathways among Group III Genes
all pathway genes
Top ten pathways among Group IV Genes
all pathway genes
Top ten upregulated and downregulated unique genes and their fold-change (FC) in each group (In group II, only five unique genes are downregulated).
We have found that with the exception of severity of airflow limitation, categorizing COPD subtypes according to a single clinical characteristic does not yield groups of patients with significant gene expression differences. In this study, we therefore introduced a novel methodology that allowed us to identify combinations of clinical characteristics that correspond to clear gene expression differences.
Our results suggest that while gene expression differences are mainly driven by the severity of airflow limitation and the extent of emphysema, a smaller, yet discriminative contribution is also observed for a set of additional clinical characteristics: BMI, distance walked, age and chronic phlegm production, each defining a subtype of patients. Validation of these groups and the underlying pathways will require replication in a second cohort of COPD subjects. Note that additional differences may also exist for clinical characteristics that have not been considered in the present study.
The observed subgroups with combinations of different clinical characteristics are consistent with the clinical heterogeneity of COPD, where a given patient may manifest more than one measurable feature of COPD, suggesting either that the underlying mechanisms contribute to more than one feature or that multiple mechanisms are maladapted in an individual.
While we focused on COPD in this study, the proposed VIStA method can be more generally applied to any other complex, heterogeneous disease and presents a promising approach to the important problem of disease heterogeneity and subtyping/subgrouping. A better understanding of this problem is invaluable, for example, for improving the selection of patients for evaluating novel agents. To the extent that gene expression reflects genetic and epigenetic variation, the subtypes identified by our method may further suggest different approaches to identifying genetic susceptibility.
This work was partially supported by MapGen grant (1U01HL108630-01), by the EC-FP7 Program, Synergy-COPD, GA n° 270086, as well as by COST-BMBS, Action BM1006 "Next Generation Sequencing Data Analysis Network", SeqAhead.
Principal investigators and centers participating in ECLIPSE (NCT00292552):
Bulgaria: Yavor Ivanov, Pleven; Kosta Kostov, Sofia. Canada: Jean Bourbeau, Montreal, Que; Mark Fitzgerald, Vancouver, BC; Paul Hernandez, Halifax, NS; Kieran Killian, Hamilton, On; Robert Levy, Vancouver, BC; Francois Maltais, Montreal, Que; Denis O'Donnell, Kingston, On. Czech Republic: Jan Krepelka, Praha. Denmark: Jørgen Vestbo, Hvidovre. Netherlands: Emiel Wouters, Horn. New Zealand: Dean Quinn, Wellington. Norway: Per Bakke, Bergen. Slovenia: Mitja Kosnik, Golnik. Spain: Alvar Agusti, Jaume Sauleda, Palma de Mallorca. Ukraine: Yuri Feschenko, Kiev; Vladamir Gavrisyuk, Kiev; Lyudmila Yashina, Kiev; Nadezhda Monogarova, Donetsk. United Kingdom: Peter Calverley, Liverpool; David Lomas, Cambridge; William MacNee, Edinburgh; David Singh, Manchester; Jadwiga Wedzicha, London. United States of America: Antonio Anzueto, San Antonio, TX; Sidney Braman, Providence, RI; Richard Casaburi, Torrance, CA; Bart Celli, Boston, MA; Glenn Giessel, Richmond, VA; Mark Gotfried, Phoenix, AZ; Gary Greenwald, Rancho Mirage, CA; Nicola Hanania, Houston, TX; Don Mahler, Lebanon, NH; Barry Make, Denver, CO; Stephen Rennard, Omaha, NE; Carolyn Rochester, New Haven, CT; Paul Scanlon, Rochester, MN; Dan Schuller, Omaha, NE; Frank Sciurba, Pittsburgh, PA; Amir Sharafkhaneh, Houston, TX; Thomas Siler, St. Charles, MO; Edwin Silverman, Boston, MA; Adam Wanner, Miami, FL; Robert Wise, Baltimore, MD; Richard ZuWallack, Hartford, CT.
Steering Committee: Harvey Coxson (Canada), Lisa Edwards (GlaxoSmithKline, USA), David Lomas (UK), William MacNee (UK), Edwin Silverman (USA), Ruth Tal-Singer (Co-chair, GlaxoSmithKline, USA), Jørgen Vestbo (Co-chair, Denmark), Julie Yates (GlaxoSmithKline, USA).
Scientific Committee: Alvar Agusti (Spain), Peter Calverley (UK), Bartolome Celli (USA), Courtney Crim (GlaxoSmithKline, USA), Gerry Hagan (GlaxoSmithKline, UK), William MacNee (Chair, UK), Bruce Miller (GlaxoSmithKline, USA), Stephen Rennard (USA), Ruth Tal-Singer (GlaxoSmithKline, USA), Emiel Wouters (The Netherlands), Julie Yates (GlaxoSmithKline, USA).
The publication costs for this article were funded by Northeastern University, MA, USA.
This article has been published as part of BMC Systems Biology Volume 8 Supplement 2, 2014: Selected articles from the High-Throughput Omics and Data Integration Workshop. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/8/S2.
- Vestbo J, Hurd SS, Agusti AG, Jones PW, Vogelmeier C, Anzueto A, Barnes PJ, Fabbri LM, Martinez FJ, Nishimura M, Stockley RA, Sin DD, Rodriguez-Roisin R: Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: Gold executive summary. American journal of respiratory and critical care medicine. 2013, 187 (4): 347-65. 10.1164/rccm.201204-0596PP.
- Bhattacharya S, Srisuma S, Demeo DL, Shapiro SD, Bueno R, Silverman EK, Reilly JJ, Mariani TJ: Molecular biomarkers for quantitative and discrete copd phenotypes. American journal of respiratory cell and molecular biology. 2009, 40 (3): 359-67. 10.1165/rcmb.2008-0114OC.
- Singh D, Fox SM, Tal-Singer R, Plumb J, Bates S, Broad P, Riley JH, Celli B: Induced sputum genes associated with spirometric and radiological disease severity in copd ex-smokers. Thorax. 2011, 66 (6): 489-95. 10.1136/thx.2010.153767.
- Pierrou S, Broberg P, O'Donnell RA, Pawlowski K, Virtala R, Lindqvist E, Richter A, Wilson SJ, Angco G, Moller S, Bergstrand H, Koopmann W, Wieslander E, Stromstedt PE, Holgate ST, Davies DE, Lund J, Djukanovic R: Expression of genes involved in oxidative stress responses in airway epithelial cells of smokers with chronic obstructive pulmonary disease. American journal of respiratory and critical care medicine. 2007, 175 (6): 577-86. 10.1164/rccm.200607-931OC.
- DeMeo D, Mariani T, Lange C, Lake S, Litonjua A, Celedon J, Reilly J, Chapman HA, Sparrow D, Spira A, Beane J, Pinto-Plata V, Speizer FE, Shapiro S, Weiss ST, Silverman EK: The serpine2 gene is associated with chronic obstructive pulmonary disease. Proceedings of the American Thoracic Society. 2006, 3 (6): 502. 10.1513/pats.200603-070MS.
- Spira A, Beane J, Pinto-Plata V, Kadar A, Liu G, Shah V, Celli B, Brody JS: Gene expression profiling of human lung tissue from smokers with severe emphysema. American journal of respiratory cell and molecular biology. 2004, 31 (6): 601-10. 10.1165/rcmb.2004-0273OC.
- Agusti A, Calverley PM, Celli B, Coxson HO, Edwards LD, Lomas DA, MacNee W, Miller BE, Rennard S, Silverman EK, Tal-Singer R, Wouters E, Yates JC, Vestbo J: Characterisation of copd heterogeneity in the eclipse cohort. Respiratory research. 2010, 11: 122.
- Agusti A, Sobradillo P, Celli B: Addressing the complexity of chronic obstructive pulmonary disease: from phenotypes and biomarkers to scale-free networks, systems biology, and p4 medicine. American journal of respiratory and critical care medicine. 2011, 183 (9): 1129-37. 10.1164/rccm.201009-1414PP.
- Vestbo J, Anderson W, Coxson HO, Crim C, Dawber F, Edwards L, Hagan G, Knobil K, Lomas DA, MacNee W, Silverman EK, Tal-Singer R: Evaluation of copd longitudinally to identify predictive surrogate end-points (eclipse). The European respiratory journal: official journal of the European Society for Clinical Respiratory Physiology. 2008, 31 (4): 869-73. 10.1183/09031936.00111707.
- Barabasi AL, Gulbahce N, Loscalzo J: Network medicine: a network-based approach to human disease. Nature reviews Genetics. 2011, 12 (1): 56-68. 10.1038/nrg2918.
- Coxson HO, Dirksen A, Edwards LD, Yates JC, Agusti A, Bakke P, Calverley PM, Celli B, Crim C, Duvoix A, Fauerbach PN, Lomas DA, MacNee W, Mayer RJ, Miller BE, Müller NL, Rennard SI, Silverman EK, Tal-Singer R, Wouters EF, Vestbo J: The presence and progression of emphysema in copd as determined by ct scanning and biomarker expression: a prospective analysis from the eclipse study. The Lancet Respiratory Medicine. 2013, 1 (2): 129-136. 10.1016/S2213-2600(13)70006-7.
- Agusti A, Edwards LD, Rennard SI, MacNee W, Tal-Singer R, Miller BE, Vestbo J, Lomas DA, Calverley PM, Wouters E, Crim C, Yates JC, Silverman EK, Coxson HO, Bakke P, Mayer RJ, Celli B: Persistent systemic inflammation is associated with poor clinical outcomes in COPD: a novel phenotype. PLoS One. 2012, 7 (5): 37483. 10.1371/journal.pone.0037483.
- Larsson O, Wahlestedt C, Timmons JA: Considerations when using the significance analysis of microarrays (sam) algorithm. BMC bioinformatics. 2005, 6: 129. 10.1186/1471-2105-6-129.
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. PNAS. 2005, 102 (43): 15545-15550. 10.1073/pnas.0506580102.
- Luo J, Chen YJ, Narsavage GL, Ducatman A: Predictors of survival in patients with non-small cell lung cancer. Oncology nursing forum. 2012, 39 (6): 609-16. 10.1188/12.ONF.609-616.
- DeRisi J, Penland L, Brown PO, Bittner ML, Meltzer PS, Ray M, Chen Y, Su YA, Trent JM: Use of a cdna microarray to analyse gene expression patterns in human cancer. Nature genetics. 1996, 14 (4): 457-60.
- Wellmann A, Thieblemont C, Pittaluga S, Sakai A, Jaffe ES, Siebert P, Raffeld M: Detection of differentially expressed genes in lymphomas using cdna arrays: identification of clusterin as a new diagnostic marker for anaplastic large-cell lymphomas. Blood. 2000, 96 (2): 398-404.
- Maquoi E, Munaut C, Colige A, Collen D, Lijnen HR: Modulation of adipose tissue expression of murine matrix metalloproteinases and their tissue inhibitors with obesity. Diabetes. 2002, 51 (4): 1093-101. 10.2337/diabetes.51.4.1093.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.