Table 9 Intersection of the relevant SNPs identified by the ML models with GWAS Catalog records associated with lung cancer (LC) and cancer

From: Pipeline design to identify key features and classify the chemotherapy response on lung cancer patients using large-scale genetic data

| Pipeline | # of features | ML Rank cat ALL | ML Rank cat LUNG | ML Rank cat CANCER |
| --- | --- | --- | --- | --- |
| RFE-LR + Up-sampling + RF | 257 | 0 | 0 | 0 |
| RLR-L1 + SMOTE-sampling + KNN | 13 | 0 | 0 | 0 |
| ANOVA + No sampling + RF | 144 | 0 | 0 | 0 |
| RFE-LR + SMOTE-sampling + RF | 238 | 1 | 0 | 0 |
| ANOVA + No sampling + Linear SVM | 193 | 0 | 0 | 0 |
| ANOVA + Up-sampling + Linear SVM | 193 | 0 | 0 | 0 |
| ANOVA + SMOTE-sampling + Linear SVM | 193 | 0 | 0 | 0 |
| RLR-L1 + SMOTE-sampling + RF | 3 | 0 | 0 | 0 |
| ANOVA + No sampling + KNN | 95ᵃ | 0 | 0 | 0 |
| RFE-LR + No sampling + RF | 305 | 0 | 0 | 0 |
| RFE-LR + No sampling + KNN | 148ᵇ | 2 | 0 | 0 |
| RLR-L1 + No sampling + KNN | 17 | 0 | 0 | 0 |
| RLR-L1 + Up-sampling + KNN | 16 | 0 | 0 | 0 |
| RFE-LR + Down-sampling + KNN | 148ᵇ | 2 | 0 | 0 |
| RFE-LR + No sampling + Linear SVM | 148ᵇ | 2 | 0 | 0 |
| RFE-LR + Up-sampling + Linear SVM | 148ᵇ | 2 | 0 | 0 |
| RFE-LR + SMOTE-sampling + Linear SVM | 148ᵇ | 2 | 0 | 0 |
| ANOVA + SMOTE-sampling + RF | 193 | 0 | 0 | 0 |
| RLR-L1 + No sampling + RF | 17 | 0 | 0 | 0 |
| ANOVA + Up-sampling + RF | 193 | 0 | 0 | 0 |
ᵃ Corresponds to 5% of the top features selected by the ANOVA feature selection method. ᵇ Corresponds to 0.1% of the top features selected by the RFE-LR feature selection method.
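
The overlap counts reported in the table can be illustrated with a minimal sketch. The source does not say how the intersection was computed, so everything below is an assumption: the function name `count_catalog_overlap`, the keyword-based trait matching, and the placeholder rsIDs and traits are hypothetical and are not taken from the paper or from the GWAS Catalog.

```python
def count_catalog_overlap(selected_snps, catalog_records, trait_keywords=None):
    """Count selected SNPs that occur in catalog records matching the keywords.

    selected_snps   -- iterable of SNP identifiers chosen by one pipeline
    catalog_records -- iterable of (snp_id, trait_description) pairs
    trait_keywords  -- lowercase substrings to look for in the trait text;
                       None means every catalog record counts (the "ALL" case)
    """
    selected = set(selected_snps)
    matched = set()
    for snp_id, trait in catalog_records:
        if snp_id not in selected:
            continue
        trait_lower = trait.lower()
        if trait_keywords is None or any(k in trait_lower for k in trait_keywords):
            matched.add(snp_id)  # a set, so a SNP with several records counts once
    return len(matched)


# Hypothetical placeholder data, for illustration only.
selected = ["rs0000001", "rs0000002", "rs0000003"]
catalog = [
    ("rs0000001", "Height"),
    ("rs0000002", "Body mass index"),
    ("rs0000009", "Lung cancer"),
]

overlap_all = count_catalog_overlap(selected, catalog)
overlap_lung = count_catalog_overlap(selected, catalog, ["lung"])
overlap_cancer = count_catalog_overlap(selected, catalog, ["cancer", "carcinoma"])
print(overlap_all, overlap_lung, overlap_cancer)
```

With this placeholder data, two selected SNPs appear somewhere in the catalog but none is tied to a lung or cancer trait, which mirrors the pattern of the "2 0 0" rows. Plain substring matching is a deliberate simplification: a keyword such as "lung" would also match non-cancer traits, so a real analysis would likely need curated trait terms.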