Table 4 Parameters tested using grid search and 5-fold cross-validation (CV). EFD refers to the “Extended Framework Design”.

From: Pipeline design to identify key features and classify the chemotherapy response on lung cancer patients using large-scale genetic data

| Pipeline step | Parameter options |
| --- | --- |
| ANOVA | EFD (Partial analysis): percentile = 2% of total # of variables; EFD (Final analysis): percentile = 10% of total # of variables |
| RFE-LR | LR: penalty = 'l1', C = 1; RFE, EFD (Partial analysis): n_features_to_select = 2% of total # of variables, step = 4%; RFE, EFD (Final analysis): n_features_to_select = 10% of total # of variables, step = 10% |
| RLR-L1 | penalty = 'l1'; EFD (Partial analysis): C = [100, 500, 1000, 1500, 5000, 10000]; EFD (Final analysis): C = [100, 500, 1000, 1500, 5000, 10000]; threshold = 1e-10 |
| Linear SVM | C = [0.001, 0.01, 0.1, 1, 10, 100, 1000] |
| RF | n_estimators = [30, 47, 75, 119, 189, 299, 475, 753, 1194, 1892, 2999] |
| KNN | n_neighbors = [5, 20, 35, 50] |
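
To make the size of the search concrete, the grids in Table 4 can be written as plain dictionaries and expanded. The following is a minimal sketch rather than the authors' code: the helper names (`expand_grid`, `count_fits`, `features_to_keep`) are hypothetical, the rounding rule for percentages is an assumption, and the 20,000-variable figure in the usage note is an illustrative value not taken from the paper.

```python
from itertools import product

# Grids copied from Table 4; keys are the parameter names shown in the table.
GRIDS = {
    "RLR-L1": {"C": [100, 500, 1000, 1500, 5000, 10000]},
    "Linear SVM": {"C": [0.001, 0.01, 0.1, 1, 10, 100, 1000]},
    "RF": {"n_estimators": [30, 47, 75, 119, 189, 299, 475, 753, 1194, 1892, 2999]},
    "KNN": {"n_neighbors": [5, 20, 35, 50]},
}

N_FOLDS = 5  # 5-fold CV, as stated in the table caption


def expand_grid(grid):
    """Return every parameter combination in a grid as a list of dicts."""
    names = sorted(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def count_fits(grid, n_folds=N_FOLDS):
    """Model fits needed by an exhaustive search: candidates times folds."""
    return len(expand_grid(grid)) * n_folds


def features_to_keep(n_variables, percent):
    """Turn a percentage of variables into a count.

    Assumption: round down and keep at least one feature; the table
    does not state the rounding rule.
    """
    return max(1, n_variables * percent // 100)


if __name__ == "__main__":
    for step, grid in GRIDS.items():
        print(step, len(expand_grid(grid)), "candidates,", count_fits(grid), "fits")
```

For a hypothetical data set with 20,000 variables, `features_to_keep(20000, 2)` gives the partial-analysis feature count and `features_to_keep(20000, 10)` the final-analysis count. The parameter names in the table match scikit-learn's, so dictionaries of this shape could presumably be passed as `param_grid` to `GridSearchCV` with `cv=5`; that the original pipeline was built this way is an inference from the naming, not something the table states.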