Table 2 Advantages and disadvantages of types of feature selection methods used in the pipeline configuration

From: Pipeline design to identify key features and classify the chemotherapy response on lung cancer patients using large-scale genetic data

| FS Methods | Advantages | Disadvantages |
| --- | --- | --- |
| Filter | They are easily scalable to very high-dimensional data sets. | They do not interact with the classification algorithm. |
| | They are computationally fast and simple. | Most of these methods are univariate; that is, they consider features independently or only with regard to the target feature, thereby ignoring feature dependencies. |
| | They are independent of the classification algorithm used in the further model construction. | |
| Wrapper | They include the interaction between feature subset search and the classification algorithm that is “wrapped”. | They have a higher risk of overfitting, depending on how exhaustive the feature subset search is. |
| | They take into account feature dependencies. | They are very computationally intensive, especially if the “wrapped” classifier has a high computational cost. |
| Embedded | They include the interaction between feature subset search and the final classification model constructed. | They depend on the specific learning method of the final model constructed. |
| | They take into account feature dependencies. | |
| | They are computationally faster than wrapper methods. | |
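
The contrast between the filter and wrapper rows can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the pipeline itself: the toy data set (a label equal to f0 XOR f1, plus a weakly informative f2), the absolute Pearson correlation used as the filter score, and the leave-one-out nearest-neighbour vote used as the "wrapped" classifier. Embedded methods are not shown.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: the label is f0 XOR f1; f2 agrees with the label in 6 of 8 rows.
X = [
    (0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0),
    (1, 0, 1), (1, 0, 1), (1, 1, 0), (1, 1, 0),
]
y = [a ^ b for a, b, _ in X]


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one feature and the label."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return abs(cov) / (vx * vy) ** 0.5


def filter_scores(X, y):
    """Filter: score each feature on its own; no classifier is involved."""
    return [abs_correlation([row[j] for row in X], y) for j in range(len(X[0]))]


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-neighbour vote restricted to `subset`."""
    hits = 0
    for i, row in enumerate(X):
        # Hamming distance to every other row, using only the chosen features.
        dists = [
            (sum(row[j] != other[j] for j in subset), y[k])
            for k, other in enumerate(X) if k != i
        ]
        nearest = min(d for d, _ in dists)
        votes = Counter(label for d, label in dists if d == nearest)
        # Majority vote; ties go to the smaller label so the result is deterministic.
        pred = min(votes, key=lambda c: (-votes[c], c))
        hits += pred == y[i]
    return hits / len(X)


def wrapper_search(X, y):
    """Wrapper: exhaustive subset search, one classifier evaluation per subset."""
    best_subset, best_acc, evaluations = None, -1.0, 0
    n = len(X[0])
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            acc = loo_accuracy(X, y, subset)
            evaluations += 1
            if acc > best_acc:  # strict comparison keeps the smaller subset on ties
                best_subset, best_acc = subset, acc
    return best_subset, best_acc, evaluations


scores = filter_scores(X, y)
best_subset, best_acc, evaluations = wrapper_search(X, y)
print("filter scores:", scores)
print("wrapper pick:", best_subset, "accuracy:", best_acc, "evaluations:", evaluations)
```

On this toy data the univariate filter gives f0 and f1 a score of 0 and ranks f2 first, because each XOR input is uninformative on its own. The wrapper recovers the pair (f0, f1), but only after evaluating the classifier on all 7 non-empty subsets, a count that grows as 2^n - 1 with the number of features.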