Pipeline design to identify key features and classify the chemotherapy response on lung cancer patients using large-scale genetic data

Table 2 Advantages and disadvantages of types of feature selection methods used in the pipeline configuration

FS Methods	Advantages	Disadvantages
Filter	They are easily scalable to very high-dimensional data sets.	They do not interact with the classification algorithm.
	They are computationally fast and simple.	Most of these methods are univariate; that is, they consider features independently or only with regard to the target feature, thereby ignoring feature dependencies.
	They are independent of the classification algorithm used in the further model construction.
Wrapper	They include the interaction between feature subset search and the classification algorithm that is “wrapped”.	They have a higher risk of overfitting, depending on how exhaustive the feature subset search is.
	They take into account feature dependencies.	They are very computationally intensive, especially if the “wrapped” classifier has a high computational cost.
Embedded	They include the interaction between feature subset search and the final classification model constructed.	They depend on the specific learning method of the final model constructed.
	They take into account feature dependencies.
	They are computationally faster than wrapper methods.
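
The contrast between the filter and wrapper rows of Table 2 can be illustrated with a small sketch. It is not the pipeline's implementation: the toy data set, the correlation-based filter score, the lookup-table classifier and all function names are assumptions chosen for illustration. The label depends on two features jointly (an XOR), so a univariate filter scores both of them at zero, while an exhaustive wrapper search over feature pairs finds them, at the cost of evaluating every pair.

```python
from collections import Counter, defaultdict
from itertools import combinations, product


def make_toy_data():
    """Hypothetical data: y = f1 XOR f2; f0 matches y in 3 of 4 copies; f3 is noise."""
    X, y = [], []
    for rep in range(4):
        for f1, f2, f3 in product((0, 1), repeat=3):
            label = f1 ^ f2
            f0 = label if rep < 3 else 1 - label
            X.append((f0, f1, f2, f3))
            y.append(label)
    return X, y


def abs_correlation(xs, ys):
    """Filter score: absolute Pearson correlation of one feature with the target."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return abs(cov) / (vx * vy) ** 0.5


def lookup_accuracy(X, y, subset):
    """Wrapped classifier: majority label per combination of the chosen features.

    Scored on the training rows for brevity (an assumption of this sketch);
    a real pipeline would use held-out data or cross-validation.
    """
    votes = defaultdict(Counter)
    for row, label in zip(X, y):
        votes[tuple(row[j] for j in subset)][label] += 1
    correct = sum(max(counts.values()) for counts in votes.values())
    return correct / len(y)


X, y = make_toy_data()
n_features = len(X[0])

# Filter: each feature is scored on its own, without any classifier.
filter_scores = [
    abs_correlation([row[j] for row in X], y) for j in range(n_features)
]

# Wrapper: every pair of features is evaluated through the wrapped classifier.
wrapper_results = {
    subset: lookup_accuracy(X, y, subset)
    for subset in combinations(range(n_features), 2)
}
best_subset = max(wrapper_results, key=wrapper_results.get)

print("filter scores:", filter_scores)
print("best pair (wrapper):", best_subset, wrapper_results[best_subset])
```

On this toy data the filter assigns a score of 0.5 to f0 and 0 to the other three features, whereas the wrapper selects the pair (f1, f2) with an accuracy of 1.0. The wrapper needs one classifier evaluation per candidate subset, which is the computational cost noted in the table.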

ISSN: 1752-0509