A Length-Adaptive Non-Dominated Sorting Genetic Algorithm for Bi-Objective High-Dimensional Feature Selection

Yanlu Gong, Junhai Zhou, Quanwang Wu, Mengchu Zhou, Junhao Wen

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

As a crucial data preprocessing method in data mining, feature selection (FS) can be regarded as a bi-objective optimization problem that aims to maximize classification accuracy and minimize the number of selected features. Evolutionary computing (EC) is promising for FS owing to its powerful search capability. However, traditional EC-based methods represent feature subsets with a fixed-length individual encoding. This encoding is ineffective for high-dimensional data because it results in a huge search space and prohibitive training time. This work proposes a length-adaptive non-dominated sorting genetic algorithm (LA-NSGA) with a variable-length individual encoding and a length-adaptive evolution mechanism for bi-objective high-dimensional FS. In LA-NSGA, an initialization method based on correlation and redundancy is devised to initialize individuals of diverse lengths, and a Pareto dominance-based length change operator is introduced to guide individuals to adaptively explore promising regions of the search space. Moreover, a dominance-based local search method is employed for further improvement. Experimental results on 12 high-dimensional gene datasets show that the Pareto front of feature subsets produced by LA-NSGA is superior to those produced by existing algorithms.
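To make the bi-objective formulation concrete, the following is a minimal Python sketch of Pareto dominance over variable-length feature subsets. It is not the LA-NSGA implementation: the names `dominates` and `pareto_front`, the use of classification error (1 minus accuracy) so that both objectives are minimized, and the example population with its error values are all assumptions made purely for illustration.

```python
from typing import List, Tuple

# Assumption: an individual is a variable-length list of selected feature indices.
Individual = List[int]
# Assumption: objectives are (classification error, number of features), both minimized.
Objectives = Tuple[float, int]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical population: individuals of different lengths (not data from the paper).
population: List[Individual] = [[3, 17, 42], [3, 17], [5], [3, 17, 42, 108, 256]]
# Hypothetical error rates; in practice these would come from a trained classifier.
errors = [0.10, 0.12, 0.30, 0.10]

objectives = [(err, len(ind)) for err, ind in zip(errors, population)]
front = pareto_front(objectives)
print(front)
```

In this toy population the last individual reaches the same error as the first but uses more features, so it is dominated and drops out of the front. The remaining three subsets each trade error against subset size and are mutually non-dominated.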

Original language: English (US)
Pages (from-to): 1834-1844
Number of pages: 11
Journal: IEEE/CAA Journal of Automatica Sinica
Volume: 10
Issue number: 9
State: Published - Sep 1 2023

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Information Systems
  • Control and Optimization
  • Artificial Intelligence

Keywords

  • Bi-objective optimization
  • feature selection (FS)
  • genetic algorithm
  • high-dimensional data
  • length-adaptive
