BukaGini: A stability-aware Gini index feature selection algorithm for robust model performance
Publisher
IEEE
Abstract
Feature interaction is a vital aspect of Machine Learning (ML) algorithms, and gaining a deep
understanding of these interactions can significantly enhance model performance. This paper introduces
the BukaGini algorithm, an innovative and robust approach for feature interaction analysis that capitalizes
on the Gini impurity index. By exploiting the unique properties of the BukaGini index, our proposed
algorithm effectively captures both linear and nonlinear feature interactions, providing a richer and more
comprehensive representation of the underlying data. We thoroughly evaluate the BukaGini algorithm against
traditional Gini index-based methods on various real-world datasets. These datasets include the High School
Students’ Performance (HSSP) dataset, which examines factors affecting student performance; Cancer Data,
which focuses on identifying cancer types based on gene expression; Spambase, which targets spam email
classification; and the UNSW-NB15 dataset, which addresses network intrusion detection. Our experimental
results demonstrate that the BukaGini algorithm consistently outperforms traditional Gini index-based
methods in terms of accuracy. Across the tested datasets, the BukaGini algorithm improves accuracy by
0.32% to 2.50% over these baselines, underscoring its effectiveness in handling diverse data types and problem
domains. This performance gain highlights the potential of the BukaGini algorithm as a valuable tool for
feature interaction analysis in various ML applications.
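
For readers unfamiliar with the quantity the abstract builds on, the following is a minimal sketch of the standard Gini impurity and of the impurity decrease that traditional Gini index-based methods use to score a feature. It is not the BukaGini algorithm, which this record does not describe in detail. The function names `gini_impurity` and `gini_gain`, the toy data, and the threshold are illustrative assumptions.

```python
from collections import Counter


def gini_impurity(labels):
    """Standard Gini impurity, 1 - sum_k p_k^2, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(feature, labels, threshold):
    """Impurity decrease from splitting the samples at feature <= threshold.

    Hypothetical helper: this is the classic decision-tree split score,
    not the BukaGini criterion from the paper.
    """
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Toy data (assumed for illustration only).
labels = [0, 0, 1, 1]
informative = [1.0, 2.0, 3.0, 4.0]  # separates the classes at 2.5
noisy = [1.0, 3.0, 2.0, 4.0]        # does not separate the classes at 2.5

print(gini_gain(informative, labels, 2.5))
print(gini_gain(noisy, labels, 2.5))
```

A feature whose split removes more impurity receives a higher score. Baseline Gini-based selection ranks features by this kind of score, typically accumulated over the splits of a tree or an ensemble of trees.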
Subject(s)
BukaGini algorithm, Gini index, ensemble learning, feature interaction analysis, data mining
Citation
IEEE Access. 2023, vol. 11, p. 59386-59396.