Issue: 2024/Vol.34/No.4, Pages 157-183

CLASSIFICATION WITH MACHINE LEARNING ALGORITHMS AFTER HYBRID FEATURE SELECTION IN IMBALANCED DATA SETS

Meryem Pulat, İpek Deveci Kocakoç

Cite as: M. Pulat, İ. D. Kocakoç. Classification with machine learning algorithms after hybrid feature selection in imbalanced data sets. Operations Research and Decisions 2024: 34(4), 157-183. DOI 10.37190/ord240410

Abstract
The efficacy of machine learning algorithms depends significantly on the adequacy and relevance of the features in the data set. Hence, feature selection precedes the classification process. In this study, a hybrid feature selection approach integrating filter and wrapper methods was employed. This approach not only enhances classification accuracy, surpassing the results achievable with filter methods alone, but also reduces processing time compared to exclusive reliance on wrapper methods. Results indicate a general improvement in algorithm performance with the application of the hybrid feature selection approach. The study utilized the Taiwanese Bankruptcy and Statlog (German Credit Data) datasets from the UCI Machine Learning Repository. These datasets exhibit an unbalanced class distribution, necessitating data preprocessing that accounts for this imbalance. After the datasets' unbalanced nature was addressed, feature selection and subsequent classification were carried out.
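
The general filter-then-wrapper idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the specific filter, wrapper, classifier, or metric, so everything below is an assumption chosen for illustration. The filter ranks features by absolute Pearson correlation with the label, the wrapper is a greedy forward search, the classifier is a nearest-centroid rule, the score is balanced accuracy (a common choice for unbalanced classes), and the data are synthetic rather than the UCI datasets. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def make_data(n=400, n_features=8, minority_rate=0.2, seed=0):
    """Synthetic unbalanced data (assumption): features 0 and 1 carry signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < minority_rate else 0
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] -= 1.5 * label
        X.append(row)
        y.append(label)
    return X, y


def abs_corr(col, y):
    """Absolute Pearson correlation between one feature column and the labels."""
    mx, my = mean(col), mean(y)
    sx, sy = pstdev(col), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(a - mx) * (b - my) for a, b in zip(col, y)])
    return abs(cov / (sx * sy))


def filter_step(X, y, k):
    """Filter stage: cheaply keep the k features most correlated with the label."""
    scores = [abs_corr([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so the majority class cannot dominate the score."""
    recalls = []
    for c in (0, 1):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return mean(recalls)


def centroid_score(X_tr, y_tr, X_va, y_va, feats):
    """Fit a nearest-centroid classifier on `feats` and score it on the validation split."""
    cents = {}
    for c in (0, 1):
        rows = [r for r, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [mean([r[j] for r in rows]) for j in feats]
    preds = []
    for r in X_va:
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(feats, cents[c])) for c in (0, 1)}
        preds.append(min(dist, key=dist.get))
    return balanced_accuracy(y_va, preds)


def wrapper_step(X_tr, y_tr, X_va, y_va, candidates):
    """Wrapper stage: greedy forward selection restricted to the filtered candidates."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        best_j = None
        for j in candidates:
            if j in selected:
                continue
            s = centroid_score(X_tr, y_tr, X_va, y_va, selected + [j])
            if s > best:
                best, best_j, improved = s, j, True
        if improved:
            selected.append(best_j)
    return selected, best


X, y = make_data()
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]
candidates = filter_step(X_tr, y_tr, k=4)          # cheap pre-screening
selected, score = wrapper_step(X_tr, y_tr, X_va, y_va, candidates)  # costly search on fewer features
print("filtered candidates:", candidates)
print("selected features:", selected, "balanced accuracy:", round(score, 3))
```

The point of the design is that the wrapper, which retrains a classifier for every candidate subset, only searches the features that survived the inexpensive filter. This is one way to obtain the trade-off the abstract describes: better accuracy than a filter alone at lower cost than a wrapper over all features.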

Keywords: machine learning, ensemble learning, classification, feature selection, unbalanced dataset

Received: 27 February 2024    Accepted: 19 October 2024
Published online: 19 December 2024