Abstract. We compare the performance of six single classifiers trained on the German credit dataset, an imbalanced dataset of 1000 instances with a binary dependent variable. To improve performance, we consider resampling the dataset, by both oversampling and undersampling, and ensembling the classifiers. The benchmarks are the best-performing of the six classifiers considered. The performance of the ensemble classifiers is then analyzed and examined. The experimental results provide three benchmarks, i.e., SVM trained on the plain dataset, NB trained on the plain dataset, and SVM trained on the undersampled dataset. Furthermore, the ensemble of kNN, LDA, and SVM outperforms the first benchmark for all metrics used in this research, i.e., recall 92.71%, precision 79.14%, F1 84.73%, AUC 79.96%, and accuracy 76.88%. The ensemble of LR, SVM, and NB and the ensemble of LDA, SVM, and NB outperform the second and third benchmarks, respectively.
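As a rough illustration of the exhaustive weight search named in the title, the sketch below enumerates every weight vector on a coarse grid for a three-member soft-voting ensemble and keeps the one with the highest F1. The abstract does not specify the weight grid, the voting rule, or the selection metric, so the grid resolution (`steps`), the 0.5 decision threshold, the F1 objective, and the toy probabilities are all assumptions made for illustration; they are not the experimental setup or data of this paper.

```python
from itertools import product


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 if there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_vote(probas, weights, threshold=0.5):
    """Soft voting: weighted mean of each member's positive-class probability."""
    total = sum(weights)
    preds = []
    for i in range(len(probas[0])):
        score = sum(w * p[i] for w, p in zip(weights, probas)) / total
        preds.append(1 if score >= threshold else 0)
    return preds


def exhaustive_weight_search(probas, y_true, steps=10):
    """Try every weight vector on a grid of resolution 1/steps that sums to 1."""
    best_weights, best_f1 = None, -1.0
    for combo in product(range(steps + 1), repeat=len(probas)):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = [c / steps for c in combo]
        score = f1_score(y_true, weighted_vote(probas, weights))
        if score > best_f1:
            best_weights, best_f1 = weights, score
    return best_weights, best_f1


# Hypothetical validation labels and positive-class probabilities from three
# already-trained classifiers (toy values, not results from the paper).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
probas = [
    [0.9, 0.4, 0.7, 0.6, 0.2, 0.8, 0.3, 0.45],
    [0.6, 0.7, 0.3, 0.4, 0.55, 0.9, 0.1, 0.6],
    [0.8, 0.55, 0.6, 0.3, 0.4, 0.35, 0.6, 0.7],
]

best_weights, best_f1 = exhaustive_weight_search(probas, y_true, steps=10)
print("best weights:", best_weights, "F1:", round(best_f1, 4))
```

Because the grid includes the degenerate weight vectors that put all mass on one member, the best ensemble found this way can never score below the best single member on the data used for the search; whether that advantage carries over to held-out data has to be checked separately.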
Exhaustive Search for Weighted Ensemble Classifiers to Improve Performance on Imbalanced Dataset