A Comparative Study of XGBoost, LightGBM, and CatBoost Models for Customer Churn Prediction in the Banking Industry
Abstract
Customer churn is a critical issue in the banking industry, as retaining existing customers is more cost-effective than acquiring new ones. High churn rates can erode profitability and long-term business sustainability, making churn prediction a key focus of customer relationship management. With the rise of digital banking and the availability of large-scale customer data, machine learning techniques have become valuable tools for identifying at-risk customers. In particular, gradient boosting algorithms have shown promising results in classification tasks on structured data. This study compares the performance of three gradient boosting ensemble models, XGBoost, LightGBM, and CatBoost, in classifying churn on a publicly available banking customer dataset of 10,127 records and 23 features. The evaluation uses two data-splitting schemes (80:10:10 and 70:15:15) and four performance metrics: accuracy, precision, recall, and F1-score. The results indicate that XGBoost achieved the highest overall performance (98.3% accuracy under the first split and 96.4% under the second). LightGBM delivered competitive accuracy with significantly faster training, while CatBoost offered strong predictive capability but required longer training time. These findings suggest that model selection for churn prediction depends on the trade-off between predictive performance and computational efficiency.
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.