Customer Churn Prediction on a Music Streaming Platform Using Ensemble Learning

Authors

  • Iqbal Saviola Syah bill haq, Telkom University
  • Tjokorda Agung Budi Wirayuda, Telkom University

Abstract

Churn prediction is critical for subscription-based services such as KKBOX, a well-known music streaming platform in Asia. Despite its popularity, KKBOX faces a significant customer churn problem: when customers cancel their subscriptions, the company's revenue and growth are directly affected. This study explores the development of a churn prediction model using ensemble machine learning. Churn prediction helps identify customers who are likely to cancel their subscriptions, allowing the company to apply retention strategies. The topic matters because of its financial implications and its effect on long-term business growth. Effective churn prediction can significantly improve customer retention; increasing customer retention by just 5% can raise profits by 25% to 95%. This study uses a dataset from KKBOX and implements several machine learning models, including logistic regression, SVM, XGBoost, and LightGBM, to predict churn. The solution involves data exploration, data preparation, and feature engineering to improve model accuracy. In this experiment, LightGBM outperformed the other models, achieving the lowest log loss. These models provide a robust framework for churn prediction and can improve customer retention strategies for subscription-based services such as KKBOX. Future experiments could explore additional features and hyperparameter tuning to further improve model performance.

Keywords - Churn Prediction, XGBoost, LightGBM, Ensemble Learning, SVM, Logistic Regression
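
The abstract ranks the models by log loss. As a minimal sketch of what that metric measures (not the authors' evaluation code), the snippet below computes binary log loss in plain Python. The labels and predicted probabilities are hypothetical values chosen for illustration, and the function name `log_loss` and the clipping constant `eps` are assumptions of this sketch.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy between labels and predicted probabilities.

    Lower values mean the predicted churn probabilities match the labels better.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so that log(0) is never evaluated.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical data: 1 = customer churned, 0 = customer renewed.
y_true = [1, 0, 0, 1]
confident = [0.9, 0.1, 0.2, 0.8]      # probabilities close to the true labels
uninformative = [0.5, 0.5, 0.5, 0.5]  # a model that always predicts 50%

loss_confident = log_loss(y_true, confident)
loss_uninformative = log_loss(y_true, uninformative)
```

A model whose probabilities sit closer to the true labels gets a lower log loss than one that always predicts 0.5, which is the sense in which the lowest-scoring model is reported as the best.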

Published

2025-04-10

Section

Undergraduate (S1) Informatics Study Program