Customer Churn Prediction on a Music Streaming Platform Using Ensemble Learning
Abstract
Churn prediction is critical for subscription-based services such as KKBOX, a well-known music streaming platform in Asia. Despite its popularity, KKBOX faces a significant challenge from customer churn, in which customers cancel their subscriptions, directly affecting the company's revenue and growth. This study explores the development of a churn prediction model using ensemble machine learning. Churn prediction helps identify customers who are likely to cancel their subscriptions, enabling the company to apply retention strategies. The importance of this topic lies in its financial implications and its effect on long-term business growth. Effective churn prediction can significantly improve customer retention, and increasing customer retention by just 5% can raise profits by 25% to 95%. This study uses a dataset from KKBOX and implements several machine learning models, including logistic regression, SVM, XGBoost, and LightGBM, to predict churn. The solution involves data exploration, data preparation, and feature engineering to improve model accuracy. In this experiment, LightGBM outperformed the other models, achieving the lowest log loss. These models provide a robust framework for churn prediction and can improve customer retention strategies for subscription-based services such as KKBOX. Future experiments could explore additional features and hyperparameter tuning to further improve model performance.
Keywords - Churn Prediction, XGBoost, LightGBM, Ensemble Learning, SVM, Logistic Regression
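The abstract ranks models by log loss, so a minimal sketch of that metric may help make the comparison concrete. The labels, probabilities, and the names `model_a` and `model_b` below are hypothetical illustrations and are not taken from the KKBOX dataset or from the models trained in this study; the function is a plain-Python version of the standard binary log loss, not the authors' evaluation code.

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so that log(0) is never evaluated.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical data: 1 = customer churned, 0 = customer renewed.
labels = [1, 0, 0, 1, 0]
# Hypothetical predicted churn probabilities from two made-up models.
model_a = [0.9, 0.1, 0.2, 0.8, 0.1]
model_b = [0.6, 0.4, 0.5, 0.5, 0.3]

scores = {
    "model_a": binary_log_loss(labels, model_a),
    "model_b": binary_log_loss(labels, model_b),
}
# The model with the lowest log loss is preferred.
best = min(scores, key=scores.get)
print(scores, best)
```

Under this metric, a model is rewarded for assigning high probability to the outcome that actually occurred, so confident and correct predictions yield a lower score than hesitant ones.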
References
Verbeke, W., Martens, D., Mues, C., & Baesens, B. (2012). Building comprehensible customer churn prediction models with advanced rule induction techniques. Expert Systems with Applications, 38(3), 2354-2364. Elsevier. DOI: 10.1016/j.eswa.2010.08.023
Huang, S., Ke, W., Chen, J., & Chen, S. (2021). A comprehensive survey on customer churn prediction with big data. Artificial Intelligence Review, 54, 2757-2811. Springer. DOI: 10.1007/s10462-020-09867-1
Nguyen, T., Pham, T., & Cao, T. (2015). Predicting customer churn in subscription-based services using machine learning. International Journal of Information Management, 35(2), 244-253. Elsevier. DOI: 10.1007/978-981-99-8438-1_26
Lariviere, B., & Van den Poel, D. (2005). Predicting customer retention and profitability by using random forests and regression forests techniques. Expert Systems with Applications, 29(2), 472-484. Elsevier. DOI: 10.1016/j.eswa.2005.04.043
Amin, A., Anwar, S., Adnan, A., Nawaz, M., Howard, N., Qadir, J., & Hussain, A. (2016). Customer churn prediction in the telecommunication sector using a rough set approach. Neurocomputing, 237, 242-254. Elsevier. DOI: 10.1016/j.neucom.2016.12.009
Vafeiadis, T., Diamantaras, K. I., Sarigiannidis, G., & Chatzisavvas, K. C. (2015). A comparison of machine learning techniques for customer churn prediction. Simulation Modelling Practice and Theory, 55, 1-9. Elsevier. DOI: 10.1016/j.simpat.2015.03.003
Tsai, C. F., & Lu, Y. H. (2009). Customer churn prediction by hybrid neural networks. Expert Systems with Applications, 36(10), 12547-12553. Elsevier. DOI: 10.1016/j.eswa.2009.05.032
Reichheld, F. F., & Schefter, P. (2000). The Economics of E-Loyalty. Harvard Business School Working Knowledge. Retrieved from https://hbswk.hbs.edu/archive/the-economics-of-eloyalty
Liao, S. H., & Chen, Y. C. (2017). Predicting customer churn in the insurance industry using data mining techniques. Expert Systems with Applications, 83, 89-101. Elsevier.
Verbeke, W., Dejaeger, K., Martens, D., Hur, J., & Baesens, B. (2014). New insights into churn prediction in the telecommunication sector: A profit driven data mining approach. European Journal of Operational Research, 218(1), 211-229. Elsevier. DOI: 10.1016/j.ejor.2011.09.031
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140. Springer. DOI: 10.1007/BF00058655
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29(5), 1189-1232. Institute of Mathematical Statistics. DOI: 10.1214/aos/1013203451
Fernández-Delgado, M., Cernadas, E., Barro, S., & Amorim, D. (2014). Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15(1), 3133-3181.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794). ACM. DOI: 10.1145/2939672.2939785
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T. (2017). LightGBM: A highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems (pp. 3146-3154). DOI: 10.5555/3294996.3295074
Niculescu-Mizil, A., & Caruana, R. (2005). Predicting good probabilities with supervised learning. In Proceedings of the 22nd International Conference on Machine Learning (pp. 625-632). ACM. DOI: 10.1145/1102351.1102430