Retweet Prediction Based on User-Based and Content-Based Features Using the Ensemble Stacking Method

Authors

  • Muhammad Rizqi Akbar, Telkom University
  • Jondri Jondri, Telkom University
  • Indwiarti Indwiarti, Telkom University

Abstract

Twitter is one of the most popular social media platforms and makes it easy to obtain information quickly. The retweet feature is one reason information spreads so fast: a retweet occurs when a follower reposts a tweet from one of their followees. This study builds a retweet prediction model from user-based and content-based features using the Ensemble Stacking method through a K-fold Cross Validation process. The stacking ensemble is built from three base learners, namely Random Forest, Gradient Boosting, and Support Vector Machine (SVM), with a Support Vector Machine (SVM) as the meta-learner. The model gives its best results after Imbalanced Class Handling with the SMOTE technique and K-fold Cross Validation with k=10, reaching an F1-score of 86.46%. From this result, it can be concluded that the resulting model improves on the predictions of its base learners.
Keywords: Twitter, retweet, ensemble stacking, k-fold cross validation, oversampling
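
The pipeline outlined in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It assumes scikit-learn and NumPy, uses synthetic data from `make_classification` in place of the Twitter features, and replaces SMOTE with a simplified hypothetical helper, `smote_like_oversample`. In practice the `SMOTE` class from the imbalanced-learn package would be the usual choice. All hyperparameters are assumptions, and `cv=10` here refers to the internal folds that `StackingClassifier` uses to train the meta-learner, which may differ from how the study applied K-fold Cross Validation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def smote_like_oversample(X, y, minority_label=1, k=5, seed=0):
    """Simplified SMOTE-style oversampling (hypothetical helper).

    Synthetic minority points are placed on the segment between a minority
    sample and one of its k nearest minority neighbours, until both classes
    have the same number of samples.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Column 0 is normally the query point itself, so it is dropped.
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[picked] - X_min[base])
    y_new = np.full(n_new, minority_label, dtype=y.dtype)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


# Synthetic, imbalanced stand-in for the user-based and content-based features.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           weights=[0.85, 0.15], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Oversample the training split only, so the test split stays untouched.
X_bal, y_bal = smote_like_oversample(X_train, y_train)

# Three base learners and an SVM meta-learner, as named in the abstract.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
]
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=SVC(), cv=10)
stack.fit(X_bal, y_bal)

score = f1_score(y_test, stack.predict(X_test))
print(f"F1-score on held-out data: {score:.3f}")
```

One simplification to keep in mind: because the oversampling happens before the internal 10-fold split, synthetic points derived from one fold can end up in another. A stricter setup would oversample inside each fold.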

Published

2023-04-26

Issue

Section

Undergraduate Informatics Study Program