Classification of Public Sentiment towards the Performance of the Ministry of Communication and Digital regarding Online Gambling

Ika Rahma Alia
Favorisen Rosyking Lumbanraja
Aristoteles Aristoteles
Rico Andrian

Abstract

Online gambling is a social issue currently in the spotlight in Indonesia. Although the government, particularly the Ministry of Communication and Digital (Kemkomdigi), has taken measures such as blocking websites and running digital literacy campaigns, online gambling remains rampant and has drawn a range of public reactions. Social media, particularly Instagram, has become a public space where people express their opinions and sentiments about government performance. This study classifies public sentiment in comments on the official Kemkomdigi Instagram account about online gambling. Two machine learning algorithms, Random Forest and XGBoost, are compared on how well they classify comments as positive or negative. A total of 724 comments were collected and manually labeled by three annotators using a voting method. Preprocessing consisted of cleaning, case folding, tokenization, normalization, stopword removal, and stemming. Features were represented using TF-IDF. The data were split in a 70:30 ratio and balanced using Random Oversampling. The models were trained with 10-fold cross-validation, and their hyperparameters were tuned with GridSearchCV. The tuned Random Forest performed best, with an accuracy of 0.7082. These findings demonstrate that machine learning approaches, particularly Random Forest, are effective in automatically identifying public sentiment toward emerging public policy issues on social media.
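Three of the steps named in the abstract (label voting, TF-IDF weighting, and random oversampling) can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation. The function names and the toy comments are hypothetical, the voting rule is assumed to be a simple majority, and the TF-IDF variant (term frequency divided by document length, idf = ln(N/df) + 1) is one of several in common use.

```python
import math
import random
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most annotators.

    Assumption: simple majority; with three annotators and two classes
    there can be no tie.
    """
    return Counter(votes).most_common(1)[0][0]


def tfidf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    Assumption: tf = count / document length, idf = ln(N / df) + 1.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Hypothetical toy data: tokenized comments, each labeled by three annotators.
comments = [
    ["block", "gambling", "sites"],
    ["gambling", "still", "rampant"],
    ["slow", "response"],
    ["good", "work"],
]
annotations = [
    ["negative", "negative", "positive"],
    ["negative", "negative", "negative"],
    ["negative", "positive", "negative"],
    ["positive", "positive", "negative"],
]

labels = [majority_vote(votes) for votes in annotations]
vectors = tfidf(comments)
balanced_vectors, balanced_labels = random_oversample(vectors, labels)

print(labels)
print(Counter(balanced_labels))
```

In common practice, oversampling is applied only to the training portion of the split so that duplicated samples cannot appear in the test set; the abstract does not say which portion was balanced.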

How to Cite
Alia, I. R., Lumbanraja, F. R., Aristoteles, A., & Andrian, R. (2025). Classification of Public Sentiment towards the Performance of the Ministry of Communication and Digital regarding Online Gambling. Jurnal Pepadun, 6(3), 264–275. https://doi.org/10.23960/pepadun.v6i3.295
