Implementation of Ensemble Machine Learning Classifier and Synthetic Minority Oversampling Technique for Sentiment Analysis of Sustainable Development Goals in Indonesia

Acep Irham Gufroni - Universitas Siliwangi, Tasikmalaya, Indonesia
Irani Hoeronis - Universitas Siliwangi, Tasikmalaya, Indonesia
Nur Fajar - Universitas Siliwangi, Tasikmalaya, Indonesia
Andi Rachman - Universitas Siliwangi, Tasikmalaya, Indonesia
Cecep Muhamad Sidik Ramdani - Universitas Siliwangi, Tasikmalaya, Indonesia
Heni Sulastri - Universitas Siliwangi, Tasikmalaya, Indonesia


DOI: http://dx.doi.org/10.62527/joiv.8.2.1949

Abstract


Under the Sustainable Development Goals (SDGs), governments worldwide have committed to improving the quality of life for all through 17 goals agreed upon in 2015. With the SDG period to 2030 approximately half elapsed, it is worth examining how society responds to the goals. This study analyzes that public response in terms of sentiment. Among Indonesia's internet users, 18.45 million use Twitter. The platform enables anyone to write about what they experience in their lives, such as what is happening in their environment, their education system, and the food industry, and how they feel. The collected data were modeled with Ensemble Machine Learning Classifiers (EMLC). The best model in this study is EMLC-Stacking with an 80:20 data split and the Synthetic Minority Oversampling Technique (SMOTE), which obtains an accuracy of 91%, a 5% increase compared with not using SMOTE. Of 15,698 tweets, 47% expressed positive sentiment, 28% negative, and 25% neutral. If these findings hold, they offer hope for a positive trend in the journey of the SDGs until 2030.
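The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the core idea, under stated assumptions: each tweet is assumed to be already encoded as a numeric feature vector, and the function name `smote_oversample` and its parameters `n_new`, `k`, and `seed` are hypothetical, not taken from the study. A synthetic minority sample is generated by picking a minority point, choosing one of its k nearest minority neighbors, and interpolating at a random position between the two. Oversampling is normally applied only to the training portion of a split (here, the 80% part) so that synthetic points do not leak into the test set.

```python
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors from ONE class.
    This is an illustrative sketch, not the implementation used in the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbor = minority[rng.choice(others[:k])]
        # The new point lies at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Toy usage: four minority points on the unit square, six synthetic points.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(toy_minority, n_new=6, k=2)
```

In practice a maintained library implementation, such as the `SMOTE` class in imbalanced-learn, would likely be preferable to a hand-written version like this one.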

Keywords


Data Science; Machine Learning; SDGs; Sentiment Analysis; SMOTE
