Social Media Engineering for Issues Feature Extraction using Categorization Knowledge Modelling and Rule-based Sentiment Analysis

M Tafaquh Fiddin Al Islami - Department of Information and Computer Engineering, Politeknik Elektronika Negeri Surabaya, Surabaya, Indonesia
Ali Ridho Barakbah - Department of Information and Computer Engineering, Politeknik Elektronika Negeri Surabaya, Surabaya, Indonesia
Tri Harsono - Department of Information and Computer Engineering, Politeknik Elektronika Negeri Surabaya, Surabaya, Indonesia

A company maintains and improves the quality of its services by paying attention to reviews and complaints from users. These complaints are commonly written in natural language, so their messages are computationally difficult to extract and process. To overcome this difficulty, this study presents a new system for issues feature extraction from users’ reviews and complaints in social media data. The system consists of four main functions: (1) Data Crawling and Preprocessing, (2) Categorization Knowledge Modelling, (3) Rule-based Sentiment Analysis, and (4) Application Environment. Data Crawling and Preprocessing acquires users’ tweets from social media by crawling and then preprocesses the data. Categorization Knowledge Modelling provides text mining of textual data, vector space transformation to create knowledge metadata, context recognition of keyword queries against the knowledge metadata, and similarity measurement for categorization. For Rule-based Sentiment Analysis, we developed our own computational linguistics rules to measure sentiment polarity. Application Environment consists of three layers: database management, back-end services, and front-end services. To assess the applicability of the proposed system, we conducted two experimental studies: (1) categorization performance and (2) sentiment analysis performance. For categorization performance, we used 8,743 tweets and achieved 82% accuracy. For sentiment analysis performance, we ran experiments on 217 tweets and achieved 92% accuracy.
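As a rough illustration of functions (2) and (3) above, the following Python sketch categorizes a short text by cosine similarity over TF-IDF vectors and scores its polarity with a simple negation rule. It is not the authors' implementation: the category descriptions, the lexicon, the negation list, the English vocabulary, and all function names are assumptions made for this example, and the context recognition step is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical category descriptions standing in for the "knowledge metadata".
# The paper's actual categories and vocabulary are not given in the abstract.
CATEGORY_DOCS = {
    "network": "slow internet connection signal lost network down",
    "billing": "bill charge payment invoice refund price",
}

# Hypothetical polarity lexicon and negation words (assumptions for this sketch).
LEXICON = {"good": 1, "fast": 1, "great": 1, "bad": -1, "slow": -1, "lost": -1}
NEGATIONS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_tfidf(docs):
    """Turn each category description into a sparse TF-IDF vector."""
    tokenized = {name: tokenize(doc) for name, doc in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized.values() for t in set(toks))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = {
        name: {t: count * idf[t] for t, count in Counter(toks).items()}
        for name, toks in tokenized.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(text, vectors, idf):
    """Return the most similar category and the score for every category."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    query = {t: count * idf[t] for t, count in counts.items()}
    scores = {name: cosine(query, vec) for name, vec in vectors.items()}
    return max(scores, key=scores.get), scores


def polarity(text):
    """Sum lexicon scores; a negation directly before a word flips its sign."""
    tokens = tokenize(text)
    score = 0
    for i, tok in enumerate(tokens):
        value = LEXICON.get(tok, 0)
        if value and i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


vectors, idf = build_tfidf(CATEGORY_DOCS)
label, scores = categorize("My internet connection is slow", vectors, idf)
sentiment = polarity("The connection is not good")
```

A production system for Indonesian tweets would need a language-appropriate tokenizer, lexicon, and rule set; the sketch only shows how the similarity and polarity steps fit together.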


Issues feature extraction; categorization knowledge modelling; context recognition; rule-based sentiment analysis.
