Social Media Engineering for Issues Feature Extraction using Categorization Knowledge Modelling and Rule-based Sentiment Analysis

M Tafaquh Fiddin Al Islami - Department of Information and Computer Engineering, Politeknik Elektronika Negeri Surabaya, Surabaya, Indonesia
Ali Ridho Barakbah - Department of Information and Computer Engineering, Politeknik Elektronika Negeri Surabaya, Surabaya, Indonesia
Tri Harsono - Department of Information and Computer Engineering, Politeknik Elektronika Negeri Surabaya, Surabaya, Indonesia


A company maintains and improves the quality of its services by paying attention to reviews and complaints from users. These complaints are commonly written in natural language, which makes their messages computationally difficult to extract and process. To overcome this difficulty, this study presents a new system for issues feature extraction from users’ reviews and complaints in social media data. The system consists of four main functions: (1) Data Crawling and Preprocessing, (2) Categorization Knowledge Modelling, (3) Rule-based Sentiment Analysis, and (4) Application Environment. Data Crawling and Preprocessing acquires users’ tweets from social media by crawling and then preprocesses the data. Categorization Knowledge Modelling provides text mining of textual data, vector space transformation to create knowledge metadata, context recognition of keyword queries against the knowledge metadata, and similarity measurement for categorization. For Rule-based Sentiment Analysis, we developed our own computational linguistics rules to measure sentiment polarity. Application Environment consists of three layers: database management, back-end services, and front-end services. To assess the applicability of the proposed system, we conducted two experimental studies: (1) categorization performance and (2) sentiment analysis performance. For categorization performance, we used 8743 tweets and achieved 82% accuracy. For sentiment analysis performance, we ran experiments on 217 tweets and achieved 92% accuracy.
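The categorization and sentiment steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the category descriptions, the sentiment lexicon, the negation rule, and the function names `categorize` and `polarity` are all hypothetical, and plain term-frequency vectors with cosine similarity stand in for the paper's vector space transformation and similarity measurement.

```python
import math
import re
from collections import Counter

# Hypothetical category descriptions standing in for the knowledge metadata.
CATEGORY_DOCS = {
    "network": "internet connection slow signal network down",
    "billing": "bill payment charge invoice refund",
}

# Hypothetical lexicon and negators; the paper's actual rules are not given here.
POSITIVE = {"good", "fast", "great"}
NEGATIVE = {"slow", "bad", "down"}
NEGATORS = {"not", "never"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(text):
    """Return the category whose description is most similar to the text."""
    vec = Counter(tokenize(text))
    scores = {
        name: cosine(vec, Counter(tokenize(doc)))
        for name, doc in CATEGORY_DOCS.items()
    }
    return max(scores, key=scores.get)


def polarity(text):
    """Sum lexicon scores; a negator flips the next sentiment word found."""
    score = 0
    negate = False
    for tok in tokenize(text):
        if tok in NEGATORS:
            negate = True
            continue
        value = (tok in POSITIVE) - (tok in NEGATIVE)
        if value:
            score += -value if negate else value
            negate = False
    return score
```

For example, `categorize("my internet connection is slow")` selects the `network` category because three of its description terms overlap with the tweet, and `polarity` gives that tweet a negative score. A real system would replace the toy dictionaries with knowledge metadata built from crawled tweets.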


Issues feature extraction; categorization knowledge modelling; context recognition; rule-based sentiment analysis.



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

JOIV : International Journal on Informatics Visualization
ISSN 2549-9610  (print) | 2549-9904 (online)
Organized by Department of Information Technology - Politeknik Negeri Padang, and Institute of Visual Informatics - UKM and Soft Computing and Data Mining Centre - UTHM
