Estimation of Danger Signs in Regional Complaint Data

Yao Lin; Tsunenori Mine; Kohei Yamaguchi; Sachio Hirokawa

doi:10.30630/joiv.2.4-2.177

Yao Lin - Kyushu University, Fukuoka, Japan
Tsunenori Mine - Kyushu University, Fukuoka, Japan
Kohei Yamaguchi - Kyushu University, Fukuoka, Japan
Sachio Hirokawa - Kyushu University, Fukuoka, Japan

Abstract

Government 2.0 activities have become increasingly popular. Using the platforms that support these activities, anyone can report issues in a city on the Web at any time and share the reports with other people. Because a wide variety of reports are posted, officials in the city management section have to prioritize them. However, judging the importance of a report is not an easy task: judgments vary from official to official, and consequently the agreement rate is low. To remedy this low agreement rate in human judgment, an intelligent agent is needed that helps find high-priority reports. Hirokawa et al. employed a Support Vector Machine (SVM) with a word feature selection method (SVM+FS) to detect signs of danger in posted reports, because signs of danger are among the high-priority issues to be dealt with. However, they did not compare SVM+FS with other conventional machine learning methods, so it is not clear whether SVM+FS outperforms them. This paper explores methods for detecting signs of danger through comprehensive experiments, with the aim of developing an intelligent agent that supports officials in city management sections. We explore conventional machine learning methods (SVM, Random Forest, and Naïve Bayes) using conventional word vectors, an LDA-based document vector, and word embeddings from Word2Vec, and compare the best of these with SVM+FS. Experimental results illustrate the superiority of SVM+FS and highlight the importance of using multiple data sets when evaluating methods for detecting signs of danger.
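
As a rough illustration of one of the baseline families named in the abstract, the sketch below fits a multinomial Naïve Bayes classifier on bag-of-words counts. It is not the authors' implementation: the toy reports, the labels (1 = sign of danger, 0 = other), the whitespace tokenizer, and the function names train_nb and predict are all assumptions made for this example, and it does not reproduce the SVM+FS method.

```python
import math
from collections import Counter

# Hypothetical toy reports; the paper's actual complaint data is not used here.
# Label 1 = report contains a sign of danger, label 0 = other report.
TRAIN = [
    ("the fence is broken and children may fall", 1),
    ("a street light is broken and the road is dark", 1),
    ("the manhole cover is missing and dangerous", 1),
    ("the park bench needs new paint", 0),
    ("please add more flowers to the park", 0),
    ("the grass in the park is too long", 0),
]


def tokenize(text):
    """Split on whitespace (an assumption; real reports need proper tokenization)."""
    return text.lower().split()


def train_nb(examples, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha smoothing."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    total_docs = sum(doc_counts.values())
    model = {"vocab": vocab, "prior": {}, "loglik": {}}
    for label in (0, 1):
        model["prior"][label] = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["loglik"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, text):
    """Return the label with the highest log posterior; unseen words are skipped."""
    scores = {}
    for label in (0, 1):
        score = model["prior"][label]
        for w in tokenize(text):
            if w in model["vocab"]:
                score += model["loglik"][label][w]
        scores[label] = score
    return max(scores, key=scores.get)


model = train_nb(TRAIN)
print(predict(model, "the swing is broken and dangerous"))
```

With six training sentences the output says nothing about real-world accuracy; the sketch only makes concrete how a word-count classifier turns a report into a danger or no-danger label.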

Keywords

complaint report; signs of danger detection; government 2.0; machine learning.

References

Adachi, Y., Onimura, N., Yamashita, T., Hirokawa, S.: “Standard measure and SVM measure for feature selection and their performance effect for text classification.” Proc. of the 18th International Conference on Information Integration and Web-based Applications and Services, pp. 262--266, 2016

Blei, D.M., Ng, A.Y., Jordan, M.I.: “Latent Dirichlet allocation.” Journal of Machine Learning Research 3(Jan), pp. 993--1022, 2003

Breiman, L.: “Random forests.” Machine Learning 45(1), pp. 5--32, 2001

Cresci, S., Cimino, A., Dell’Orletta, F., Tesconi, M.: “Crisis mapping during natural disasters via text analysis of social media messages.” Proc. of International Conference on Web Information Systems Engineering, pp. 250--258, 2015

Hirokawa, S., Suzuki, T., Mine, T.: “Machine learning is better than human to satisfy decision by majority.” Proc. of Web Intelligence 2017, pp. 694--701, 2017

Imran, M., Castillo, C., Diaz, F., Vieweg, S.: “Processing social media messages in mass emergency: A survey.” ACM Computing Surveys (CSUR) 47(4), 67, 2015

Imran, M., Elbassuoni, S.M., Castillo, C., Diaz, F., Meier, P.: “Extracting information nuggets from disaster-related messages in social media.” Proc. of the 10th International Conference on Information Systems for Crisis Response and Management, 2013

Joachims, T.: “Text categorization with support vector machines: Learning with many relevant features.” Machine Learning: ECML98, pp. 137--142, 1998

Joachims, T.: “Learning to Classify Text Using Support Vector Machines.” The Kluwer International Series in Engineering and Computer Science, Springer US, 2002

McCallum, A., Nigam, K., et al.: “A comparison of event models for naive Bayes text classification.” Proc. of AAAI-98 Workshop on Learning for Text Categorization, vol. 752, pp. 41--48, 1998

Mikolov, T.: “Statistical language models based on neural networks.” Presentation at Google, Mountain View, 2nd April, 2012

Sakai, T., Hirokawa, S.: “Feature words that classify problem sentence in scientific article.” Proc. of the 14th International Conference on Information Integration and Web-based Applications & Services, pp. 360--367, 2012

Sano, Y., Yamaguchi, K., Mine, T.: “Automatic classification of complaint reports about city park.” Information Engineering Express 1(4), pp. 119--130, 2015, [Online] Available: http://www.iaiai.org/journals/index.php/IEE/article/view/35