An Improved Hybrid GRU and CNN Model for News Text Classification

Inteasar Khudhair - University of Diyala, Diyala, Iraq
Sundus Majeed - University of Baghdad, Baghdad, Iraq
Ali Ahmed - University of Diyala, Diyala, Iraq
Mokhalad Kadhim Alsaeedi - University of Diyala, Diyala, Iraq
Firas Aswad - University of Diyala, Diyala, Iraq


DOI: http://dx.doi.org/10.62527/joiv.9.1.2658

Abstract


Due to the continuous growth and advancement of technology, an enormous volume of text data is generated daily across sources including social media platforms, websites, search engines, healthcare records, and news articles. Extracting meaningful patterns from such data, such as viewpoints, related theories, journal distribution, facts, and the development of online news text, is challenging because of the varying lengths of the texts. One issue arises from the length of the text data itself; another lies in extracting valuable features, especially from news articles. Among deep learning models, convolutional neural networks (CNNs) can capture local features in text data but cannot capture the structural information or semantic relationships between words. Consequently, a CNN alone often performs poorly on text classification tasks, whereas the gated recurrent unit (GRU) effectively extracts semantic information and models the global structural relationships present in textual data. This paper addresses the problem by introducing a new text classification model that integrates the strengths of CNN and GRU. The proposed hybrid model incorporates word vectorization and word dispersion in parallel. The model first trains word vectors with the Word2vec model and then uses the GRU to capture semantic information from text sentences. A CNN is subsequently employed to extract the most salient semantic features, and a softmax layer performs the final classification. Experimental findings demonstrated that the proposed hybrid GRU_CNN model outperformed individual CNN, LSTM, and GRU models in classification effectiveness, achieving an accuracy of 97.73%.
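The pipeline described above (embedding lookup, GRU over the token sequence, 1-D convolution with max-pooling over the GRU outputs, then softmax) can be sketched as a minimal forward pass. This is an illustrative toy with random weights, not the authors' implementation; all dimensions and parameter names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions chosen only for illustration.
vocab, embed_dim, hidden, n_classes = 100, 16, 8, 4
seq_len, kernel, n_filters = 10, 3, 6

# Word2vec-style embedding table (random stand-in for trained vectors).
E = rng.normal(size=(vocab, embed_dim))

# GRU cell parameters: update gate z, reset gate r, candidate state h~.
Wz, Uz = rng.normal(size=(embed_dim, hidden)), rng.normal(size=(hidden, hidden))
Wr, Ur = rng.normal(size=(embed_dim, hidden)), rng.normal(size=(hidden, hidden))
Wh, Uh = rng.normal(size=(embed_dim, hidden)), rng.normal(size=(hidden, hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_forward(tokens):
    """Run a single-layer GRU over the embedded token sequence."""
    h = np.zeros(hidden)
    outputs = []
    for t in tokens:
        x = E[t]
        z = sigmoid(x @ Wz + h @ Uz)               # update gate
        r = sigmoid(x @ Wr + h @ Ur)               # reset gate
        h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
        h = (1 - z) * h + z * h_tilde
        outputs.append(h)
    return np.stack(outputs)                        # (seq_len, hidden)

# 1-D convolution filters applied along time over the GRU outputs.
F = rng.normal(size=(n_filters, kernel, hidden))

def cnn_pool(H):
    """Convolve each filter over time, then global max-pool."""
    feats = []
    for f in F:
        conv = [np.sum(H[i:i + kernel] * f) for i in range(len(H) - kernel + 1)]
        feats.append(max(conv))
    return np.array(feats)                          # (n_filters,)

# Softmax classification layer.
W_out = rng.normal(size=(n_filters, n_classes))

def classify(tokens):
    logits = cnn_pool(gru_forward(tokens)) @ W_out
    p = np.exp(logits - logits.max())               # numerically stable softmax
    return p / p.sum()

probs = classify(rng.integers(0, vocab, size=seq_len))
print(probs.shape)  # one probability per news category
```

In the paper's actual model the weights would be learned end-to-end by backpropagation; the sketch only shows how the GRU's global semantic representation feeds the CNN's local feature extraction before the softmax decision.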


Keywords


GRU, CNN, Modelling Mechanism, Word Embedding, News Text Classification

