Verification of a Dataset for Korean Machine Reading Comprehension with Numerical Discrete Reasoning over Paragraphs

Gyeongmin Kim - Korea University, Seoul 02841, Republic of Korea
Jaechoon Jo - Hanshin University, Osan 18101, Republic of Korea


DOI: http://dx.doi.org/10.30630/joiv.6.2-2.1120

Abstract


Numerical reasoning in machine reading comprehension (MRC) has achieved significant performance improvements over the past few years. However, because most of this work targets a small set of languages, low-resource languages are rarely considered, and MRC studies on them remain limited. In addition, methods that only extract answers from existing spans within a paragraph struggle with questions that require actual reasoning. To overcome these shortcomings, this study constructs a dataset for training Korean question answering (QA) models that not only answer from within passage spans but also perform numerical reasoning over passages and questions, and verifies its efficacy by training a model on it. We recruited eight annotators to tag the ground-truth labels, and they annotated 920, 115, and 115 passages for the train, dev, and test sets, respectively. To reduce the possibility of human inaccuracy and error in the data construction process, we built a simple yet effective automatic inter-annotation verification tool based on the publicly available KoBERT and KoELECTRA models. We defined four general conditions and six conditions that human annotators must inspect, and fine-tuned pre-trained language models with a numerically aware architecture. KoELECTRA and NumNet+ with KoELECTRA were fine-tuned under identical hyperparameter settings, and the experiments showed that NumNet+ with KoELECTRA outperformed the other models by more than 1.3 points. Our research contributes to Korean MRC research and offers potential directions and insight for MRC models capable of numerical reasoning.
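
To make the modeling setup above concrete, the following is a minimal sketch of fine-tuning a KoELECTRA encoder for span-extraction QA with the Hugging Face Transformers library; it is an illustration only, not the authors' released code. The checkpoint name, example text, and gold span indices are assumptions, and the numerically aware reasoning module that NumNet+ places on top of the encoder is omitted.

# Minimal sketch (illustration, not the authors' code): fine-tuning a
# KoELECTRA encoder for span-extraction QA with Hugging Face Transformers.
import torch
from transformers import ElectraTokenizerFast, ElectraForQuestionAnswering

MODEL_NAME = "monologg/koelectra-base-v3-discriminator"  # assumed public checkpoint
tokenizer = ElectraTokenizerFast.from_pretrained(MODEL_NAME)
model = ElectraForQuestionAnswering.from_pretrained(MODEL_NAME)  # QA head is newly initialized

passage = "The team scored 21 points in the first half and 13 points in the second half."
question = "How many points were scored in the first half?"

# Encode the question/passage pair as a single input sequence.
inputs = tokenizer(question, passage, return_tensors="pt", truncation=True)

# One supervised step against gold start/end token positions
# (dummy indices here; a real pipeline maps character spans to token spans).
start_positions = torch.tensor([13])
end_positions = torch.tensor([13])
outputs = model(**inputs, start_positions=start_positions, end_positions=end_positions)
outputs.loss.backward()  # span-extraction cross-entropy loss; an optimizer step would follow

A question such as "How many points were scored in total?" cannot be answered by this span head alone; closing that gap is the role of the numerically aware architecture (e.g., the reasoning module over numbers in the passage and question that NumNet+ adds).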

Keywords


Machine reading comprehension; numerical reasoning; language model; ELECTRA; low-resource language.


References


C. A. Perfetti, N. Landi, and J. Oakhill (2005). The Acquisition of Reading Comprehension Skill. In M. J. Snowling and C. Hulme (Eds.), The Science of Reading: A Handbook (pp. 227–247). Blackwell Publishing.

N. K. Duke and P. D. Pearson. "Effective practices for developing reading comprehension." Journal of Education 189.1-2 (2009): 107-122.

P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2383–2392, Austin, Texas. Association for Computational Linguistics.

S. Lim, M. Kim, and J. Lee. (2018). KorQuAD: Korean QA dataset for machine comprehension. In Proceedings of the Conference of the Korea Information Science Society (pp. 539-541).

J. Devlin, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

C. Raffel, et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." J. Mach. Learn. Res. 21.140 (2020): 1-67.

A. Vaswani, et al. "Attention is all you need." Advances in neural information processing systems 30 (2017).

G. Kim, et al. "AI Student: A Machine Reading Comprehension System for the Korean College Scholastic Ability Test." Mathematics 10.9 (2022): 1486.

G. Kim, et al. "Automatic extraction of named entities of cyber threats using a deep Bi-LSTM-CRF network." International journal of machine learning and cybernetics 11.10 (2020): 2341-2355.

A. Radford, et al. "Language models are unsupervised multitask learners." OpenAI blog 1.8 (2019): 9.

G. Kim, et al. "Enhancing Korean Named Entity Recognition With Linguistic Tokenization Strategies." IEEE Access 9 (2021): 151814-151823.

D. Dua, Y. Wang, P. Dasigi, G. Stanovsky, S. Singh, and M. Gardner. 2019. DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2368–2378, Minneapolis, Minnesota. Association for Computational Linguistics.

Q. Ran, Y. Lin, P. Li, J. Zhou, and Z. Liu. 2019. NumNet: Machine Reading Comprehension with Numerical Reasoning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2474–2484, Hong Kong, China. Association for Computational Linguistics.

A. W. Yu, et al. "QANet: Combining local convolution with global self-attention for reading comprehension." arXiv preprint arXiv:1804.09541 (2018).

M. Hu, Y. Peng, Z. Huang, and D. Li. 2019. A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1596–1606, Hong Kong, China. Association for Computational Linguistics.

K. Chen, et al. 2020. Question Directed Graph Attention Network for Numerical Reasoning over Text. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6759–6768, Online. Association for Computational Linguistics.

A. Saha, S. Joty, and S. C. Hoi. (2021). Weakly supervised neuro-symbolic module networks for numerical reasoning. arXiv preprint arXiv:2101.11802.

S. Reddy, D. Chen, and C. D. Manning. (2019). CoQA: A conversational question answering challenge. Transactions of the Association for Computational Linguistics, 7, 249-266.

Z. Yang, P. Qi, S. Zhang, Y. Bengio, W. Cohen, R. Salakhutdinov, and C. D. Manning. 2018. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2369–2380, Brussels, Belgium. Association for Computational Linguistics.

S. Zhang, et al. (2018). ReCoRD: Bridging the gap between human and machine commonsense reading comprehension. arXiv preprint arXiv:1810.12885.

J. Welbl, P. Stenetorp, and S. Riedel. 2018. Constructing Datasets for Multi-hop Reading Comprehension Across Documents. Transactions of the Association for Computational Linguistics, 6:287–302.

A. Magueresse, V. Carles, and E. Heetderks. "Low-resource languages: A review of past work and future challenges." arXiv preprint arXiv:2006.07264 (2020).

G. Kim, et al. "Reading Comprehension requiring Discrete Reasoning Over Paragraphs for Korean." Annual Conference on Human and Language Technology. Human and Language Technology, 2021.

C. Kevin, et al. "Electra: Pre-training text encoders as discriminators rather than generators." arXiv preprint arXiv:2003.10555 (2020).

J. Park. "KoELECTRA: Pretrained ELECTRA Model for Korean." GitHub, 2020. https://github.com/monologg/KoELECTRA