A Framework for Malay Computational Grammar Formalism based-on Enhanced Pola Grammar

Hassan Mohamed - National Defence University of Malaysia, Kuala Lumpur, Malaysia
Nur Aisyah Abdul Fataf - National Defence University of Malaysia, Kuala Lumpur, Malaysia
Tengku Mohd Tengku Sembok - National Defence University of Malaysia, Kuala Lumpur, Malaysia


DOI: http://dx.doi.org/10.30630/joiv.7.2.1172

Abstract


In the era of the Fourth Industrial Revolution (IR4.0), Natural Language Processing (NLP) is a major focus because information is increasingly stored as digital text. For this information to be manipulated digitally, natural language understanding requires a computational grammar covering the syntax and semantics of the language in question. Many languages have their own computational grammars for processing syntax and semantics. For the Malay language, however, we have yet to come across a substantial computational grammar that can process Malay syntax and semantics, is grounded in a computational theoretical framework, and can be applied in systems such as e-commerce. Hence, we propose a formalism framework based on an enhanced Pola Grammar with syntactic and semantic features. The objectives of the proposed framework are to create a computational linguistic formalism for the Malay language grounded in theoretical linguistics; to implement templates for Malay words that handle syntactic and semantic features in accordance with the enhanced Pola Grammar; and to create a Malay language parser algorithm that can be used in digital applications. To accomplish these objectives, the proposed framework will iteratively formalise the computational Malay grammar and lexicon using a combination of solid theoretical linguistic foundations such as Dependency Grammar. A Malay parsing algorithm will be developed for the proposed model, and the formalisation will be repeated until the grammar is deemed reliable. The resulting indigenous Malay parser is expected to help advance Malay language applications in the digital economy.


Keywords


Malay Language; Malay Parser; Dependency Grammar

