Comparison of Adam Optimization and RMSprop in Minangkabau-Indonesian Bidirectional Translation with Neural Machine Translation

Fadhli Ahda - Universitas Negeri Malang, Jl. Semarang No.5, Malang, Indonesia
Aji Prasetya Wibawa - Institut Teknologi dan Bisnis Asia Malang, Malang, Indonesia
Didik Dwi Prasetya - Institut Teknologi dan Bisnis Asia Malang, Malang, Indonesia
Danang Arbian Sulistyo - Universitas Negeri Malang, Jl. Semarang No.5, Malang, Indonesia


DOI: http://dx.doi.org/10.62527/joiv.8.1.1818

Abstract


Language is the tool humans use to communicate, yet no single language is shared by everyone: each region or nation has its own. Indonesia, the fourth most populous country in the world, has a great diversity of languages. It has nearly 800 recorded regional languages, but natural language processing research on them is still lacking. Minangkabau is an endangered language spoken by the Minangkabau people in Indonesia's West Sumatra province. According to UNESCO, the Minangkabau language is listed as "definitely endangered," with only around 5 million speakers worldwide. Motivated by this, the study applies neural machine translation (NMT) to bidirectional Minangkabau-Indonesian translation. In contrast to conventional statistical machine translation, NMT aims to build a single neural network that can be jointly tuned to maximize translation performance. Long short-term memory (LSTM) is one of the most powerful machine-learning techniques for translation because it can retain information over long sequences, capture complex relationships in the data, and preserve the information that matters most for the translation outcome. The NMT models are evaluated with the BLEU score. Tests on 520 Minangkabau sentences, with the number of epochs varied from 100 to 1000, show that optimization with Adam outperforms optimization with RMSprop, as evidenced by the best BLEU-1 score of 0.997816, obtained at 1000 epochs.
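As an illustration of the evaluation metric named above, the following is a minimal sketch of a sentence-level BLEU-1 score with a single reference: clipped unigram precision multiplied by a brevity penalty. The function name `bleu1`, the whitespace tokenization, and the placeholder tokens are assumptions made for this example; the abstract does not state which BLEU implementation or tokenizer the study used, and reported scores are often computed at corpus level rather than per sentence.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1 against one reference (illustrative sketch only)."""
    cand = candidate.split()  # assumption: plain whitespace tokenization
    ref = reference.split()
    if not cand:
        return 0.0

    cand_counts = Counter(cand)
    ref_counts = Counter(ref)

    # Clipped unigram matches: a candidate word is credited at most as many
    # times as it appears in the reference.
    overlap = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = overlap / len(cand)

    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    return bp * precision


if __name__ == "__main__":
    # Placeholder tokens, not sentences from the study's dataset.
    print(bleu1("w1 w2 w3 w4", "w1 w2 w3 w4"))  # exact match
    print(bleu1("w1 w1 w1", "w1 w2 w3"))        # repeated word is clipped
    print(bleu1("w1 w2", "w1 w2 w3 w4"))        # short output is penalized
```

Under this definition a score close to 1, such as the 0.997816 reported above, means that nearly every word in the system output also appears in the reference and that the output is not shorter than the reference. BLEU-1 does not measure word order.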

Keywords


LSTM, Machine Translation, Optimization, BLEU
