Handwritten Character Recognition using Deep Learning Algorithm with Machine Learning Classifier

Muhamad Arief Liman - Bina Nusantara University, Jakarta, Indonesia
Antonio Josef - Bina Nusantara University, Jakarta, Indonesia
Gede Putra Kusuma - Bina Nusantara University, Jakarta, Indonesia


DOI: http://dx.doi.org/10.62527/joiv.8.1.1707

Abstract


Handwritten character recognition has been studied extensively for many mainstream languages, and handwritten letter recognition in particular has achieved promising results. Several studies have applied deep learning models to push accuracy further. In this paper, the authors conducted experiments on the EMNIST Letters dataset with two deep learning models: WaveMix-Lite and CoAtNet. WaveMix-Lite uses a level-1 two-dimensional discrete wavelet transform (2D-DWT) to reduce the number of parameters and speed up the runtime. CoAtNet combines a CNN with a Vision Transformer, splitting the input image into fixed-size patches. The feature-extraction part of each model is used to embed the input image into a feature vector: the authors hooked the output of the global average pooling layer of each trained model on the EMNIST Letters data. The hooked features were then used to train machine learning classifiers, namely SVM, Random Forest, and XGBoost. The experiments show that the best machine learning classifier is Random Forest, with 96.03% accuracy on WaveMix-Lite features and 97.90% accuracy on CoAtNet features. These results showcase the benefit of using a machine learning model to classify image features extracted by a deep learning model.
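As a rough illustration of this pipeline, the sketch below (not the authors' code) attaches a PyTorch forward hook to the global-average-pool stage of a backbone, collects the pooled feature vectors over EMNIST Letters, and fits a scikit-learn Random Forest on them. The small CNN backbone and the hyperparameters are hypothetical stand-ins for the trained WaveMix-Lite / CoAtNet models described in the paper.

```python
# Minimal sketch: hook a backbone's global-average-pool output and train a
# Random Forest on the extracted features. The tiny CNN below is only a
# stand-in for the trained WaveMix-Lite / CoAtNet backbones.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from sklearn.ensemble import RandomForestClassifier

backbone = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),   # global average pool -> (B, 64, 1, 1)
    nn.Flatten(),              # feature vector      -> (B, 64)
    nn.Linear(64, 26),         # 26 EMNIST Letters classes
)

feats, labels = [], []

def hook(module, inputs, output):
    # Capture the flattened GAP features before the classification head.
    feats.append(output.detach().cpu())

handle = backbone[5].register_forward_hook(hook)  # the Flatten module

ds = datasets.EMNIST("data", split="letters", train=True, download=True,
                     transform=transforms.ToTensor())
loader = DataLoader(ds, batch_size=256)

backbone.eval()
with torch.no_grad():
    for x, y in loader:
        backbone(x)        # the hook stores the features as a side effect
        labels.append(y)
handle.remove()

X = torch.cat(feats).numpy()
y = torch.cat(labels).numpy()
clf = RandomForestClassifier(n_estimators=300, n_jobs=-1).fit(X, y)
```

In the paper's setting, the backbone would first be trained end to end on EMNIST Letters before its GAP features are harvested, and the same extraction loop would be repeated on the test split to evaluate the classifier.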

Keywords


Handwritten Character Recognition; Deep Learning; Image Embedding; Machine Learning Models; Ensemble Model

