Closer Look at Image Classification for Indonesian Sign Language with Few-Shot Learning Using Matching Network Approach

Irma Permata Sari - Universitas Negeri Jakarta, East Jakarta, 13220, Indonesia



DOI: http://dx.doi.org/10.30630/joiv.7.3.1320

Abstract


Large datasets are important for building powerful pipelines that generalize well to new images. Image classification is the most fundamental problem in Computer Vision, and classifying images can be a tedious job when the volume is large. CNNs, however, are known to be data-hungry, and gathering data is costly. How can we build models without much data? Sign Language Recognition (SLR) is one such case: one type of SLR system is vision-based, and the available Indonesian Sign Language dataset contains relatively few sample images. This research aims to classify sign language images using Computer Vision for a Sign Language Recognition system. We used a small Indonesian Sign Language dataset covering the 26 alphabet classes, A-Z, with 12 images loaded per class. The methodology in this research is few-shot learning. In our experiments, the best few-shot accuracy was achieved by the Mnasnet1_0 convolutional backbone for Matching Networks (85.75%), with an estimated loss of about 0.43, and the results indicate that accuracy increases with the number of shots. The Matching Network framework proved unsuitable for the Inception V3 model because the kernel size cannot be greater than the actual input size. Based on this research, we can choose the best algorithm for the Indonesian Sign Language application we will develop further.
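The Matching Network classification rule the abstract relies on can be sketched in a few lines: each query image is classified by a softmax-weighted attention over the cosine similarities to the support-set embeddings. The sketch below assumes embeddings have already been produced by a CNN backbone (e.g. Mnasnet1_0, as in the paper); the function name and shapes are illustrative, not from the paper's code.

```python
import numpy as np

def matching_network_predict(support_emb, support_labels, query_emb, n_classes):
    """Matching Network inference step (Vinyals et al., 2016 style).

    support_emb   : (n_support, d) embeddings of the labeled support set
    support_labels: (n_support,)   integer class labels of the support set
    query_emb     : (n_query, d)   embeddings of the unlabeled queries
    Returns predicted class index per query.
    """
    # L2-normalize so the dot product becomes cosine similarity
    s = support_emb / np.linalg.norm(support_emb, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    sims = q @ s.T  # (n_query, n_support) cosine similarities

    # Softmax attention over the support examples for each query
    attn = np.exp(sims) / np.exp(sims).sum(axis=1, keepdims=True)

    # Accumulate attention mass per class (soft nearest-neighbor vote)
    onehot = np.eye(n_classes)[support_labels]  # (n_support, n_classes)
    class_probs = attn @ onehot                 # (n_query, n_classes)
    return class_probs.argmax(axis=1)
```

In an N-way K-shot episode on the alphabet dataset described above, `n_classes` would be up to 26 and the support set would hold K of the 12 images per class; increasing K gives each class more attention mass to draw on, which is consistent with the reported trend that accuracy rises with the number of shots.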


Keywords


Few-Shot Learning; Matching Network; Sign Language Recognition; Computer Vision

