Optimizing Hand Gesture Recognition Using CNN Model Supported by Raspberry Pi for Self-Service Technology

Abdul Rangkuti - Bina Nusantara University, Jakarta, Indonesia
Varyl Athalaa - Bina Nusantara University, Jakarta, Indonesia
Farrel Indallah - Bina Nusantara University, Jakarta, Indonesia
Fajar Febriansyah - Bina Nusantara University, Jakarta, Indonesia

DOI: http://dx.doi.org/10.30630/joiv.7.1.1032


This study describes the optimization of hand gesture recognition on a Raspberry Pi 4. As technology has advanced over the past years, some computers are now able to handle much more complex problems, such as real-time object detection. Small devices, however, require optimization to run in real time with acceptable latency and only a small cost in accuracy. Low latency is a requirement for most technology, especially when real-time object detection serves as the input to Self-Service Technology running on a Raspberry Pi in a store. The training dataset consisted of 288 pictures covering six types of hand gestures, chosen as the command inputs configured in the Self-Service Technology. The experiment used five CNN object detection models: YOLOv3-Tiny-PRN, YOLOv4-Tiny, MobileNetV2-Yolov3-NANO, YOLO-Fastest-1.1, and YOLO-Fastest-1.1-XL. After optimization, both the FPS and inference time metrics improved: the average FPS rose by 3 FPS and the average inference time fell by 119,260 ms. This improvement comes with a reduction in overall accuracy. Precision, Recall, and F1-Score decreased, as did IoU for some models; only YOLO-Fastest-1.1-XL improved its IoU, by about 0.58%. Further improvements to the CNN and the dataset might raise performance even more without sacrificing too much accuracy, but that is most likely a subject for future research continuing this topic.
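The metrics named in the abstract (IoU, Precision, Recall, F1-Score, FPS, and inference time) can be illustrated with a minimal Python sketch. It uses the standard textbook definitions and is not the authors' evaluation code: the function names, the `(x1, y1, x2, y2)` box format, and the assumption that FPS is simply the reciprocal of per-frame inference time (ignoring capture and display overhead) are all illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2).

    The corner-based box format is an assumption made for this sketch.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, Recall, and F1-Score from detection counts.

    tp, fp, fn are the numbers of true positives, false positives,
    and false negatives (for example, at a chosen IoU threshold).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def fps_from_inference_ms(inference_ms):
    """Upper bound on FPS if inference were the only per-frame cost."""
    return 1000.0 / inference_ms


if __name__ == "__main__":
    # Hypothetical values, not taken from the study.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(precision_recall_f1(tp=8, fp=2, fn=2))
    print(fps_from_inference_ms(250.0))
```

Because FPS and inference time are reciprocally related in this simplified view, a drop in per-frame inference time shows up directly as a gain in FPS, which is the trade-off against accuracy that the abstract reports.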


Darknet; Hand Gesture; OpenCV; Object Detection; YOLO; Raspberry Pi
