Implementation of Convolutional Neural Network and Long Short-Term Memory Algorithms in Human Activity Recognition Based on Visual Processing Video

Andi Nur Rachman; Husni Mubarok; Euis Nur Fitriani Dewi; Rama Edwinda Putra

doi:10.30630/joiv.7.2.1504

Andi Rachman - Universitas Siliwangi Tasikmalaya, Indonesia
Husni Mubarok - Universitas Siliwangi Tasikmalaya, Indonesia
Euis Nur Fitriani Dewi - Universitas Siliwangi Tasikmalaya, Indonesia
Rama Edwinda Putra - Universitas Siliwangi Tasikmalaya, Indonesia

Abstract

Human Activity Recognition (HAR) is an active research topic, particularly for identifying human movement in video-based security surveillance and for detecting symptoms of an illness from a movement. In this research, HAR is the key to better understanding the semantics contained in a video and to finding patterns in human movement, especially in sports movements. The study combines the CNN and LSTM algorithms, varying the model parameter values for the dropout layer and the batch size, and converts the patterns in the video into image form to produce a HAR model. The convolution layers extract spatial features from each frame. The extracted features are fed to the LSTM layer, which models the temporal sequence of human movement. In this way, the network learns spatiotemporal features directly through end-to-end training, with the aim of producing a robust model. The test data consist of 10 sports activities obtained from related research at the University of Central Florida (UCF). The results show fairly good performance, although some sports activities were still misclassified because their movements are similar. The classification results show a loss value of 0.4 and an accuracy of 0.94. Further research should reduce the loss value, which is still high enough that tests occasionally misclassify sports activities with similar movements.
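
The data flow described in the abstract (per-frame convolution for spatial features, then an LSTM over the frame sequence, then a 10-way classification) can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' model: the single convolution layer, the global average pooling, the layer sizes, the random untrained weights, and all function names (`conv2d_valid`, `frame_features`, `lstm_last_state`, `classify_clip`, `init_params`) are hypothetical choices made for illustration. Dropout and batch size, which the study varies, only affect training and are left out here.

```python
import numpy as np


def conv2d_valid(frame, kernels):
    """Valid 2D cross-correlation of one grayscale frame with K kernels.

    frame: (H, W); kernels: (K, kh, kw) -> output: (K, H-kh+1, W-kw+1).
    """
    # windows has shape (H-kh+1, W-kw+1, kh, kw): one patch per output pixel.
    windows = np.lib.stride_tricks.sliding_window_view(frame, kernels.shape[1:])
    return np.einsum("ijkl,nkl->nij", windows, kernels)


def frame_features(frame, kernels):
    """Spatial features for one frame: conv -> ReLU -> global average pooling."""
    maps = np.maximum(conv2d_valid(frame, kernels), 0.0)
    return maps.mean(axis=(1, 2))  # shape (K,)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_last_state(features, W, U, b):
    """Run one LSTM layer over a (T, K) sequence and return the final hidden state.

    W: (4*Hd, K), U: (4*Hd, Hd), b: (4*Hd,).
    Assumed gate order in the stacked weights: input, forget, candidate, output.
    """
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        i, f, g, o = np.split(W @ x + U @ h + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # update hidden state
    return h


def classify_clip(clip, params):
    """clip: (T, H, W) grayscale frames -> class probabilities of shape (n_classes,)."""
    feats = np.stack([frame_features(fr, params["kernels"]) for fr in clip])
    h = lstm_last_state(feats, params["W"], params["U"], params["b"])
    logits = params["V"] @ h + params["d"]
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


def init_params(n_kernels=8, ksize=3, hidden=16, n_classes=10, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "kernels": rng.normal(0.0, 0.1, (n_kernels, ksize, ksize)),
        "W": rng.normal(0.0, 0.1, (4 * hidden, n_kernels)),
        "U": rng.normal(0.0, 0.1, (4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "V": rng.normal(0.0, 0.1, (n_classes, hidden)),
        "d": np.zeros(n_classes),
    }


if __name__ == "__main__":
    # A synthetic 20-frame, 32x32 clip stands in for a real sports video.
    clip = np.random.default_rng(1).random((20, 32, 32))
    probs = classify_clip(clip, init_params())
    print(probs.shape, probs.sum())
```

Because the weights are random, the output probabilities carry no meaning; the sketch only shows how a clip of frames is reduced to one feature vector per frame and how the LSTM's final hidden state feeds a classifier over the 10 activity classes. In practice such a model would be built and trained with a deep learning framework.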

Keywords

Human Activity Recognition (HAR); Classification; Convolutional Neural Network; Long Short-Term Memory
