Development on Deaf Support Application Based on Daily Sound Classification Using Image-based Deep Learning

Ji-Hee An - Department of Smart Information and Telecommunication Engineering, Sangmyung University, Cheonan, Chungnam, Republic of Korea
Na-Kyoung Koo - Department of Smart Information and Telecommunication Engineering, Sangmyung University, Cheonan, Chungnam, Republic of Korea
Ju-Hye Son - Department of Smart Information and Telecommunication Engineering, Sangmyung University, Cheonan, Chungnam, Republic of Korea
Hye-Min Joo - Department of Smart Information and Telecommunication Engineering, Sangmyung University, Cheonan, Chungnam, Republic of Korea
Seungdo Jeong - Department of Smart Information and Telecommunication Engineering, Sangmyung University, Cheonan, Chungnam, Republic of Korea

According to government statistics, hearing-impaired persons account for 27% of all registered persons with disabilities in Korea. Despite this large number, support for the deaf and hard of hearing in the form of protective devices and daily-living aids remains insufficient. In particular, hearing-impaired people miss much of the information conveyed through sound, which causes inconvenience in daily life. In this paper, we therefore propose a method to relieve this discomfort. The method analyzes sounds that occur frequently in daily life and must be recognized, and notifies the hearing impaired of them through an application and a vibration bracelet. For sound analysis, sounds that commonly occur in daily life were converted into Mel-Spectrograms and learned using deep learning. A sound occurring in practice is recorded through the application and then identified based on the trained model. According to the identification result, distinct predefined alarms and vibration patterns are provided so that the hearing impaired can easily recognize the sound. In experiments on four major sounds occurring in real life, the proposed method achieved an average recognition rate of 85%, and an average classification rate of 80% for mixed sounds. These experiments confirm that the proposed method is applicable to real life. By enabling the hearing impaired to recognize and respond to sounds that are essential in daily life, the proposed method can improve their quality of life.
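The pipeline described above first converts recorded audio into a Mel-Spectrogram image before the deep-learning classifier sees it. The following is a minimal NumPy sketch of that conversion; the frame size, hop length, and number of mel bands are illustrative assumptions, not the settings used by the authors.

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale mapping used for spectrogram filterbanks.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels=64, fmin=0.0, fmax=None):
    # Triangular filters spaced evenly on the mel scale.
    fmax = fmax or sr / 2
    mel_pts = np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):          # rising slope
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):          # falling slope
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mel_spectrogram(y, sr, n_fft=1024, hop=512, n_mels=64):
    # Windowed STFT -> power spectrum -> mel filterbank -> dB scale.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(np.maximum(mel, 1e-10))

# Example: one second of a 440 Hz tone at 16 kHz yields a
# (frames x mel-bands) image suitable as classifier input.
sr = 16000
t = np.arange(sr) / sr
S = mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr)
```

In practice the resulting 2-D array is rendered or stacked as an image, which is what allows an image-based detector such as YOLO to be applied to audio classification.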


Sound analysis; Mel-Spectrogram; YOLO; deep learning; hearing impaired.


"Registration Status for the Disabled," Website of the Ministry of Health and Welfare in Korea, viewed May 10, 2021. Available:

"2017 White Paper for the Disabled," Korea Disabled People's Development Institute, 2017.

Rodriguez-Villarreal, Kevin, et al., "Development of Warning Device in Risk Situations for Children with Hearing Impairment at Low Cost," Development, vol.10, no.11, 2019.

Hu, Menghan, et al., "An overview of assistive devices for blind and visually impaired people," International Journal of Robotics and Automation, vol.34, no.5, pp. 580-598, 2019.

Saleem, Muhammad Imran, et al., "Full Duplex Smart System for Deaf & Dumb and Normal People," 2020 Global Conference on Wireless and Optical Technologies (GCWOT). IEEE, 2020.

Tapu, Ruxandra, Bogdan Mocanu, and Titus Zaharia., "Wearable assistive devices for visually impaired: A state of the art survey," Pattern Recognition Letters, vol.137, pp. 37-52, 2020.

Abdelmagid, Fatima, Hamda Fasla, and Mourad Elhadef, "Jusoor: A Wearable Communication Device for the Deaf-Blind: An Ideation-Themed Capstone Project," The 7th Annual International Conference on Arab Women in Computing in Conjunction with the 2nd Forum of Women in Research. 2021.

Yağanoğlu, M., "Real time wearable speech recognition system for deaf persons," Computers & Electrical Engineering, vol.91, p.107026, 2021.

Fang, Wei, Lin Wang, and Peiming Ren, "Tinier-YOLO: A real-time object detection method for constrained environments," IEEE Access, vol.8, pp. 1935-1944, 2019.

Wu, Dihua, et al., "Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments," Computers and Electronics in Agriculture, vol.178, p.105742, 2020.

Rajagukguk, Juniastel, and Nurdieni Eka Sari, "Detection system of sound noise level (SNL) based on condenser microphone sensor," Journal of Physics, vol.970, no.1, IOP Publishing, 2018.

H. Y. Oh, H. Kwon, G. Kwon, "Developing Application for The Hearing Impaired Using the Synthesized Automaton," in Proceedings of Korea Computer Congress, pp. 2031-2033, 2015.

D. K. Heo, B. K. Lee, S. J. Lee, Y. J. Nam, J. H. Kwon, B. S. Song, "Development of An Assistive Device for Sound Information Transmission for Hearing Impaired People," in Proceedings of the Korean Society of Rehabilitation and Welfare Engineering Conference, pp. 18-20, 2015.

Ji, Chunyan, et al., "A review of infant cry analysis and classification," EURASIP Journal on Audio, Speech, and Music Processing, pp. 1-17, 2021.

Burileanu, C., "Recent Experiments and Findings in Baby Cry Classification," Future Access Enablers for Ubiquitous and Intelligent Infrastructures: Third International Conference, FABULOUS 2017, Bucharest, Romania, October 12-14, 2017, Proceedings. vol. 241, Springer, 2018.

Alsouda, Y., Pllana, S., Kurti, A., "IoT-based urban noise identification using machine learning: performance of SVM, KNN, bagging, and random forest," in Proceedings of the International Conference on Omni-Layer Intelligent Systems, 2019, pp. 62-67.

Alsouda, Y., Pllana, S., Kurti, A., "A machine learning driven IoT solution for noise classification in smart cities," arXiv preprint arXiv:1809.00238, 2018.

J. Salamon and J. P. Bello, "Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification," IEEE Signal Processing Letters, vol.24, no.3, pp. 279-283, 2017.

Z. Zhang, S. Xu, S. Cao, and S. Zhang, "Deep Convolutional Neural Networks with Mixup for Environmental Sound Classification," in Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision, pp. 356-367, 2018.

V. Boddapati, A. Petef, and L. Lundberg, "Classifying Environmental Sounds Using Image Recognition networks," Procedia Computer Science, vol.112, pp.2048-2056, 2017.

Khawas, Chunnu, and Pritam Shah, "Application of firebase in android app development-a study," International Journal of Computer Applications. vol.179, no.46, pp. 49-53, 2018.

V. Bisot, S. Essid and G. Richard, "HOG and subband power distribution image features for acoustic scene classification," in Proceedings of EUSIPCO, pp. 719-723, 2015.

A. Rakotomamonjy and G. Gasso, "Histogram of Gradients of Time-Frequency Representations for Audio Scene Classification," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, no.1, pp. 142-153, 2015.

Dörfler, Monika, Roswitha Bammer, and Thomas Grill, "Inside the spectrogram: Convolutional Neural Networks in audio processing," in Proceedings of the IEEE International Conference on Sampling Theory and Applications, 2017.

Dong, Mingwen. "Convolutional neural network achieves human-level accuracy in music genre classification," arXiv preprint arXiv:1802.09697, 2018.

Zhou, Quan, et al. "Cough recognition based on mel-spectrogram and convolutional neural network," Frontiers in Robotics and AI, vol.8, 2021.

Mushtaq, Z., and Su, S. F., "Environmental sound classification using a regularized deep convolutional neural network with data augmentation," Applied Acoustics, vol.167, 2020.

Redmon, Joseph, and A. Farhadi, "YOLO9000: better, faster, stronger," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 7263-7271.

Wu, Xifang, et al., "Real-time vehicle color recognition based on YOLO9000," in International Conference in Communications, Signal Processing, and Systems, Springer, Singapore, 2018, pp. 82-89.

K. J. Piczak, "ESC: Dataset for Environmental Sound Classification," in Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1015-1018, 2015.