Speech Command Recognition using Artificial Neural Networks

Sushan Poudel; Dr. R Anuradha

doi:10.30630/joiv.4.2.358

Sushan Poudel - Sri Ramakrishna Engineering College, Coimbatore, India
Dr. R Anuradha - Sri Ramakrishna Engineering College, Coimbatore, India

Abstract

Speech is one of the most effective ways for humans and machines to interact. This project aims to build a speech command recognition system capable of predicting predefined speech commands. The dataset provided by Google's TensorFlow and AIY teams is used to implement different neural network models: a Convolutional Neural Network, and a Recurrent Neural Network combined with a Convolutional Neural Network. The combination of convolutional and recurrent networks outperforms the Convolutional Neural Network alone by 8% and achieves 96.66% accuracy for 20 labels.
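
To illustrate what combining a convolutional and a recurrent network can look like, the following is a minimal sketch of one forward pass, not the authors' implementation. Every layer size, the random weights, the random input features, and all function names (conv1d_relu, rnn_last_state, classify) are assumptions made for illustration. Only the 20-label output comes from the abstract.

```python
import math
import random

NUM_LABELS = 20  # the abstract reports results for 20 labels


def conv1d_relu(frames, kernels):
    """Valid 1-D convolution over time, followed by ReLU.

    frames:  list of T feature vectors, each of length F
    kernels: list of C kernels, each a list of K weight vectors of length F
    returns: list of T-K+1 vectors, each of length C
    """
    k = len(kernels[0])
    out = []
    for t in range(len(frames) - k + 1):
        row = []
        for kern in kernels:
            s = sum(w * x
                    for dt in range(k)
                    for w, x in zip(kern[dt], frames[t + dt]))
            row.append(max(0.0, s))
        out.append(row)
    return out


def rnn_last_state(seq, w_in, w_rec):
    """Simple (Elman) recurrent layer; returns the final hidden state."""
    h = [0.0] * len(w_rec)
    for x in seq:
        h = [math.tanh(sum(wi * xi for wi, xi in zip(w_in[j], x)) +
                       sum(wr * hi for wr, hi in zip(w_rec[j], h)))
             for j in range(len(w_rec))]
    return h


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def rand_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def classify(frames, params):
    """Convolution over time, then recurrence, then a 20-way softmax."""
    conv_out = conv1d_relu(frames, params["kernels"])
    h = rnn_last_state(conv_out, params["w_in"], params["w_rec"])
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in params["w_out"]]
    return softmax(logits)


rng = random.Random(0)
T, F, C, K, H = 30, 13, 8, 3, 16  # all sizes are illustrative assumptions
params = {
    "kernels": [rand_matrix(rng, K, F) for _ in range(C)],
    "w_in": rand_matrix(rng, H, C),
    "w_rec": rand_matrix(rng, H, H),
    "w_out": rand_matrix(rng, NUM_LABELS, H),
}
frames = rand_matrix(rng, T, F)  # stand-in for real acoustic features
probs = classify(frames, params)
predicted_label = max(range(NUM_LABELS), key=probs.__getitem__)
```

In this sketch the convolution summarizes short local windows of frames, and the recurrent layer then carries information across the whole utterance before classification. With untrained random weights the predicted label is meaningless. The code shows only how data flows through the two stages.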

Keywords

Speech Command Recognition; Recurrent Neural Network (RNN); Convolutional Neural Network (CNN)
