Speech Command Recognition using Artificial Neural Networks

Sushan Poudel - Sri Ramakrishna Engineering College, Coimbatore, India
Dr. R Anuradha - Sri Ramakrishna Engineering College, Coimbatore, India


DOI: http://dx.doi.org/10.30630/joiv.4.2.358

Abstract


Speech is one of the most effective ways for humans and machines to interact. This project aims to build a speech command recognition system capable of predicting predefined speech commands. A dataset provided by Google's TensorFlow and AIY teams is used to implement different neural network models, including a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) combined with a CNN. The combined convolutional and recurrent network outperforms the CNN alone by 8% and achieves 96.66% accuracy for 20 labels.
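As a rough illustration of how a convolutional front end can feed a recurrent layer before classification, the following is a minimal sketch in plain Python. It is not the authors' implementation: the layer sizes, the random weights, the random stand-in for audio features, and all function names are assumptions chosen for illustration, and only the count of 20 labels comes from the abstract.

```python
import math
import random

NUM_LABELS = 20  # the abstract reports results for 20 labels


def conv1d_relu(frames, kernels):
    """Slide each kernel over time; frames is [T][F], kernels is [C][K][F]."""
    width = len(kernels[0])
    out = []
    for t in range(len(frames) - width + 1):
        row = []
        for kernel in kernels:
            s = sum(
                kernel[k][f] * frames[t + k][f]
                for k in range(width)
                for f in range(len(frames[0]))
            )
            row.append(max(0.0, s))  # ReLU activation
        out.append(row)
    return out  # shape [T - K + 1][C]


def rnn_last_state(seq, w_in, w_rec):
    """Plain tanh recurrence; returns the hidden state after the last step."""
    hidden = [0.0] * len(w_rec)
    for x in seq:
        hidden = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][i] * hidden[i] for i in range(len(hidden)))
            )
            for j in range(len(w_rec))
        ]
    return hidden


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights in place of trained ones."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def predict(frames, params):
    """Convolution over time, then recurrence, then a softmax over labels."""
    features = conv1d_relu(frames, params["kernels"])
    hidden = rnn_last_state(features, params["w_in"], params["w_rec"])
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in params["w_out"]]
    return softmax(logits)


rng = random.Random(0)
T, F, C, K, H = 30, 13, 8, 3, 16  # illustrative sizes, not taken from the paper
params = {
    "kernels": [rand_matrix(K, F, rng) for _ in range(C)],
    "w_in": rand_matrix(H, C, rng),
    "w_rec": rand_matrix(H, H, rng),
    "w_out": rand_matrix(NUM_LABELS, H, rng),
}
frames = rand_matrix(T, F, rng)  # stand-in for real audio features
probs = predict(frames, params)
predicted_label = max(range(NUM_LABELS), key=probs.__getitem__)
```

In this sketch the convolution summarizes short windows of frames, and the recurrence then reads those summaries in order, which is one common way to combine the two model families. A trained system would learn the weights from labeled recordings rather than draw them at random.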


Keywords


Speech Command Recognition; Recurrent Neural Network (RNN); Convolutional Neural Network (CNN)
