Speech Command Recognition using Artificial Neural Networks

Sushan Poudel - Sri Ramakrishna Engineering College, Coimbatore, India
Dr. R Anuradha - Sri Ramakrishna Engineering College, Coimbatore, India


DOI: http://dx.doi.org/10.30630/joiv.4.2.358

Abstract


Speech is one of the most effective ways for humans and machines to interact. This project aims to build a speech command recognition system capable of predicting predefined speech commands. A dataset provided by Google's TensorFlow and AIY teams is used to implement different neural network models: a Convolutional Neural Network, and a Recurrent Neural Network combined with a Convolutional Neural Network. The combination of Convolutional and Recurrent Neural Networks outperforms the Convolutional Neural Network alone by 8% and achieves 96.66% accuracy for 20 labels.


Keywords


Speech Command Recognition; Recurrent Neural Network (RNN); Convolutional Neural Network (CNN)



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

__________________________________________________________________________
JOIV : International Journal on Informatics Visualization
ISSN 2549-9610 (print) | 2549-9904 (online)
Organized by Department of Information Technology - Politeknik Negeri Padang, and Institute of Visual Informatics - UKM and Soft Computing and Data Mining Centre - UTHM
Published by Department of Information Technology - Politeknik Negeri Padang
W : http://joiv.org
E : joiv@pnp.ac.id, hidra@pnp.ac.id, rahmat@pnp.ac.id
