CNN-LSTM for Heartbeat Sound Classification

Nurseno Aji - Politeknik Negeri Semarang, Semarang, Central Java, 50275, Indonesia
Kurnianingsih Kurnianingsih - Politeknik Negeri Semarang, Semarang, Central Java, 50275, Indonesia
Naoki Masuyama - Osaka Metropolitan University, Osaka, Japan
Yusuke Nojima - Osaka Metropolitan University, Osaka, Japan


Cardiovascular disorders are among the leading causes of death, and regular heart monitoring is important for preventing fatalities from heart disease. One monitoring approach is the analysis of heartbeat sounds, whose auditory patterns can indicate the state of heart health. This study aims to build a new model that classifies heartbeat sounds according to the associated ailment. A phonocardiogram (PCG) records heartbeat sounds and converts them into digital data, which makes it possible to train a deep learning model that detects heart defects from distinct cardiac sound patterns. The study proposes Mel-frequency cepstral coefficients (MFCCs), which are widely used in voice data analysis, for feature extraction. The extracted features are then passed to a multi-step classifier that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network in a single deep learning architecture, trained with the Adagrad optimizer. Classification performance is evaluated on the "Heartbeat Sounds" dataset from Kaggle. The proposed method is compared with a simple CNN, a CNN with a vanilla LSTM, and traditional machine learning methods (MLP, SVM, Random Forest, and k-NN), and the experimental results indicate its effectiveness.


Cardiovascular; Phonocardiogram; CNN-LSTM; Heartbeat
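
The abstract names Adagrad as the optimizer but does not give its settings. As an illustration only, the following is a minimal pure-Python sketch of the standard Adagrad update rule, in which each parameter's step is scaled by the square root of its accumulated squared gradients. The function name `adagrad_minimize`, the toy objective, the learning rate, and the step count are all assumptions made for this example and are not taken from the paper, which would apply the optimizer to the CNN-LSTM weights through a deep learning framework.

```python
import math


def adagrad_minimize(grad_fn, params, lr=0.5, eps=1e-8, steps=200):
    """Minimize a function with the standard Adagrad update rule.

    grad_fn takes the current parameters (list of floats) and returns
    the gradient as a list of floats of the same length.
    lr, eps and steps are illustrative defaults, not values from the paper.
    """
    params = list(params)
    accum = [0.0] * len(params)  # running sum of squared gradients, per parameter
    for _ in range(steps):
        grads = grad_fn(params)
        for i, g in enumerate(grads):
            accum[i] += g * g
            # The effective learning rate shrinks as accum[i] grows, so
            # parameters with large gradients take smaller steps.
            params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params


# Hypothetical toy objective: f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2
def toy_grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


result = adagrad_minimize(toy_grad, [0.0, 0.0])
print(result)  # approaches the minimizer [3.0, -1.0]
```

The two coordinates of the toy objective have gradients of very different magnitude, yet both take a first step of about 0.5, because each step is normalized by that parameter's own gradient history. This per-parameter scaling is the property that distinguishes Adagrad from plain gradient descent.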

