Syllable Segmentation with Vowel Detection on Verse Quranic Recitation

Timor Setiyaningsih - Universitas Darma Persada, Jl. Taman Malaka Selatan Pondok Kelapa, Jakarta, 13450, Indonesia
Mohd Sanusi Azmi - Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, 76100, Malaysia
Azah Kamilah Draman - Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, 76100, Malaysia


DOI: http://dx.doi.org/10.62527/joiv.8.4.2663

Abstract


In speech recognition, segmentation partitions a continuous audio signal containing speech into smaller units such as words, phonemes, or syllables. This process is essential in speech recognition systems, as it delineates the boundaries between distinct speech elements and facilitates subsequent analysis and processing. Segmentation accuracy strongly affects a speech recognition system's overall precision and performance, since it enables more accurate identification and processing of individual speech units. Proper segmentation also enables an automatic speech recognition (ASR) system to distinguish effectively between different syllables or words, leading to more efficient recognition outcomes. This paper investigates the importance of vowel detection for syllable segmentation in speech recognition, particularly in Arabic speech such as Quranic recitation, where a change in a single syllable can alter the meaning. Whereas existing techniques consider only differences in pronunciation between readers, this study employs onset detection that accounts for the presence of Arabic vowels. Specifically, the study detects the onsets in recitations of the fourth verse of Surah Al-Fatihah, using 50 data sets in the syllable detection tests. The results indicate that syllable detection performs very well on syllables with /a/ and /i/ vowels, whereas syllables with /u/ vowels yield results below 70%. The study therefore suggests that the onset-based method is well suited to syllables containing /a/ and /i/ vowels, demonstrating the importance of considering Arabic vowel letters in speech recognition.
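The paper does not publish its code; as a rough illustration of the onset-based segmentation described above, the sketch below uses the librosa library to locate onset times in a recitation clip, which can serve as candidate syllable boundaries. The file name, sampling rate, and detector settings are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of onset-based syllable boundary detection for a recitation
# clip using librosa. The file name, sampling rate, and detector parameters
# below are illustrative assumptions, not the authors' exact setup.
import librosa
import numpy as np


def detect_syllable_onsets(path, sr=22050):
    # Load the recitation audio (e.g., one recording of the fourth verse
    # of Surah Al-Fatihah) as a mono waveform.
    y, sr = librosa.load(path, sr=sr, mono=True)

    # Compute an onset-strength envelope; peaks in this envelope mark sudden
    # spectral energy increases, which typically coincide with vowel onsets
    # at syllable starts.
    onset_env = librosa.onset.onset_strength(y=y, sr=sr)

    # Pick peaks in the envelope and backtrack each one to the preceding
    # energy minimum so the boundary sits at the start of the syllable.
    onset_frames = librosa.onset.onset_detect(
        onset_envelope=onset_env, sr=sr, backtrack=True
    )

    # Convert frame indices to times in seconds.
    return librosa.frames_to_time(onset_frames, sr=sr)


if __name__ == "__main__":
    # Hypothetical input file.
    boundaries = detect_syllable_onsets("al_fatihah_verse4.wav")
    print("Candidate syllable boundaries (s):", np.round(boundaries, 3))
```

Consecutive onset times delimit candidate syllable segments, which can then be checked against the expected /a/, /i/, or /u/ vowel of the verse text; this per-vowel comparison is the level at which the abstract reports its accuracy differences.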


Keywords


Quran; syllable; vowel; onset
