Efficient Recognition and Classification of Stuttered Speech Signal using Deep Learning Technique

Authors

  • Mohandoss Rajalakshmi, Assistant Professor, Department of Computing Technologies, SRM Institute of Science and Technology, SRM Nagar, Kattankalathur, Chennai, Tamil Nadu, India.
  • Ramasubbu Rengaraj, Associate Professor, Department of Electrical and Electronics Engineering, Sri Sivasubramaniya Nadar College of Engineering, Kalavakkam, Chennai, Tamil Nadu, India.
  • Giri Rajan Babu Venkatakrishnan, Associate Professor, Department of Electrical and Electronics Engineering, Sri Sivasubramaniya Nadar College of Engineering, Kalavakkam, Tamil Nadu, India.
  • Jeya R., Assistant Professor, Department of Computing Technologies, SRM Institute of Science and Technology, SRM Nagar, Kattankalathur, Chennai, Tamil Nadu, India.

Keywords:

Automated Speech Recognition System (ASSR), Deep Neural Network (DNN), MFCC Feature Extraction, Stuttered Speech Recognition, Speech-to-Text API

Abstract

Speech recognition is a popular feature of modern devices and has made human-machine interaction easier. Users need not learn complex programming languages to communicate with their devices; they can issue voice commands to perform multiple tasks. However, the usefulness of such systems is limited when the voice input comes from a person who stutters. This work builds a system that not only classifies speech as stuttered or normal but also corrects it by removing the stuttered portions from the signal. The system takes the speaker's audio as input and divides the speech signal into segments of 300 ms. MFCC features are extracted from each segment and fed into the model, which classifies the segment. The audio is then corrected to produce stutter-free speech, along with its text conversion.
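The segmentation and segment-removal steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the non-overlapping windowing, the function names, and the toy sample rate are assumptions, and `toy_classifier` is a hypothetical stand-in for the trained model, which in the paper operates on MFCC features rather than raw energy.

```python
from typing import Callable, List, Sequence

SEGMENT_MS = 300  # segment length stated in the abstract


def split_into_segments(samples: Sequence[float], sample_rate: int,
                        segment_ms: int = SEGMENT_MS) -> List[List[float]]:
    """Cut a mono signal into consecutive, non-overlapping segments.

    Non-overlapping windows are an assumption; the abstract only gives
    the segment duration. The last segment may be shorter than the rest.
    """
    seg_len = sample_rate * segment_ms // 1000  # samples per segment
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    return [list(samples[i:i + seg_len])
            for i in range(0, len(samples), seg_len)]


def remove_stuttered(segments: Sequence[Sequence[float]],
                     is_stuttered: Callable[[Sequence[float]], bool]
                     ) -> List[float]:
    """Keep only segments labelled as fluent and re-join them in order."""
    cleaned: List[float] = []
    for seg in segments:
        if not is_stuttered(seg):
            cleaned.extend(seg)
    return cleaned


def toy_classifier(segment: Sequence[float]) -> bool:
    """Hypothetical placeholder for the model: flags near-silent segments.

    A real system would extract MFCC features here and pass them to the
    trained classifier instead of thresholding the mean energy.
    """
    energy = sum(x * x for x in segment) / max(len(segment), 1)
    return energy < 1e-4


# Toy signal at an assumed 1 kHz rate, so 300 ms corresponds to 300 samples.
sample_rate = 1000
signal = [0.5] * 300 + [0.0] * 300 + [0.5] * 150
segments = split_into_segments(signal, sample_rate)
cleaned = remove_stuttered(segments, toy_classifier)
print(len(segments), len(cleaned))  # 3 450
```

In this toy run the middle segment is flagged and dropped, so the cleaned signal keeps the first and last segments only. Passing the classifier as a callable keeps the removal step independent of whichever model is used.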

Published

24.03.2024

How to Cite

Rajalakshmi, M., Rengaraj, R., Venkatakrishnan, G. R. B., & R., J. (2024). Efficient Recognition and Classification of Stuttered Speech Signal using Deep Learning Technique. International Journal of Intelligent Systems and Applications in Engineering, 12(18s), 613–622. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5009

Section

Research Article