Automatic Speech Recognition System using Bio-Inspired Optimization and Convolutional Neural Network

Authors

  • K. Pavan Raju, A. Sri Krishna, M. Murali

Keywords:

Automatic Speech Recognition, Opposition Whale Optimization, Convolutional Neural Network, Machine Learning, Optimization Algorithms.

Abstract

Machine learning methods have led to significant advances in Automatic Speech Recognition (ASR) systems. This study presents a new method that integrates the Opposition Whale Optimization Algorithm (OWOA) with a Convolutional Neural Network (CNN) to improve the efficiency of ASR systems. OWOA, which draws inspiration from the social behavior of humpback whales, balances exploration and exploitation when optimizing the parameters of the ASR system. It uses opposition-based learning to explore the search space more evenly, which improves convergence speed and helps prevent premature convergence. Combining OWOA with a CNN architecture allows hierarchical features to be extracted from speech data. CNNs have achieved notable success in many pattern recognition applications because they can capture spatial and temporal dependencies in data. By exploiting these capabilities, the proposed ASR system can efficiently learn discriminative features from raw audio data, resulting in higher speech recognition accuracy. The proposed technique is validated through experiments on standard speech datasets, and the performance gains from integrating OWOA and CNN are assessed by comparison against baseline ASR systems. The findings indicate that the proposed ASR system surpasses current approaches in both recognition accuracy and convergence speed, highlighting its potential for practical applications.
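
The opposition-based learning step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common definition in which the opposite of a candidate x within bounds [lower, upper] is lower + upper - x, and that the better half of the combined candidates is kept. The names `opposite`, `opposition_select`, and `toy_error`, the two example parameters, and their bounds are hypothetical and chosen only for illustration.

```python
import numpy as np

def opposite(population, lower, upper):
    """Return the opposite point lower + upper - x for every candidate."""
    return lower + upper - population

def opposition_select(population, lower, upper, fitness):
    """Evaluate candidates and their opposites, keep the best ones (minimization)."""
    candidates = np.vstack([population, opposite(population, lower, upper)])
    scores = np.apply_along_axis(fitness, 1, candidates)
    best = np.argsort(scores)[: len(population)]  # indices of the lowest scores
    return candidates[best], scores[best]

def toy_error(x):
    """Hypothetical stand-in for a validation error; a real system would train a CNN here."""
    return float((x[0] + 3.0) ** 2 + (x[1] - 0.2) ** 2)

# Assumed search space: log10(learning rate) in [-5, -1], dropout rate in [0, 0.5].
lower = np.array([-5.0, 0.0])
upper = np.array([-1.0, 0.5])

rng = np.random.default_rng(0)
population = rng.uniform(lower, upper, size=(6, 2))
original_scores = np.apply_along_axis(toy_error, 1, population)

selected, selected_scores = opposition_select(population, lower, upper, toy_error)
```

Because the original candidates are always part of the pool, the best score after selection can never be worse than the best score before it. In a whale optimization loop, a step like this would typically be applied at initialization and, optionally, after position updates.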

Published

26.03.2024

How to Cite

K. Pavan Raju. (2024). Automatic Speech Recognition System using Bio-Inspired Optimization and Convolutional Neural Network. International Journal of Intelligent Systems and Applications in Engineering, 12(21s), 4273–4283. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6285

Section

Research Article