Enhancing Hindi Speech and Text Recognition Using Hybrid Deep Learning and Semantic Models
Keywords:
Hindi Speech Recognition, Deep Learning, Semantic Models, Word Sense Disambiguation, Text Recognition, Natural Language Processing.
Abstract
The growing adoption of digital technologies among Hindi-speaking populations necessitates advanced speech and text recognition systems tailored to the linguistic complexity of Hindi. Existing models often struggle with phonetic ambiguities, rich morphology, and script variations inherent in the language. This study proposes an integrated framework that combines hybrid deep learning models with semantic approaches to enhance the accuracy and robustness of Hindi speech and text recognition. For speech recognition, multiple feature extraction techniques such as Perceptual Linear Prediction (PLP) and Mel Frequency Cepstral Coefficients (MFCC) are integrated with deep neural networks to minimize substitution and confusion errors. In text recognition, semantic models like Word Sense Disambiguation (WSD) using Hindi WordNet and corpus-based semantic similarity measures improve contextual understanding and disambiguation. Comparative evaluations reveal that this hybrid approach significantly enhances performance across speech-to-text conversion, sentiment analysis, and question-answering tasks. The proposed methodology addresses critical gaps in existing NLP solutions for Hindi and lays the foundation for developing more inclusive and intelligent language processing systems. These advancements are vital for deploying AI-powered services in education, governance, and digital communication tailored to Hindi speakers.
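To make the gloss-based disambiguation idea concrete, the following is a minimal sketch of the simplified Lesk algorithm, which picks the sense whose dictionary gloss shares the most words with the surrounding context. It is an illustration only and not the method used in this study: the sense inventory `SENSES`, the stopword list `STOPWORDS`, and the function `simplified_lesk` are hypothetical stand-ins for a Hindi WordNet lookup, and the glosses are invented for the example.

```python
from typing import Dict, Set

# Hypothetical toy sense inventory for the ambiguous word "सोना" (gold / to sleep).
# A real system would retrieve glosses from Hindi WordNet; these are made up.
SENSES: Dict[str, str] = {
    "gold": "एक पीली कीमती धातु जिससे गहने बनते हैं",
    "sleep": "रात में आराम करने के लिए नींद लेना",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS: Set[str] = {"में", "के", "लिए", "एक", "है", "हैं", "वह", "से", "और"}


def content_words(text: str) -> Set[str]:
    """Split on whitespace and drop stopwords."""
    return {token for token in text.split() if token not in STOPWORDS}


def simplified_lesk(sentence: str, target: str, senses: Dict[str, str]) -> str:
    """Return the sense label whose gloss overlaps most with the context.

    Ties are resolved in favour of the sense listed first in `senses`.
    """
    # The context is every content word in the sentence except the target itself.
    context = content_words(sentence) - {target}
    best_sense = next(iter(senses))
    best_overlap = -1
    for label, gloss in senses.items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = label, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("वह रात में जल्दी सोना चाहता है", "सोना", SENSES))
    print(simplified_lesk("उसने सोना और गहने खरीदे", "सोना", SENSES))
```

In the first sentence the context word "रात" also appears in the "sleep" gloss, and in the second "गहने" appears in the "gold" gloss, so the overlap count separates the two senses. Whitespace tokenization and exact string matching are deliberate simplifications; morphologically rich Hindi text would normally need stemming or lemmatization before the overlap is computed.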
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.