Gender-Based Real-Time Vocal Emotion Detection
Keywords:
Face Recognition, Computer Vision, Face Detection, Uncontrolled Environment

Abstract
Emotion Detection (ED) is the task of recognizing the emotional facet of speech regardless of its semantics. Although human beings perform this task effortlessly, automating it with programmed devices or techniques is still an active subject of research. In this work, several classifier algorithms are implemented and compared with respect to accuracy and emotion category: an MLP, an SVM, a CNN, and a DNN with an LSTM layer are trained. The classifiers are first trained on the initial datasets CREMA-D, TESS, SAVEE and RAVDESS. The data are then pre-processed and augmented using four parameters, namely noise, stretch, speed and pitch, and three features are extracted: Zero Crossing Rate, Root Mean Square Energy and MFCCs. After this processing, the models show improved accuracy, reaching 66.61% (MLP), 53% (SVM), 88.35% (CNN) and 91.30% (DNN). This work also proposes a real-time speech emotion detection system that classifies emotions by analyzing audio files with the trained model selected according to gender.
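The abstract outlines a four-way augmentation (noise, stretch, speed, pitch) followed by extraction of Zero Crossing Rate, RMS energy and MFCCs. The sketch below shows one way such a pipeline could look using librosa; the file name, noise amplitude, stretch rate, pitch step and speed factor are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of the augmentation and feature-extraction steps described
# above, assuming librosa/numpy; all parameter values are illustrative.
import numpy as np
import librosa


def augment(y, sr):
    """Return noise-, stretch-, pitch- and speed-augmented copies of a clip."""
    noisy = y + 0.005 * np.random.randn(len(y))                  # additive noise
    stretched = librosa.effects.time_stretch(y, rate=0.9)        # slow down by 10%
    pitched = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)   # shift up 2 semitones
    # speed change: resample the waveform onto a shorter time axis
    faster = np.interp(np.arange(0, len(y), 1.2), np.arange(len(y)), y)
    return [noisy, stretched, pitched, faster]


def extract_features(y, sr, n_mfcc=13):
    """Concatenate frame-averaged ZCR, RMS energy and MFCCs into one vector."""
    zcr = librosa.feature.zero_crossing_rate(y)                  # shape (1, frames)
    rms = librosa.feature.rms(y=y)                               # shape (1, frames)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)       # shape (n_mfcc, frames)
    return np.hstack([zcr.mean(axis=1), rms.mean(axis=1), mfcc.mean(axis=1)])


# Usage: build a feature matrix from one (hypothetical) dataset file.
y, sr = librosa.load("ravdess_sample.wav", sr=None)
X = np.vstack([extract_features(a, sr) for a in [y] + augment(y, sr)])
print(X.shape)  # (5, 15): original + 4 augmentations; 13 MFCCs + ZCR + RMS
```

The resulting fixed-length vectors could then be fed to any of the compared classifiers (MLP, SVM, CNN, DNN-LSTM); the frame-averaging step here is only one possible way of obtaining fixed-size inputs.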