Gender based Real Time Vocal Emotion Detection

Authors

  • Anusha Anchan, NITTE (Deemed to be University), Dept. of Computer Science and Engineering, NMAM Institute of Technology, Nitte - 574110, Karnataka, India
  • Manasa G. R., NITTE (Deemed to be University), Dept. of Computer Science and Engineering, NMAM Institute of Technology, Nitte - 574110, Karnataka, India
  • Joylin Priya Pinto, NITTE (Deemed to be University), Dept. of Computer Science and Engineering, NMAM Institute of Technology, Nitte - 574110, Karnataka, India

Keywords:

Face Recognition, Computer Vision, Face Detection, Uncontrolled Environment

Abstract

Emotion Detection (ED) is the recognition of the emotional facet of speech regardless of its semantics. Although human beings perform this task efficaciously, doing it automatically with programmed devices or techniques is still a subject of research. In this work, several classifier algorithms are implemented and compared on the basis of accuracy and emotion category: an MLP, an SVM, a CNN, and a DNN with an LSTM layer are trained. The classifiers are first trained on the initial datasets; the datasets used are CREMA-D, TESS, SAVEE, and RAVDESS. The data are then pre-processed and augmented using four parameters, namely noise, stretch, speed, and pitch, and Zero Crossing Rate, Root Mean Square energy, and MFCC features are extracted. The models show improved accuracy after this processing, achieving 66.61% (MLP), 53% (SVM), 88.35% (CNN), and 91.30% (DNN). This work also proposes a real-time speech emotion detection system that classifies emotions by analyzing audio files with the trained models, selected on the basis of gender.
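The augmentation and feature-extraction steps summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the librosa and NumPy libraries, and the parameter values (noise level, stretch and speed rates, pitch steps, number of MFCCs, the file name "sample.wav") are placeholders chosen only for demonstration.

```python
# Minimal sketch of the augmentation (noise, stretch, speed, pitch) and
# feature extraction (ZCR, RMS energy, MFCC) described in the abstract.
# Parameter values are illustrative, not the authors' settings.
import numpy as np
import librosa


def augment(y, sr):
    """Return augmented variants of a waveform: noise, stretch, speed, pitch."""
    noisy = y + 0.005 * np.random.randn(len(y))                   # additive white noise
    stretched = librosa.effects.time_stretch(y, rate=0.9)         # slight slow-down
    sped = librosa.effects.time_stretch(y, rate=1.1)              # slight speed-up
    pitched = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)    # raise pitch by 2 semitones
    return [noisy, stretched, sped, pitched]


def extract_features(y, sr, n_mfcc=13):
    """Per-utterance feature vector: mean ZCR, mean RMS energy, mean MFCCs."""
    zcr = librosa.feature.zero_crossing_rate(y).mean()
    rms = librosa.feature.rms(y=y).mean()
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
    return np.hstack([zcr, rms, mfcc])


if __name__ == "__main__":
    # "sample.wav" is a placeholder path, not a file from the cited corpora.
    y, sr = librosa.load("sample.wav", sr=None)
    feats = [extract_features(a, sr) for a in [y] + augment(y, sr)]
    print(np.vstack(feats).shape)  # (5, 15) with the defaults above
```

Such per-utterance vectors (original plus augmented copies) could then be fed to any of the compared classifiers; the paper itself does not specify the exact augmentation parameters or library used.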




Published

04.11.2023

How to Cite

Anchan, A., G. R., M., & Pinto, J. P. (2023). Gender based Real Time Vocal Emotion Detection. International Journal of Intelligent Systems and Applications in Engineering, 12(3s), 282–289. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3706

Section

Research Article