Analysis of Models and Dataset used for Predicting Emotion in Text

Authors

  • Adzmer Muhali, Department of Computer Studies and Engineering, Sulu State College, Jolo, Sulu, Philippines

Keywords:

Naïve Bayes, Logistic Regression, Accuracy, Precision, Recall, Classification Report, Confusion Matrix, Dataset, Emotion, Text

Abstract

This paper analyzes two models, Naïve Bayes and Logistic Regression, and a dataset for predicting emotion in text. The experiment uses an emotion dataset from the Kaggle website containing 21,459 records in two columns, labelled Text and Emotion; the emotion classes are happy, anger, sadness, love, fear, and surprise. The aim is to evaluate whether the models and the dataset used in this research are adequate for predicting emotion in text. Specifically, the study applies data collection, data preparation, feature engineering, model building, and model evaluation. Based on the results, we conclude that the Logistic Regression model gives the best performance: in the classification report, the accuracy of Naïve Bayes is only 77 percent, while that of Logistic Regression is 89 percent. The best-performing model also achieves higher accuracy than the models used in the previous research discussed in this paper. The analysis shows that the dataset is good for training purposes, but for real-time application the number of samples for each emotion should be balanced, since the dataset used in this research is imbalanced.
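The workflow named in the abstract (data preparation, feature engineering, model building, and model evaluation) can be sketched on a toy example. The snippet below is an illustrative sketch, not the paper's implementation: it assumes bag-of-words features and a multinomial Naïve Bayes classifier with add-one smoothing written in plain Python, and it uses a handful of invented sentences in place of the Kaggle data. The names TRAIN, TEST, tokenize, train_naive_bayes, predict, and evaluate are hypothetical, and Logistic Regression is omitted to keep the sketch short.

```python
import math
from collections import Counter, defaultdict

# Toy stand-in for the (Text, Emotion) columns. These rows are invented
# for illustration and are NOT taken from the Kaggle dataset.
TRAIN = [
    ("i feel so happy and cheerful today", "happy"),
    ("what a happy wonderful day", "happy"),
    ("i am happy with my cheerful friends", "happy"),
    ("i am furious and angry at him", "anger"),
    ("this makes me so angry", "anger"),
    ("i feel sad and lonely tonight", "sadness"),
]
TEST = [
    ("such a cheerful happy morning", "happy"),
    ("he was angry and furious", "anger"),
    ("so sad and lonely", "sadness"),
]


def tokenize(text):
    """Lowercase whitespace tokenization (an assumed, minimal preparation step)."""
    return text.lower().split()


def train_naive_bayes(rows):
    """Collect class counts and per-class word counts from (text, label) rows."""
    class_counts = Counter(label for _, label in rows)
    word_counts = defaultdict(Counter)
    for text, label in rows:
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(model, rows):
    """Return accuracy and a confusion matrix keyed by (true, predicted)."""
    confusion = Counter()
    for text, label in rows:
        confusion[(label, predict(model, text))] += 1
    correct = sum(n for (true, pred), n in confusion.items() if true == pred)
    return correct / len(rows), confusion


model = train_naive_bayes(TRAIN)
class_distribution = model[0]  # per-emotion sample counts, to inspect imbalance
accuracy, confusion = evaluate(model, TEST)
print("class distribution:", dict(class_distribution))
print("accuracy:", accuracy)
print("confusion:", dict(confusion))
```

Printing the class distribution before the accuracy makes any imbalance visible when the score is interpreted, which is the concern the abstract raises about the dataset.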

Published

03.09.2023

How to Cite

Muhali, A. (2023). Analysis of Models and Dataset used for Predicting Emotion in Text. International Journal of Intelligent Systems and Applications in Engineering, 12(1s), 474–480. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3484

Section

Research Article