Analysis of Models and Dataset used for Predicting Emotion in Text
Keywords:
Naïve Bayes, Logistic Regression, Accuracy, Precision, Recall, Classification Report, Confusion Matrix, Dataset, Emotion, Text
Abstract
This paper analyzes two models, Naïve Bayes and Logistic Regression, and a dataset for predicting emotion in text. The experiment uses an emotion dataset from the Kaggle website containing 21,459 records in two columns, Text and Emotion; the emotion classes are happy, anger, sadness, love, fear, and surprise. The goal is to evaluate whether the models and the dataset used in this research are adequate for predicting emotion in text. Specifically, the study carries out data collection, data preparation, feature engineering, model building, and model evaluation. Based on the results, we conclude that the Logistic Regression model gives the best performance: in the classification report, Naïve Bayes reaches an accuracy of only 77 percent, while Logistic Regression reaches 89 percent. The best model's accuracy is also higher than the accuracies reported in the previous research discussed in this paper, which used different models. The analysis indicates that the dataset is suitable for training purposes, but for real-time application the number of samples per emotion should be balanced, since the dataset used in this research is imbalanced.
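The evaluation step named in the abstract (accuracy, precision, recall, confusion matrix) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names and the toy labels below are hypothetical, chosen only to show why a single accuracy figure can hide weak recall on a rare emotion class when the dataset is imbalanced.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Return {true_label: {pred_label: count}} for the given labels."""
    matrix = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def classification_report(y_true, y_pred, labels):
    """Compute overall accuracy plus per-class precision, recall, and support."""
    cm = confusion_matrix(y_true, y_pred, labels)
    correct = sum(cm[label][label] for label in labels)
    report = {"accuracy": correct / len(y_true)}
    for label in labels:
        predicted = sum(cm[t][label] for t in labels)  # column sum: times predicted
        actual = sum(cm[label].values())               # row sum: times it truly occurs
        report[label] = {
            "precision": cm[label][label] / predicted if predicted else 0.0,
            "recall": cm[label][label] / actual if actual else 0.0,
            "support": actual,
        }
    return report


# Toy, made-up labels (NOT the paper's data): "happy" dominates, "surprise" is rare.
labels = ["happy", "surprise"]
y_true = ["happy"] * 8 + ["surprise"] * 2
y_pred = ["happy"] * 8 + ["happy", "surprise"]

report = classification_report(y_true, y_pred, labels)
print(report["accuracy"])            # overall accuracy
print(report["surprise"]["recall"])  # recall on the minority class
```

In this toy case the overall accuracy looks high while half of the minority-class samples are missed, which is the kind of effect the abstract's remark about balancing the data per emotion is concerned with.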
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.