Extracting Contextual Feature Form Hinglish Short Text by Handling Spelling Variation at Character and Word Level

Authors

  • Rajshree Singh, Research Scholar, School of Computer Application, Babu Banarasi Das University, Lucknow
  • Reena Srivastava, Dean, School of Computer Application, Babu Banarasi Das University, Lucknow

Keywords:

Pre-Processing, short text, Auto-Encoder, Hinglish

Abstract

The main purpose of sarcasm detection is to understand the evaluation that a text expresses. Sarcasm detection is regarded as one of the most challenging problems in this area and has received considerable attention. Irony is a distinctive way of conveying information that contradicts the stated notion and therefore creates confusion. Data pre-processing is one of the main tasks that developers perform. Numerous studies on irony detection use a variety of feature extraction techniques together with machine learning classification models, including Naive Bayes and Logistic Regression. Precision, recall, and F-score are among the reported results that can be used to select the most appropriate model. This study discusses several methods for detecting sarcasm and irony in text. The Extra Tree classifier and the gradient boosting classifier give the best results for detecting sarcasm in Hinglish, with F-scores of 95.43 and 95.29 respectively, at Wn = 4, Cn = 3, and CWn = 4.
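The abstract does not define Wn, Cn, or CWn. As a rough illustration only, the sketch below assumes that Wn is a maximum word n-gram length and Cn a character n-gram length, and shows how such features might be extracted so that spelling variants of the same Hinglish word share some features. The function names and the "w:"/"c:" prefixes are hypothetical and are not taken from the paper; CWn is not modelled here.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return all contiguous word n-grams of length 1..n (lowercased)."""
    tokens = text.lower().split()
    grams = []
    for size in range(1, n + 1):
        for i in range(len(tokens) - size + 1):
            grams.append(" ".join(tokens[i:i + size]))
    return grams


def char_ngrams(text, n):
    """Return character n-grams of exactly length n, taken within each word."""
    grams = []
    for token in text.lower().split():
        for i in range(len(token) - n + 1):
            grams.append(token[i:i + n])
    return grams


def extract_features(text, wn=4, cn=3):
    """Count word and character n-gram features for one short text.

    Assumption: wn and cn mirror the Wn = 4 and Cn = 3 settings quoted in
    the abstract; the paper's actual feature definition may differ.
    """
    features = Counter()
    for gram in word_ngrams(text, wn):
        features["w:" + gram] += 1
    for gram in char_ngrams(text, cn):
        features["c:" + gram] += 1
    return features


# Two common spellings of the same word overlap at the character level
# even though they are different word-level tokens.
shared = set(char_ngrams("bahut", 3)) & set(char_ngrams("bahot", 3))
print(shared)  # {'bah'}
```

Character-level features of this kind are one plausible way to tolerate spelling variation; the resulting counts could then be fed to any of the classifiers named in the abstract.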

Published

17.05.2023

How to Cite

Singh, R., & Srivastava, R. (2023). Extracting Contextual Feature Form Hinglish Short Text by Handling Spelling Variation at Character and Word Level. International Journal of Intelligent Systems and Applications in Engineering, 11(6s), 713–719. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2906

Section

Research Article