Handling High Dimensional Word Patterns as Features by Ensemble Learning for Opinion Valuation from Twitter Streams


  • T. Jhansi Rani, Assistant Professor, CSE, Dept. of GITAM Deemed to be University, Hyderabad & Research Scholar, JNTU, Hyderabad, India
  • K. Anuradha, Professor, CSE, GRIET, Hyderabad, India


Keywords: Feature Optimization, Machine Learning, KS-Test, Term Occurrence, Naïve Bayes, Wilcoxon Signed-Rank, Fuzzy c-Means, Handling Dimensionality


Streams of over a billion tweets are often affected by ambiguity. Because of this volume and ambiguity, tweet data are high dimensional. The curse of dimensionality causes more false alarms when sentiment polarity is detected with supervised learning. Although many contemporary contributions present novel ensemble classification strategies, they are limited in handling either the volume of the data or its ambiguity. This manuscript presents a novel ensemble classification model that fuses diversified measures to find optimal features and uses a clustering method based on fuzzy c-means to handle the high dimensionality. The resulting clusters serve as the training corpus for classification, with each cluster used to train an individual classifier. The experimental study was carried out using multi-label four-fold cross-validation. To assess performance, the cross-validation metrics obtained for the proposed model, titled “ELOV”, were compared with those of contemporary ensemble models. The performance analysis indicates that the proposed model outperforms the contemporary contributions.
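The clustering and partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fuzzy_c_means` and `partition_by_cluster`, the fuzzifier `m = 2.0`, the choice of two clusters, the toy vocabulary, and the term-occurrence vectors are all assumptions made for illustration. The sketch runs standard fuzzy c-means on the vectors and then assigns each tweet to its highest-membership cluster, so that each cluster can serve as a separate training corpus.

```python
import random

def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([r / s for r in row])
    centers = []
    for _ in range(iters):
        # Centers are membership-weighted means (weights raised to the fuzzifier m).
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Memberships are recomputed from inverse squared distances to the centers.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim))
                  for j in range(c)]
            if min(d2) == 0.0:
                # A point sitting exactly on one or more centers belongs only to them.
                zeros = [j for j in range(c) if d2[j] == 0.0]
                new = [1.0 / len(zeros) if j in zeros else 0.0 for j in range(c)]
            else:
                inv = [dj ** (-1.0 / (m - 1.0)) for dj in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:
            break
    return centers, u

def partition_by_cluster(points, labels, memberships):
    """Group (vector, label) pairs by the cluster with the highest membership."""
    parts = {}
    for x, y, row in zip(points, labels, memberships):
        best = max(range(len(row)), key=lambda j: row[j])
        parts.setdefault(best, []).append((x, y))
    return parts

# Hypothetical term-occurrence vectors over the vocabulary
# ["good", "great", "bad", "awful"], with made-up polarity labels.
docs = [[4, 3, 0, 0], [3, 4, 0, 0], [5, 3, 0, 0],
        [0, 0, 4, 3], [0, 0, 3, 4], [0, 0, 3, 5]]
polarity = ["pos", "pos", "pos", "neg", "neg", "neg"]

centers, memberships = fuzzy_c_means(docs, c=2)
partitions = partition_by_cluster(docs, polarity, memberships)
for cluster_id, corpus in sorted(partitions.items()):
    print(cluster_id, [label for _, label in corpus])
```

In an ensemble of the kind the abstract describes, each entry of `partitions` would then train its own classifier (for example a Naïve Bayes model, which appears among the keywords). That training step is left out of the sketch.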






Figure: Block diagram of the ELOV model.




How to Cite

T. J. Rani and K. Anuradha, “Handling High Dimensional Word Patterns as Features by Ensemble Learning for Opinion Valuation from Twitter Streams”, Int J Intell Syst Appl Eng, vol. 10, no. 1s, pp. 297 –, Oct. 2022.