Identifying Severity of Cyberbullying Using Scalable Labeled Multi-Platform Dataset

Authors

  • Madhura Vyawahare, Department of Computer Engineering, Pillai College of Engineering, New Panvel, Maharashtra, India https://orcid.org/0000-0002-8981-7636
  • Sharvari Govilkar, HOD, Department of Computer Engineering, Pillai College of Engineering, New Panvel, Maharashtra, India

Keywords:

Cyberbullying, Cybercrimes, Dataset Annotation, Social Media, Machine Learning

Abstract

The growing volume of invective posts on online social media platforms is a serious concern for the wellbeing of society and the psychological health of young people. If not tackled at an early stage, such posts often develop into cyberbullying. To maintain a psychologically healthy society, posts that are harmful, or that may become more dangerous for netizens, need to be identified. Many machine learning and deep learning systems have been designed for automated cyberbullying detection, but accurate and precise detection needs a large, correctly annotated dataset. This work addresses the unavailability of an appropriate dataset by designing an automated labeling system that creates and labels a dataset for detecting the severity of cyberbullying. Beyond the textual comments themselves, meta-features such as semantic and syntactic features also contribute to machine learning. Principal component analysis is used for feature extraction and reduction. A rule-based methodology is designed, developed and implemented that considers textual, semantic and syntactic features and produces a feature-rich, multi-platform, multi-label dataset for detecting the severity of cyberbullying as well as for cyberbullying prediction. Until now, only two approaches have been used for dataset annotation: manual labeling and the filtration method. This work proposes and implements a new rule-based automated approach. Using this approach, a dataset of 17 lakh (1.7 million) entries with 5 labels is prepared and used to train the models. To make the dataset standardized and usable by future researchers, it is tested and verified with various methods. Evaluation of the proposed system on accuracy, precision, recall and F-measure demonstrates that the performance of multiclass classification trained on the prepared dataset is substantially improved.
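
As a rough illustration of what rule-based severity labeling can look like, the sketch below assigns one of five severity levels (0 to 4) to a post from a few textual and syntactic cues. It is only a minimal sketch under stated assumptions: the word lists, the feature names, the scoring rules and the `Post`, `extract_features` and `severity_label` names are all hypothetical, since the abstract does not specify the actual rules, features or label names used in the paper.

```python
from dataclasses import dataclass

# Hypothetical lexicons for illustration only; the paper's real rules
# and word lists are not given in the abstract.
PROFANITY = {"idiot", "stupid", "loser"}
THREATS = {"kill", "hurt", "die"}
SECOND_PERSON = {"you", "your", "u"}


@dataclass
class Post:
    text: str
    platform: str  # e.g. "twitter", "youtube" (multi-platform data)


def extract_features(post: Post) -> dict:
    """Derive simple textual and syntactic cues from one post."""
    words = [t.strip(".,!?").lower() for t in post.text.split()]
    words = [w for w in words if w]
    letters = [c for c in post.text if c.isalpha()]
    upper_ratio = (
        sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    )
    return {
        "profanity": sum(w in PROFANITY for w in words),
        "threat": sum(w in THREATS for w in words),
        "second_person": any(w in SECOND_PERSON for w in words),
        "shouting": upper_ratio > 0.7,  # mostly upper-case text
    }


def severity_label(post: Post) -> int:
    """Map the cues to an assumed severity level from 0 (none) to 4."""
    f = extract_features(post)
    hostile = f["profanity"] > 0 or f["threat"] > 0
    score = min(f["profanity"], 2)       # insults add up to 2 points
    score += 2 if f["threat"] else 0     # any threat word adds 2 points
    if hostile and f["second_person"]:   # aimed directly at someone
        score += 1
    if hostile and f["shouting"]:        # emphasis through capitals
        score += 1
    return min(score, 4)


posts = [
    Post("Have a nice day", "twitter"),
    Post("you are stupid!", "youtube"),
    Post("YOU STUPID IDIOT I WILL HURT YOU", "twitter"),
]
labels = [severity_label(p) for p in posts]
print(labels)  # [0, 2, 4]
```

Labels produced this way could then serve as training targets for a multiclass classifier, which is the role the automatically labeled dataset plays in the work described above.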

Published

16.12.2022

How to Cite

Vyawahare, M., & Govilkar, S. (2022). Identifying Severity of Cyberbullying Using Scalable Labeled Multi-Platform Dataset. International Journal of Intelligent Systems and Applications in Engineering, 10(4), 201–210. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2217

Section

Research Article