A Novel Approach for Early Rumour Detection in Social Media Using ALBERT
Keywords:
ALBERT, rumour detection, social media, natural language processing, deep learning
Abstract
Rumours and misinformation can spread rapidly on social media platforms, leading to negative consequences such as panic, public confusion, and harm to reputations. Early detection and intervention are crucial to mitigate the impact of such events. In this paper, we propose a novel approach for early rumour detection in social media using ALBERT, a state-of-the-art transformer-based language model. Our method uses ALBERT's pre-trained language representations to extract informative features from social media posts and detect rumours at an early stage, before they have gained widespread attention. Specifically, we fine-tune the ALBERT model as a binary classifier on a large-scale dataset of social media posts annotated with rumour labels. We also experiment with different types of input representations, including plain text, hashtags, and user mentions, to investigate their effect on performance. Our experiments show that the approach outperforms several baseline models, achieving an F1-score of 0.85 and an accuracy of 0.92 on a test set of social media posts from different platforms. We also conduct a detailed analysis of the learned representations and investigate the most informative features and patterns for early rumour detection. Our work provides a promising direction for early detection of rumours and misinformation in social media, which can help prevent their spread and mitigate their negative impact.
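The abstract does not say how the plain-text, hashtag, and user-mention inputs were constructed or how the scores were computed, so the following is only a minimal sketch under stated assumptions. The function names `build_inputs` and `accuracy_and_f1`, the regular expressions, and the way the variants are concatenated are all hypothetical and are not taken from the paper. In a full pipeline, each variant string would be tokenized and passed to a fine-tuned ALBERT classifier, which is omitted here.

```python
import re

# Assumed patterns for social media markup; the paper does not specify them.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def build_inputs(post: str) -> dict:
    """Build three hypothetical input variants for a single post."""
    hashtags = HASHTAG.findall(post)
    mentions = MENTION.findall(post)

    # Plain text: strip URLs, hashtags, and mentions, then collapse whitespace.
    plain = URL.sub(" ", post)
    plain = HASHTAG.sub(" ", plain)
    plain = MENTION.sub(" ", plain)
    plain = " ".join(plain.split())

    return {
        "plain": plain,
        "plain+hashtags": " ".join([plain] + hashtags).strip(),
        "plain+mentions": " ".join([plain] + mentions).strip(),
    }


def accuracy_and_f1(y_true: list, y_pred: list) -> tuple:
    """Accuracy and F1 for binary labels, with 1 meaning 'rumour'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    example = "BREAKING: bridge collapsed downtown #news #alert @cityhall https://t.co/x"
    for name, text in build_inputs(example).items():
        print(f"{name}: {text}")
    print(accuracy_and_f1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

Keeping the variants as plain strings makes it straightforward to compare them under an otherwise identical classifier, which is the kind of comparison the abstract describes. Because accuracy and F1 can diverge on imbalanced data, reporting both is consistent with the two figures quoted above.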
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.