BERT Model Based Identification and Classification of Web Vulnerabilities Using Deep Learning Approach

Authors

  • Manjunatha K. M., M. Kempanna, Pushpa G., Rangaswamy M. G.

Keywords:

Bidirectional encoder representations from transformers, BERT, SQL Injections, SQLIA, cross site scripting, XSS, transformers.

Abstract

In recent years, researchers have focused on machine learning and language based models to predict and identify outcomes in their research. This research attempts to identify web vulnerabilities using the machine learning model BERT (Bidirectional Encoder Representations from Transformers) with additional layers. The datasets used for prediction and classification are an SQL Injection (SQLI) dataset (with two classes: attack and benign) and a Cross Site Scripting (XSS) dataset. The developed BERT model predicts the vulnerabilities in the data and classifies them accordingly. The loss is estimated using cross-entropy loss, and the model's performance is evaluated using the binary accuracy metric. The analyses and findings show that the developed advanced BERT model obtained higher accuracy (98% on SQLI and 97% on XSS) than the standard BERT model (87% on SQLI and 84% on XSS). The research concludes that a BERT model with additional layers classifies with significantly higher accuracy than the standard BERT transformer model.
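The two quantities named in the abstract, cross-entropy loss and binary accuracy, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 decision threshold, the clipping constant, and the toy labels and probabilities are all assumptions made for illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch.

    y_true: list of 0/1 labels (1 = attack, 0 = benign; labeling is an assumption).
    y_prob: list of predicted probabilities for class 1.
    eps:    clipping constant (assumed) to avoid log(0).
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p > threshold else 0  # threshold value is an assumption
        if predicted == y:
            correct += 1
    return correct / len(y_true)


# Toy data, invented for illustration only (not taken from the paper's datasets).
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.1]

loss = binary_cross_entropy(labels, probs)
acc = binary_accuracy(labels, probs)
print(f"loss={loss:.4f} accuracy={acc:.2f}")
```

In this toy batch the third sample is an attack scored below the threshold, so it counts against accuracy and contributes the largest term to the loss.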

Published

26.03.2024

How to Cite

M. Kempanna, Pushpa G., Rangaswamy M. G., M. K. M. (2024). BERT Model Based Identification and Classification of Web Vulnerabilities Using Deep Learning Approach. International Journal of Intelligent Systems and Applications in Engineering, 12(21s), 236–248. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5415

Section

Research Article