Semantic and Linguistic Based Short Answer Scoring System
Keywords:
Semantic, short answer scoring, XLNet, LSTM, adversarial responses
Abstract
In natural language processing (NLP), automatic short answer scoring is an essential educational application. It can relieve the burden of manual assessment while improving the reliability and consistency of evaluation. With the advancement of text embedding libraries and neural network models, these systems have achieved good accuracy. However, the ultimate goal is to embed the given text (student responses) into vectors that preserve coherence and semantics, and to provide feedback to students. This paper presents a novel approach that addresses these challenges using semantic and linguistic embedding techniques. Specifically, we use XLNet, a transformer model, to convert essays into vectors. Long Short-Term Memory (LSTM) networks are then trained on these vectors to capture the connectivity between sentences and their underlying semantics. To evaluate our approach, we use our own domain-specific dataset, which comprises approximately 2500 responses from 650 students. Our model achieves an average Quadratic Weighted Kappa (QWK) score of 0.76 on the training and testing datasets, and it outperforms other existing models. We further assessed the robustness of our models by testing them with adversarial responses, and the outcomes were satisfactory.
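For readers unfamiliar with the metric reported above, the following is a minimal sketch of how Quadratic Weighted Kappa can be computed between human and model scores. It uses the standard QWK definition; the function name, the score range, and the example scores are illustrative assumptions, not the authors' evaluation code or data.

```python
def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Quadratic Weighted Kappa between two lists of integer scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(human) != len(predicted):
        raise ValueError("score lists must have the same length")
    n = max_score - min_score + 1  # number of distinct score levels
    total = len(human)
    if n < 2 or total == 0:
        raise ValueError("need at least two score levels and one response")

    # Observed agreement matrix: rows = human score, columns = predicted score.
    observed = [[0] * n for _ in range(n)]
    for h, p in zip(human, predicted):
        observed[h - min_score][p - min_score] += 1

    # Marginal histograms of each rater's scores.
    hist_h = [sum(row) for row in observed]
    hist_p = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_h[i] * hist_p[j] / total
            num += weight * observed[i][j]
            den += weight * expected

    if den == 0:
        # Both raters used one identical score for every response;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return 1.0 - num / den


if __name__ == "__main__":
    # Made-up scores on an assumed 0-3 scale, for demonstration only.
    human_scores = [0, 1, 2, 3, 2, 1]
    model_scores = [0, 1, 2, 2, 2, 0]
    print(quadratic_weighted_kappa(human_scores, model_scores, 0, 3))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. The quadratic weight penalizes a prediction more heavily the further it falls from the human score.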
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers who share or adapt the material must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.