Comparative Analysis of Lexical Chain, Bidirectional Encoder Representations from Transformers (BERT) and Graph based Approaches for Condensation of Low Resource Language Documents

Pranjali Deshpande; Sunita Jahirabadkar

Authors

Pranjali Deshpande, Department of Computer Engineering, MKSSS’s Cummins College of Engg. for Women, Pune-52, India
Sunita Jahirabadkar, Department of Computer and Information Technology (CI), Department of Technology, SPPU, Pune-07, India

Keywords:

Automatic Summarization, BERT, Extractive Summarization, Graph based approach, Low Resource Languages, Lexical Chain

Abstract

Language is the primary means of human communication. Around 7000 different languages are spoken in the world. Among them, Low Resource Languages (LRL) are those that lack the linguistic resources required to build statistical NLP applications. Writing is the most common way for people to express and store their thoughts. Technological developments are making the world smaller by making long-distance communication more accessible. With the rise in internet usage, new textual content is created every second, and not all of the information in this text is useful. Document condensation, or summarization, is therefore becoming an increasingly important task. There are two methods for creating summaries: extractive and abstractive. An extractive summary retains essential phrases and sentences from the original document, whereas an abstractive summary is created by rephrasing the main sentences. Summarization becomes more difficult for LRL documents. This study focuses on the condensation, or automatic summarization, of LRL documents using BERT, lexical chain, and graph-based approaches.
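
To make the extractive idea concrete, the following is a minimal sketch of a graph-based extractive summarizer in the spirit of TextRank. It is not the method used in this study, whose details are not given in the abstract. The function names (`split_sentences`, `summarize`), the bag-of-words cosine similarity, the damping factor, and the naive sentence splitter are all assumptions made for illustration. Sentences are treated as graph nodes, word overlap gives the edge weights, and the highest-ranked sentences are returned unchanged in their original order.

```python
import math
import re
from collections import Counter

PUNCT = ".,;:!?\"'()\u0964"  # includes the Devanagari danda (assumed sentence mark)


def split_sentences(text):
    # Naive splitter on ., !, ? and the danda; a real system would use a
    # language-specific tokenizer.
    parts = re.split(r"(?<=[.!?\u0964])\s+", text.strip())
    return [p for p in parts if p]


def bag_of_words(sentence):
    # Whitespace tokenization with surrounding punctuation stripped.
    words = (w.strip(PUNCT) for w in sentence.lower().split())
    return Counter(w for w in words if w)


def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return num / den if den else 0.0


def summarize(text, k=2, damping=0.85, iterations=50):
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= k:
        return sentences
    bags = [bag_of_words(s) for s in sentences]
    # Weighted, undirected sentence graph: edge weight = cosine similarity.
    sim = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    # PageRank-style power iteration over the sentence graph.
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    # Extractive: return the selected sentences verbatim, in document order.
    return [sentences[i] for i in sorted(ranked)]
```

Because the output consists only of sentences copied from the input, this is an extractive method; an abstractive system would instead generate new wording. For an LRL, the tokenization and similarity steps are where missing linguistic resources are most likely to matter.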

Published

23.02.2024

How to Cite

Deshpande, P., & Jahirabadkar, S. (2024). Comparative Analysis of Lexical Chain, Bidirectional Encoder Representations from Transformers (BERT) and Graph based Approaches for Condensation of Low Resource Language Documents. International Journal of Intelligent Systems and Applications in Engineering, 12(16s), 08–15. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4777

Issue

Vol. 12 No. 16s (2024)

Section

Research Article

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
