Enhancing Abstractive Text Summarization using Two-Staged Network for Telugu Language (EATS2N)

Authors

  • Srisudha Garugu, Research Scholar, Department of Computer Science & System Engineering, Andhra University College of Engineering (A), Andhra University, Visakhapatnam-530003, India
  • Lalitha Bhaskari, Professor, Department of Computer Science & System Engineering, Coordinator, IQAC, Andhra University College of Engineering (A), Andhra University, Visakhapatnam-530003, India

Keywords:

Summarization, Extractive Summary, Keywords, Telugu Newspaper, Biology Textbook

Abstract

Text summarization is the process of producing a clear, concise synopsis of a lengthy text without sacrificing its overall meaning, by concentrating on the passages that carry the important information. Extractive summaries, which reproduce significant portions of the input text, frequently hinge on crucial keywords. The vast majority of extractive summarization strategies are based on locating keywords and extracting the sentences that contain a disproportionately high number of them. Keyword extraction, in turn, typically identifies relevant terms that occur more frequently than other words and emphasizes the most significant among them. Selecting keywords manually is challenging, susceptible to inaccuracies, and demands considerable time and attention. This work proposes a technique that automatically extracts keywords from Telugu e-newspaper datasets; the extracted keywords are then used for text summarization. The proposed method is evaluated on two different datasets, a Telugu newspaper corpus and a biology textbook, and performance is compared using accuracy and ROUGE score values.
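
The frequency-and-keyword recipe the abstract describes can be made concrete in a few lines. The sketch below is an illustrative baseline, not the paper's actual pipeline: it assumes whitespace tokenization and period-terminated sentences (both rough approximations for Telugu), and the function names, stop-word handling, and `top_k` threshold are hypothetical choices.

```python
# Minimal sketch: frequency-based keyword extraction followed by
# keyword-count sentence scoring. Tokenization, sentence splitting,
# and thresholds are illustrative placeholders.
from collections import Counter


def extract_keywords(text, top_k=10, stopwords=frozenset()):
    """Rank terms by raw frequency and keep the top_k most frequent."""
    tokens = [t for t in text.split() if t not in stopwords]
    return [term for term, _ in Counter(tokens).most_common(top_k)]


def summarize(text, num_sentences=3, top_k=10, stopwords=frozenset()):
    """Score each sentence by how many keywords it contains and keep
    the highest-scoring sentences in their original document order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    keywords = set(extract_keywords(text, top_k, stopwords))
    scored = sorted(
        enumerate(sentences),
        key=lambda pair: sum(w in keywords for w in pair[1].split()),
        reverse=True,
    )[:num_sentences]
    # Restore document order so the summary reads coherently.
    return ". ".join(s for _, s in sorted(scored)) + "."


# Usage (hypothetical input): pass a news article as a plain string.
# print(summarize(article_text, num_sentences=3))
```

Evaluating such output against reference summaries would typically use a standard ROUGE implementation (for example, the `rouge-score` Python package) rather than hand-computed overlap.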




Published

24.03.2024

How to Cite

Garugu, S., & Bhaskari, L. (2024). Enhancing Abstractive Text Summarization using Two-Staged Network for Telugu Language (EATS2N). International Journal of Intelligent Systems and Applications in Engineering, 12(19s), 686–695. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5179

Issue

Vol. 12 No. 19s (2024)

Section

Research Article