Global Attention on BiLSTMs with BPE for English to Telugu CLIR

Authors

  • B N V Narasimha Raju, Research Scholar, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram-522502, India
  • K. V. V. Satyanarayana, Professor, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram-522502, India
  • M. S. V. S. Bhadri Raju, Professor, Department of Computer Science and Engineering, S R K R Engineering College, Bhimavaram-534204, India

Keywords

Attention, Global Attention, Cross-Language IR, Bidirectional LSTMs, Byte Pair Encoding, Preprocessing, NMT

Abstract

Effective Cross-Lingual Information Retrieval (CLIR) relies heavily on accurate query translation, which is typically accomplished through Neural Machine Translation (NMT), a widely used method for translating queries from one language to another. In the present work, NMT is used to translate English queries into the Indian language Telugu. NMT requires a parallel corpus for training; however, English-Telugu is a resource-poor pair, so the required amount of parallel data may not be available, and NMT consequently struggles with problems such as Out-Of-Vocabulary (OOV) words. The Byte Pair Encoding (BPE) mechanism helps solve the OOV problem in resource-poor languages: it segments rare words into subword units and translates those units instead. Translation quality also suffers on named entities; this Named Entity Recognition (NER) problem can be addressed with Bidirectional Long Short-Term Memories (BiLSTMs), which train over the dataset in both the forward and backward directions and thereby help recognize named entities. These mechanisms are sufficient for sentences without long-range dependencies but falter when such dependencies are present. Global attention, a coupling between the encoder and decoder in NMT, addresses this challenge and proves beneficial in enhancing translation quality, particularly for source sentences with long-range dependencies. Bilingual Evaluation Understudy (BLEU) scores and other evaluation parameters show that global attention on BiLSTMs with BPE translates source sentences more effectively than the regular models.
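
To make the two core mechanisms concrete, here is a minimal Python sketch of BPE merge learning in the style of Sennrich et al.'s original algorithm (an illustrative toy, not the authors' implementation; the corpus and all names are assumptions):

    import re
    from collections import Counter

    def get_pair_stats(vocab):
        """Count adjacent symbol pairs, weighted by word frequency."""
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for i in range(len(symbols) - 1):
                pairs[(symbols[i], symbols[i + 1])] += freq
        return pairs

    def merge_pair(pair, vocab):
        """Rewrite the vocabulary with the chosen pair fused into one symbol."""
        bigram = re.escape(' '.join(pair))
        pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
        return {pattern.sub(''.join(pair), word): freq for word, freq in vocab.items()}

    # Toy corpus: each word is a space-separated symbol sequence ending in </w>.
    vocab = {'l o w </w>': 5, 'l o w e r </w>': 2,
             'n e w e s t </w>': 6, 'w i d e s t </w>': 3}

    for _ in range(10):                   # learn 10 merge operations
        stats = get_pair_stats(vocab)
        if not stats:
            break
        best = max(stats, key=stats.get)  # most frequent pair, e.g. ('e', 's')
        vocab = merge_pair(best, vocab)
        print(best)

Replaying the learned merges on an unseen word such as "lowest" segments it into the known units "low" and "est" instead of producing an OOV token, which is how BPE sidesteps the OOV problem. Global attention can be sketched in the same spirit as a weighted summary of all encoder states, following Luong et al.'s dot-score formulation (again a sketch under assumed shapes, not the paper's code):

    import numpy as np

    def global_attention(decoder_state, encoder_states):
        """Luong-style global attention with the dot score.
        decoder_state:  (d,)   current target hidden state h_t.
        encoder_states: (T, d) all source states, e.g. concatenated BiLSTM outputs.
        """
        scores = encoder_states @ decoder_state  # alignment score per source position
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # softmax over all source positions
        context = weights @ encoder_states       # context vector c_t
        return context, weights

Because the attention weights range over every source position, each decoder step can look directly at distant source words, which is why global attention helps with the long-range dependencies mentioned above.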



Published

07.01.2024

How to Cite

Raju, B. N. V. N., Satyanarayana, K. V. V., & Raju, M. S. V. S. B. (2024). Global Attention on BiLSTMs with BPE for English to Telugu CLIR. International Journal of Intelligent Systems and Applications in Engineering, 12(10s), 152–159. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4357

Issue

Vol. 12 No. 10s (2024)

Section

Research Article