Text Simplification Improves Text Translation from Gujarati Regional Language to English: An Experimental Study

Authors

  • Dhawal Khem, Shailesh Panchal, Chetan Bhatt

Keywords:

Text Simplification, Text Translation, Text Readability, Natural Language Processing, Indian Language

Abstract

Text translation plays an important role in extending the reach of information technology to a large portion of the population by helping to overcome language barriers. An adequately translated and simplified text improves the quality of communication. In recent years, many researchers have published work on text translation. However, the grammar and complex sentence formation of regional languages remain bottlenecks for effective translation. Several researchers have proposed applying text simplification before translation, since simplification improves the readability and understandability of the text. Even so, simplifying and translating regional languages is still a challenging task. Gujarati is an Indian regional language. In this paper, an experimental setup is proposed for improved text translation from Gujarati to English. The results show that text simplification improves the quality of translation. We also experimented with translation from Hindi, the Indian national language, which likewise showed an improvement in translation results.

Author Biography

Dhawal Khem*1, Shailesh Panchal2, Chetan Bhatt3

*1Ph.D. Scholar, Computer Engineering, Gujarat Technological University (GTU), Ahmedabad (Gujarat), India. ORCID ID: 0000-0002-8064-5954

2Professor, PG-Cyber Security, Graduate School of Engineering & Technology (GSET), Ahmedabad (Gujarat), India.

3Professor, MCA Department, K. K. Shashtri College, Ahmedabad (Gujarat), India.

* Corresponding Author Email: khemdhawal@gmail.com

Figure: BLEU score improvement graph

Published

27.01.2023

How to Cite

Dhawal Khem, Shailesh Panchal, Chetan Bhatt. (2023). Text Simplification Improves Text Translation from Gujarati Regional Language to English: An Experimental Study. International Journal of Intelligent Systems and Applications in Engineering, 11(2s), 316–327. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2699

Section

Research Article