BMIRTE: Design of a Bioinspired Model for Improving Readability of Translated Sentences via Ensemble Operations
Keywords:
Readability, Bioinspired, Corpus, Accuracy, Precision, Recall, Delay, Ensemble, Operations
Abstract
Sentence readability is measured with multiple metrics, including Flesch Reading Ease, Fog Scale, Flesch-Kincaid Grade Level, SMOG Index, Coleman-Liau Index, Automated Readability Index, Dale-Chall Readability Score, Linsear Write Formula, and their consensus. However, using these metrics individually results in uncertain sentence structures, which limits their usability. Moreover, scanning every combination of these techniques to build fused readability models is impractical and highly complex in real-time scenarios. To overcome these limitations, this text proposes the design of a novel Bioinspired Model for Improving Readability of Translated Sentences via Ensemble Operations (BMIRTE). The proposed model first collects a set of translated texts and applies stochastic ensemble readability testing via a Genetic Algorithm (GA) based process. Because it uses stochastic modelling, the proposed optimizer can identify corpus-specific readability evaluation techniques, which can be used to improve the overall readability of multiple sentence types. To perform this task, a readability-based fitness function is evaluated, which assists in identifying optimum ensemble operations. The model also tracks the iterative performance of different ensemble combinations, which helps to incrementally improve real-time readability performance for different corpus types. The proposed model was evaluated on multiple translated corpora and was observed to outperform various state-of-the-art methods in terms of readability accuracy, precision, recall, computational delay, and memory requirements. As a result, it is suitable for deployment in a wide variety of real-time post-processing scenarios for translated texts.
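The abstract does not give the chromosome encoding, the fitness function, or the GA parameters, so the following is only a minimal sketch of the general idea: a genetic algorithm searching over subsets of readability formulas. Everything in it is an assumption made for illustration: the bitmask encoding, the fitness (negative mean absolute error against hypothetical reference grade levels), the crude syllable heuristic, the letter-only character count in ARI, the made-up three-sentence corpus, and all function and parameter names. Only the three grade-level formulas themselves are the standard published ones.

```python
import random
import re

def counts(text):
    """Return (sentences, words, letters, syllables) using crude heuristics."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    letters = sum(len(w) for w in words)
    # Assumption: approximate syllables as runs of vowels, at least one per word.
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return sentences, max(1, len(words)), letters, syllables

def flesch_kincaid_grade(text):
    s, w, _, syl = counts(text)
    return 0.39 * w / s + 11.8 * syl / w - 15.59

def automated_readability_index(text):
    s, w, letters, _ = counts(text)
    return 4.71 * letters / w + 0.5 * w / s - 21.43

def coleman_liau_index(text):
    s, w, letters, _ = counts(text)
    return 0.0588 * (100 * letters / w) - 0.296 * (100 * s / w) - 15.8

METRICS = [flesch_kincaid_grade, automated_readability_index, coleman_liau_index]

def fitness(mask, corpus):
    """Hypothetical fitness: negative mean absolute error of the averaged
    selected metrics against reference grade levels."""
    chosen = [m for bit, m in zip(mask, METRICS) if bit]
    if not chosen:
        return float("-inf")  # an empty ensemble is never acceptable
    errors = []
    for text, target in corpus:
        score = sum(m(text) for m in chosen) / len(chosen)
        errors.append(abs(score - target))
    return -sum(errors) / len(errors)

def evolve(corpus, pop_size=6, generations=20, mutation_rate=0.2, seed=0):
    """Search metric subsets with truncation selection, one-point crossover,
    and bit-flip mutation."""
    rng = random.Random(seed)
    n = len(METRICS)
    population = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    population[0] = (1,) * n  # seed with the full ensemble so a valid candidate exists
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, corpus), reverse=True)
        parents = population[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: fitness(m, corpus))

# Made-up sentences paired with made-up reference grade levels.
corpus = [
    ("The cat sat on the mat.", 1.0),
    ("Translated sentences frequently contain awkward constructions.", 12.0),
    ("We went to the shop. It was closed.", 2.0),
]
best = evolve(corpus)
print(best, round(fitness(best, corpus), 2))
```

With only three formulas there are just eight possible subsets, so exhaustive search would be trivial here; the stochastic search becomes relevant when the candidate set grows to the eight metrics listed in the abstract plus weighting or fusion choices, where enumerating every combination is the cost the abstract describes as impractical.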
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers may share and adapt the material provided that they give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.