Boosting Handwritten Arabic Text Recognition using Deep Autoencoders and Data Augmentation Techniques

Authors

  • Hicham Lamtougui, Computer Science Department, Faculty of Sciences Dhar EL Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
  • Hicham El Moubtahij, Higher School of Technology, University of Ibn Zohr, Agadir, Morocco
  • Hassan Fouadi, Computer Science Department, Faculty of Sciences Dhar EL Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
  • Khalid Satori, Computer Science Department, Faculty of Sciences Dhar EL Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco

Keywords:

handwritten characters, Deep Learning, data augmentation, Variational AE

Abstract

Recognizing handwritten characters and digits is a difficult pattern recognition problem, particularly for Arabic. Although automatic recognition of handwritten Latin characters has advanced considerably, methods for Arabic remain insufficient. Deep learning, and auto-encoders (AE) in particular, offers new perspectives for handwriting recognition. In this article, we present the most popular types of AE, namely Convolutional AE (CAE), Sparse AE (SAE), Denoising AE (DAE), and Variational AE (VAE), and evaluate their performance on two reference databases: the Modified Arabic Digits dataBase (MADBase) and the Arabic Handwritten Character Dataset (AHCD). With data augmentation applied to improve results, the VAE achieved higher accuracy than the other deep learning algorithms on both databases: 98.77% on MADBase and 98.42% on AHCD.
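
For readers unfamiliar with the VAE named in the abstract, the following is a minimal, generic sketch of the two ingredients that distinguish it from a plain auto-encoder: the reparameterized latent sample and the KL penalty toward a standard normal prior. It is not the authors' model; the abstract does not give their architecture, loss weighting, or hyperparameters, and the function names here (`reparameterize`, `kl_to_standard_normal`, `negative_elbo`) as well as the Bernoulli reconstruction term are assumptions made for illustration only.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def bernoulli_recon_loss(x, x_recon, eps=1e-7):
    """Binary cross-entropy between pixels x in [0, 1] and reconstructed pixels x_recon."""
    total = 0.0
    for xi, ri in zip(x, x_recon):
        ri = min(max(ri, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= xi * math.log(ri) + (1.0 - xi) * math.log(1.0 - ri)
    return total


def negative_elbo(x, x_recon, mu, log_var):
    """Per-sample VAE training loss: reconstruction term plus KL term."""
    return bernoulli_recon_loss(x, x_recon) + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)                      # fixed seed for repeatability
    mu, log_var = [0.0, 1.0], [0.0, -1.0]       # hypothetical encoder outputs
    z = reparameterize(mu, log_var, rng)        # latent code passed to a decoder
    print(len(z), kl_to_standard_normal(mu, log_var))
```

In a real model the encoder would produce `mu` and `log_var` from an input image and a decoder would map `z` back to pixel space; both networks are omitted here.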

Published

21.09.2023

How to Cite

Lamtougui, H., Moubtahij, H. E., Fouadi, H., & Satori, K. (2023). Boosting Handwritten Arabic Text Recognition using Deep Autoencoders and Data Augmentation Techniques. International Journal of Intelligent Systems and Applications in Engineering, 11(4), 800–809. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3613

Section

Research Article