Recognition of Historical Kannada Manuscripts using Convolution Neural Network

Authors

  • Rajithkumar B. K, B. V. Uma, H. S. Mohana

Keywords:

Historical Kannada documents, Image recognition, Document analysis, Cultural heritage preservation, Convolutional Neural Network

Abstract

Document image analysis has grown steadily in importance over the last few decades. Historical documents add the further challenge of physical degradation, which must be handled during pre-processing. The present work focuses on the classification and identification of characters in old Kannada stone inscriptions. The inscription images are segmented into lines, words, and characters for easier processing. The segmented images are then preprocessed to extract the essential features and remove redundancies. The preprocessed data is augmented to compensate for the scarcity of available datasets, and the augmented set serves as data for the training phase. A Convolutional Neural Network (CNN) is selected as the machine learning model; the classifier is trained and its performance is evaluated. The model developed for recognizing Kannada characters achieved a validation accuracy of 95.9%. This result is a notable step toward processing and digitizing ancient Kannada scripts, given the complex nature of the language and the diverse characteristics of individual handwriting.
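
The abstract does not say which segmentation method was used. As an illustration only, the sketch below uses a projection-profile approach, a common baseline for splitting a binarized page into lines and then into glyph columns. The function names (`split_runs`, `segment_lines`, `segment_columns`) and the toy image are hypothetical and are not taken from the paper.

```python
def split_runs(profile, threshold=0):
    """Return (start, end) index pairs for consecutive entries above threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of ink begins
        elif value <= threshold and start is not None:
            runs.append((start, i))        # the run ends at a blank entry
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image):
    """Split a binary image (rows of 0/1, 1 = ink) into text-line row ranges."""
    row_profile = [sum(row) for row in image]
    return split_runs(row_profile)


def segment_columns(image, top, bottom):
    """Split one text line into column ranges separated by blank columns."""
    width = len(image[0])
    col_profile = [sum(image[r][c] for r in range(top, bottom))
                   for c in range(width)]
    return split_runs(col_profile)


# Toy 6x7 "page": two text lines, the first holding two glyphs.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
lines = segment_lines(page)                 # row ranges of each text line
glyphs = segment_columns(page, *lines[0])   # column ranges in the first line
```

Real stone-inscription images would likely need a noise-tolerant threshold rather than zero, since degradation tends to leave stray ink pixels in otherwise blank rows and columns.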

Published

16.03.2024

How to Cite

B. V. Uma, H. S. Mohana, R. B. K. (2024). Recognition of Historical Kannada Manuscripts using Convolution Neural Network. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 1138–1147. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5393

Section

Research Article