Integrative Deep Learning Strategy for Table Structure Classification and Recognition

Authors

  • Sridhar Patthi, Sivaneasan Bala, T Sathish Kumar, Prasun Chakrabarti

Keywords:

Classification, structures, endorsement, learning, machine intelligence, machine learning, machine vision, image identification

Abstract

This paper presents a method that uses deep neural networks to identify tables in documents. Traditional table detection methods rely on dataset-specific heuristics that are prone to error. By contrast, the proposed method leverages data to identify tables in any arrangement. Table structure recognition (TSR) is the main focus of the investigation, which targets reproducible and replicable specification of table cells in digital documents. The proposed model identifies and extracts table cells in structured, semi-structured, and unstructured visual formats. Tables are crucial for presenting structured and semi-structured data, and they support retrieval and analysis in databases. Table recognition involves identifying both the logical structure (cell relationships and spanning) and the physical structure (bounding boxes or cell content locations). Existing methods predict logical structure well but struggle to predict physical structure, such as bounding boxes, accurately, even though it is vital for tasks like text extraction and table quality assurance. This work introduces a sequential coordinate decoding approach that improves bounding box accuracy by incorporating more visual information. The coordinate sequence decoder obtains global context from the logical structure decoder's representation, but that representation lacks the local visual details that are essential for accurate bounding box prediction. Using deep learning techniques, this work addresses this challenge and reports an accuracy of 88.75%.
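
As a rough illustration of the distinction between logical and physical structure, and of what it can mean to decode coordinates as a sequence, the following Python sketch stores both kinds of structure for one cell and converts a bounding box into discrete coordinate tokens and back. The names Cell, quantize_bbox, and dequantize_bbox, and the choice of 1000 bins, are assumptions made for illustration and are not taken from the paper. The quantization shown is the generic scheme used by sequence-based detectors, not necessarily the authors' decoder.

```python
from dataclasses import dataclass
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Cell:
    # Logical structure: where the cell sits in the grid and what it spans.
    row: int
    col: int
    row_span: int
    col_span: int
    # Physical structure: where the cell sits on the page image.
    bbox: BBox


def quantize_bbox(bbox: BBox, width: float, height: float,
                  num_bins: int = 1000) -> List[int]:
    """Map pixel coordinates to integer tokens in [0, num_bins - 1]."""
    sizes = (width, height, width, height)
    tokens = []
    for value, size in zip(bbox, sizes):
        token = int(value * num_bins / size)
        # Clip so that a coordinate on the image border stays in range.
        tokens.append(min(num_bins - 1, max(0, token)))
    return tokens


def dequantize_bbox(tokens: List[int], width: float, height: float,
                    num_bins: int = 1000) -> BBox:
    """Map tokens back to pixel coordinates, using the centre of each bin."""
    sizes = (width, height, width, height)
    x0, y0, x1, y1 = ((t + 0.5) / num_bins * s for t, s in zip(tokens, sizes))
    return (x0, y0, x1, y1)


# A header cell spanning two columns on a 500 x 200 pixel table image.
cell = Cell(row=0, col=0, row_span=1, col_span=2, bbox=(50.0, 20.0, 250.0, 60.0))
tokens = quantize_bbox(cell.bbox, width=500, height=200)
recovered = dequantize_bbox(tokens, width=500, height=200)
```

In this sketch a decoder would emit the four tokens one after another, and the round trip loses at most half a bin width per coordinate, so the bin count bounds the precision of the predicted box.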

Published

12.06.2024

How to Cite

Sridhar Patthi. (2024). Integrative Deep Learning Strategy for Table Structure Classification and Recognition. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 5182–5191. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/7313

Section

Research Article