DOI: 10.1145/1860559.1860617 (ACM DocEng conference proceedings)
poster

Towards a common evaluation strategy for table structure recognition algorithms

Online: 21 September 2010

ABSTRACT

A number of methods for evaluating table structure recognition systems have been proposed in the literature, which have been used successfully for automatic and manual optimization of their respective algorithms. Unfortunately, the lack of standard, ground-truthed datasets coupled with the ambiguous nature of how humans interpret tabular data has made it difficult to compare the obtained results between different systems developed by different research groups.

With reference to these approaches, we describe our experiences in comparing our algorithm for table detection and structure recognition to another recently published system using a freely available dataset of 75 PDF documents. Based on examples from this dataset, we define several classes of errors and propose how they can be treated consistently to eliminate ambiguities and ensure the repeatability of the results and their comparability between different systems from different research groups.

