Random Table and Its Ground Truth Automatic Generation: A Tool for Table Understanding Research

Authors

  • Yalin Wang
  • Ihsin T. Phillips
  • Robert Haralick
Abstract

We developed a software tool to assist table understanding research. It can analyze any given table ground truth and generate documents that include similar table elements while offering more variety in both the table and non-table parts. Based on our novel content-matching ground-truthing idea, the table ground truth data for the generated table elements becomes available with little manual work. The validity of the proposed strategy was confirmed during the development of our table detection algorithm. We made this software package publicly available.

Similar resources

Automatic Table Ground Truth Generation and a Background-Analysis-Based Table Structure Extraction Method

In this paper, we first describe an automatic table ground truth generation system which can efficiently generate a large amount of accurate table ground truth suitable for the development of table detection algorithms. Then a novel background-analysis-based, coarse-to-fine table identification algorithm and an X-Y cut table decomposition algorithm are described. We discuss an experimental prot...

Table Metadata: Headers, Augmentations and Aggregates

A sample of 200 web tables was interactively converted into layout-independent Augmented Wang Notation (AWN) using the Table Abstraction Tool (TAT). The resulting XML ground-truth files list for each table (1) cell contents, (2) relationships between the hierarchical column and row headers and the value/content/data cells, (3) designators for aggregates like totals and averages, and (4) ancilla...

Table structure understanding and its performance evaluation

With the large number of existing documents and the increasing speed in the production of new documents, finding efficient methods to process these documents for their content retrieval and storage becomes critical. Tables are a popular and efficient document element type. Therefore, table structure understanding is an important problem in the document layout analysis field. This paper presents...

Application of Regional Input-Output Table Generated by GRIT Method for Examining Employment Generation and Importance of Housing Sector in Isfahan Province

The historical background of regional input-output tables goes back to 1950, when the idea was proposed by Walter Isard. In Iran, the generation of regional accounts started in the fifth development plan before the Islamic Revolution, but these accounts have been generated only for some provinces. The generation and completion of these accounts and tables on the basis of surveying and mechanical methods or a ...

Automatic understanding of group behavior using fuzzy temporal logic

Automatic behavior understanding refers to the generation of situation descriptions from machine perception. World models created through machine perception can be used by a reasoning engine to deduce knowledge about the observed scene. For this study, the required machine perception is annotated, allowing us to focus on the reasoning problem. The applied reasoning engine is based on fuzzy metr...

Publication date: 2001