Random Table and Its Ground Truth Automatic Generation: A Tool for Table Understanding Research
Abstract
We developed a software tool to assist table understanding research. It can analyze any given table ground truth and generate documents that contain similar table elements while offering more variety in both the table and non-table parts. Based on our novel content-matching ground-truthing idea, the table ground truth data for the generated table elements become available with little manual work. The validity of the proposed strategy was confirmed in the development of our table detection algorithm. We have made this software package publicly available.
Similar Resources
Automatic Table Ground Truth Generation and a Background-Analysis-Based Table Structure Extraction Method
In this paper, we first describe an automatic table ground truth generation system which can efficiently generate a large amount of accurate table ground truth suitable for the development of table detection algorithms. Then a novel background-analysis-based, coarse-to-fine table identification algorithm and an X-Y cut table decomposition algorithm are described. We discuss an experimental prot...
Table Metadata: Headers, Augmentations and Aggregates
A sample of 200 web tables was interactively converted into layout-independent Augmented Wang Notation (AWN) using the Table Abstraction Tool (TAT). The resulting XML ground-truth files list for each table (1) cell contents, (2) relationships between the hierarchical column and row headers and the value/content/data cells, (3) designators for aggregates like totals and averages, and (4) ancilla...
Table structure understanding and its performance evaluation
With the large number of existing documents and the increasing speed in the production of new documents, finding efficient methods to process these documents for their content retrieval and storage becomes critical. Tables are a popular and efficient document element type. Therefore, table structure understanding is an important problem in the document layout analysis field. This paper presents...
Application of Regional Input-Output Table Generated by GRIT Method for Examining Employment Generation and Importance of Housing Sector in Isfahan Province
The history of regional input-output tables goes back to 1950, when the idea was proposed by Walter Isard. In Iran, the generation of regional accounts started in the fifth development plan before the Islamic Revolution, but these accounts have been generated only for some provinces. The generation and completion of these accounts and tables on the basis of surveying and mechanical methods or a ...
Automatic understanding of group behavior using fuzzy temporal logic
Automatic behavior understanding refers to the generation of situation descriptions from machine perception. World models created through machine perception can be used by a reasoning engine to deduce knowledge about the observed scene. For this study, the required machine perception is annotated, allowing us to focus on the reasoning problem. The applied reasoning engine is based on fuzzy metr...