TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution
Authors
Abstract
Text detection has been significantly boosted by the development of deep neural networks, but most existing methods focus on a single kind of text instance (i.e., overlaid text, layered text, scene text). In this paper, we expand the text detection task from a single dimension to multiple dimensions, thus providing multi-type text descriptions for scene and content analysis of videos. Specifically, we establish a new task to detect and classify text instances simultaneously, termed TextDC. As far as we know, existing benchmarks cannot meet the requirements of the proposed task. To this end, we collect a large-scale text detection and classification dataset, named Text3C, which is annotated using multilingual labels, location information, and text categories. Together with the collected dataset, we introduce a multi-stage and strict evaluation metric, which penalizes detection approaches for missing text instances, false positive detection, inaccurate location boxes, and error text categories, developing a new benchmark for the proposed TextDC task. In addition, we extend several state-of-the-art text detectors by modifying the prediction head to solve the new task. Then, a generalized text detection and classification framework is designed and formulated. Extensive experiments using the updated methods and the proposed framework are conducted on the established benchmark to verify the solvability of the proposed task, the challenges of the dataset, and the effectiveness of the proposed solution.
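The abstract names four failure modes that the evaluation metric penalizes: missing instances, false positives, inaccurate boxes, and wrong categories. The following is a minimal sketch of how a strict detect-and-classify score could cover all four. It is not the paper's metric: the `TextBox` type, the axis-aligned boxes, the greedy one-to-one matching, the IoU threshold of 0.5, and the category names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBox:
    """Hypothetical axis-aligned text box with a category label."""
    x1: float
    y1: float
    x2: float
    y2: float
    category: str


def iou(a: TextBox, b: TextBox) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a.x2, b.x2) - max(a.x1, b.x1)
    ih = min(a.y2, b.y2) - max(a.y1, b.y1)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    return inter / (area_a + area_b - inter)


def strict_scores(preds, gts, iou_thresh=0.5):
    """Return (precision, recall, f_score) under a strict matching rule.

    A prediction is a true positive only if it overlaps an unmatched
    ground-truth box with IoU >= iou_thresh (assumed value) AND carries
    the same category. Everything else lowers precision or recall.
    """
    matched = set()
    true_pos = 0
    for pred in preds:
        best_j, best_iou = None, iou_thresh
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(pred, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            continue  # false positive or inaccurate box
        matched.add(best_j)  # the box is consumed even if the category is wrong
        if pred.category == gts[best_j].category:
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(gts) if gts else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score
```

Under this sketch, a prediction whose box matches but whose category is wrong counts against both precision and recall, which is one way to read "strict"; a stage-wise variant could instead report localization and classification scores separately.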
Similar resources
A new multidimensional model with text dimensions: definition and implementation
We present a new multidimensional model with textual dimensions based on a knowledge structure extracted from the texts, where any textual attribute in a database can be processed, and not only XML texts. This dimension allows to treat the textual data in the same way as the non-textual one in an automatic way, without user’s intervention, so all the classical operations in the multidimensional...
RCV1: A New Benchmark Collection for Text Categorization Research
Reuters Corpus Volume I (RCV1) is an archive of over 800,000 manually categorized newswire stories recently made available by Reuters, Ltd. for research purposes. Use of this data for research on text categorization requires a detailed understanding of the real world constraints under which the data was produced. Drawing on interviews with Reuters personnel and access to Reuters documentation, ...
TextZoo, a New Benchmark for Reconsidering Text Classification
Text representation is a fundamental concern in Natural Language Processing, especially in text classification. Recently, many neural network approaches with delicate representation model (e.g. FASTTEXT, CNN, RNN and many hybrid models with attention mechanisms) claimed that they achieved state-of-art in specific text classification datasets. However, it lacks an unified benchmark to compare th...
Solution of security constrained unit commitment problem by a new multi-objective optimization method
Abstract: Optimal power flow, as one of the fundamental tools for analyzing complex power systems, has been studied for a long time. Optimal power flow optimizes the objective functions of a power system, including fuel cost, emissions, and losses, while also satisfying the power system's constraints. In its most general form, OPF is a nonlinear, non-convex, large-scale, static optimization problem that can include continuous and ...
Multidimensional Database Design via Schema Transformation: Turning TPC-H into the TPC-H*d Multidimensional Benchmark
Compared to relational databases, multidimensional database systems enhance data presentation and navigation through intuitive spreadsheet like views and increase performance through aggregated data. In this paper, we present a framework for automating multidimensional database schema design. We successfully used the framework to revolve the well known TPC-H benchmark to become a multidimension...
Journal
Journal title: Electronics
Year: 2022
ISSN: 2079-9292
DOI: https://doi.org/10.3390/electronics12010159