Unsupervised Numerical Information Extraction via Exploiting Syntactic Structures
Abstract
Numerical information plays an important role in various fields such as science, finance, social studies, statistics, and news. Most prior studies adopt unsupervised methods that rely on complex handcrafted pattern-matching rules to extract numerical information, which can be difficult to scale to the open domain. Other, supervised methods require extra time, cost, and knowledge to design, understand, and annotate the training data. To address these limitations, we propose QuantityIE, a novel approach to extracting numerical information as structured representations by exploiting syntactic features of both constituency parsing (CP) and dependency parsing (DP). The extraction results may also serve as distant supervision for zero-shot model training. Our approach outperforms existing methods from two perspectives: (1) the rules are simple yet effective, and (2) the results are more self-contained. We further propose a numerical information retrieval approach based on QuantityIE to answer analytical queries. Experimental results demonstrate the effectiveness of QuantityIE in extracting numerical information with high fidelity.
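As a rough illustration of what exploiting dependency structure for quantity extraction can look like, the sketch below walks `nummod` edges in a hand-written dependency parse and returns (value, unit, related word) triples. It is not the QuantityIE rule set, which the abstract does not spell out. The `Token` class, the `extract_quantities` function, the Universal Dependencies style labels, and the example parse are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    text: str
    head: int  # index of the syntactic head; the root points to itself
    dep: str   # dependency label (Universal Dependencies style, assumed)


def extract_quantities(tokens: List[Token]) -> List[Tuple[str, str, Optional[str]]]:
    """Return (value, unit, related_word) triples found via `nummod` edges."""
    triples = []
    for tok in tokens:
        if tok.dep != "nummod":
            continue
        unit = tokens[tok.head]        # the noun the number modifies
        governor = tokens[unit.head]   # the word that noun attaches to
        related = governor.text if governor is not unit else None
        triples.append((tok.text, unit.text, related))
    return triples


# Hand-written parse of "The bridge spans 300 meters" (illustrative only,
# not the output of any real parser).
sentence = [
    Token("The", 1, "det"),
    Token("bridge", 2, "nsubj"),
    Token("spans", 2, "ROOT"),
    Token("300", 4, "nummod"),
    Token("meters", 2, "obj"),
]

print(extract_quantities(sentence))  # [('300', 'meters', 'spans')]
```

In practice the parse would come from a dependency parser rather than being written by hand, and a fuller system would also use constituency spans to recover multi-word units and entities.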
Similar resources
Exploiting Rich Syntactic Information for Relationship Extraction from Biomedical Articles
This paper proposes a ternary relation extraction method primarily based on rich syntactic information. We identify PROTEIN-ORGANISM-LOCATION relations in the text of biomedical articles. Different kernel functions are used with an SVM learner to integrate two sources of information from syntactic parse trees: (i) a large number of syntactic features that have been shown useful for Semantic Rol...
Exploiting Syntactic and Semantic Information for Relation Extraction from Wikipedia
The exponential growth of Wikipedia has recently attracted the attention of a large number of researchers and practitioners. One of the current challenges for Wikipedia is to make the encyclopedia processable by machines. In this paper, we deal with the problem of extracting relations between entities from Wikipedia’s English articles, which can straightforwardly be transformed into Semantic Web meta...
Exploiting Rich Syntactic Information for Relation Extraction from Biomedical Articles
This paper proposes a ternary relation extraction method primarily based on rich syntactic information. We identify PROTEIN-ORGANISM-LOCATION relations in the text of biomedical articles. Different kernel functions are used with an SVM learner to integrate two sources of information from syntactic parse trees: (i) a large number of syntactic features that have been shown useful for Semantic Rol...
Meanings via Syntactic Structures
Chomsky (1957) offered prescient suggestions about how to formulate theories of understanding for the spoken languages that human children can naturally acquire. We can view his proposal as a prolegomenon to a theory of meaning that combines a layered theory of syntax with an account of how humans can naturally use expressions in acts of referring, asserting, querying, and so on; cp. Austin (196...
Exploiting Hierarchical Structures for Unsupervised Feature Selection
Feature selection has been proven to be effective and efficient in preparing high-dimensional data for many mining and learning tasks. Features of real-world high-dimensional data such as words of documents, pixels of images and genes of microarray data, usually present inherent hierarchical structures. In a hierarchical structure, features could share certain properties. Such information has b...
Journal
Journal title: Electronics
Year: 2023
ISSN: 2079-9292
DOI: https://doi.org/10.3390/electronics12091977