Search results for: entity resolution

Number of results: 429428

2015
John R. Talburt

Inverted indexing is a commonly used technique for improving the performance of entity resolution algorithms by reducing the number of pair-wise comparisons necessary to arrive at acceptable results. This chapter describes how inverted indexing can also be used as a data partitioning strategy to perform entity resolution on large datasets in a distributed processing environment. This chapter di...
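The idea in this abstract, using an inverted index to limit pairwise comparisons, can be illustrated with a minimal sketch. It is not the chapter's method: the function name `candidate_pairs`, the whitespace tokenization, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records):
    """Return pairs of record ids that share at least one token.

    `records` maps a record id to a text value (hypothetical layout).
    """
    # Inverted index: token -> set of record ids containing that token.
    index = defaultdict(set)
    for rec_id, text in records.items():
        for token in text.lower().split():
            index[token].add(rec_id)

    # Only records that appear together in some posting list are compared.
    pairs = set()
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical sample data, not taken from the source.
records = {
    1: "Ann Smith",
    2: "A. Smith",
    3: "Bob Jones",
    4: "B. Jones",
}
pairs = candidate_pairs(records)
all_pairs = len(records) * (len(records) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

Each posting list in the index could also serve as a unit of work for a separate worker, which is one way to read the partitioning strategy the abstract mentions; the sketch above does not implement that distribution step.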

2015
Fumiko Kobayashi John R. Talburt

Entity Identity Information Management (EIIM) systems provide the information technology support for Master Data Management (MDM) systems. One of the most important configurations of an EIIM system is identity resolution. In an identity resolution configuration, the EIIM system accepts a batch of entity references and returns the corresponding entity identifiers. However, these batch EIIM syste...

2014
Mingyuan Cui Qing Wang Huizhi Liang

Entity resolution, or record linkage, is the process that identifies data records over one or more datasets which refer to the same real world entity. To deal with large datasets, many real-life applications require scalable and high-quality entity resolution techniques. Blocking techniques can help to scale up the entity resolution process. Locality sensitive hashing (LSH) is an approximate bl...

2007
Dongwon Lee C. Lee Giles David Reese Sandeep Purao Wang-Chien Lee Piotr Berman Reka Albert Raj Acharya

Real data are “dirty.” Despite active research on integrity constraints enforcement and data cleaning, real data in real database applications are still dirty. To make matters worse, both diverse formats/usages of modern data and demands for large-scale data handling make this problem even harder. In particular, to surmount the challenges for which conventional solutions against this problem no ...

2013
Hakan Kardes Deepak Konidena Siddharth Agrawal Micah Huff Ang Sun

Entity Resolution is the task of identifying which records in a database refer to the same entity. A standard machine learning pipeline for the entity resolution problem consists of three major components: blocking, pairwise linkage, and clustering. The blocking step groups records by shared properties to determine which pairs of records should be examined by the pairwise linker as potential du...
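The three components named in this abstract (blocking, pairwise linkage, clustering) can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the first-letter blocking key, the `difflib` string similarity standing in for a learned pairwise linker, the 0.8 threshold, and the union-find clustering are all choices made here for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block(records, key):
    """Blocking: group record ids by a shared property (the blocking key)."""
    blocks = defaultdict(list)
    for rec_id, value in records.items():
        blocks[key(value)].append(rec_id)
    return blocks


def link(records, blocks, threshold=0.8):
    """Pairwise linkage: score only pairs that fall in the same block."""
    matches = []
    for ids in blocks.values():
        for a, b in combinations(ids, 2):
            score = SequenceMatcher(None, records[a], records[b]).ratio()
            if score >= threshold:
                matches.append((a, b))
    return matches


def cluster(ids, matches):
    """Clustering: merge matched pairs transitively with union-find."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for i in ids:
        groups[find(i)].add(i)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical sample data, not taken from the source.
records = {1: "acme corp", 2: "acme corp.", 3: "apex ltd", 4: "zenith inc"}
blocks = block(records, key=lambda value: value[0])
matches = link(records, blocks)
clusters = cluster(records.keys(), matches)
print(clusters)
```

Records 1, 2, and 3 share a block, but only 1 and 2 are similar enough to be linked, so record 4 is never compared with anything.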

2013
Cheng Chen Josh Hanna John R. Talburt Mathias Brochhausen William R. Hogan

Referent Tracking (RT) is an ontology-based approach to tracking individual persons, processes, diseases, prescriptions, etc. Each such individual is assigned a unique identifier, called an Instance Unique Identifier (IUI). Assignment of duplicate IUIs to a single entity is highly problematic in practical applications, as is the assignment of one IUI to two different entities. To address these ...

2012
Heeyoung Lee Marta Recasens Angel X. Chang Mihai Surdeanu Daniel Jurafsky

We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and ver...

2016
Vasilis Efthymiou Kostas Stefanidis Vassilis Christophides

Entity resolution aims to identify descriptions of the same entity within or across knowledge bases. In this work, we present the Minoan ER platform for resolving entities described by linked data in the Web (e.g., in RDF). To reduce the required number of comparisons, Minoan ER performs blocking to place similar descriptions into blocks and executes comparisons to identify matches only between...

Journal: CoRR 2017
Beidi Chen Anshumali Shrivastava Rebecca C. Steorts

Entity resolution identifies and removes duplicate entities in large, noisy databases and has grown in both usage and new developments as a result of increased data availability. Nevertheless, entity resolution has tradeoffs regarding assumptions of the data generation process, error rates, and computational scalability that make it a difficult task for real applications. In this paper, we focu...

2010
Christopher Dozier Ravikumar Kondadadi Marc Light Arun Vachher Sriharsha Veeramachaneni Ramdev Wudali

Named entities in text are persons, places, companies, etc. that are explicitly mentioned in text using proper nouns. The process of finding named entities in a text and classifying them to a semantic type is called named entity recognition. Resolution of named entities is the process of linking a mention of a name in text to a pre-existing database entry. This grounds the mention in something...
