UC3M_13: Disambiguation of Person Names Based on the Composition of Simple Bags of Typed Terms
Abstract
This paper describes a system designed to disambiguate person names in a set of Web pages. In our approach, each Web document is represented as several sets of features, or terms, of different types (bag of words, URLs, names, and numbers). We apply agglomerative vector space clustering based on the similarity between pairs of analogous feature sets. The system achieved 66% for F (α = 0.2) and 48% for F (α = 0.5) in the Web People Search Task at SemEval-2007 (Artiles et al., 2007).
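The abstract's idea of comparing documents through analogous typed feature sets and then clustering them agglomeratively can be illustrated with a minimal sketch. The abstract does not say how the per-type similarities are combined, which linkage is used, or where merging stops, so the choices below (cosine similarity per type, a plain average across types, average linkage, and a threshold of 0.3) are assumptions for illustration only, and the sample documents are invented.

```python
import math
from collections import Counter
from itertools import combinations

# Feature types named in the abstract; the dictionary keys are hypothetical.
FEATURE_TYPES = ("words", "urls", "names", "numbers")

def cosine(a, b):
    """Cosine similarity between two term-frequency bags (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def doc_similarity(d1, d2):
    """Assumption: average the cosine over feature types non-empty in both documents."""
    sims = [cosine(d1[t], d2[t]) for t in FEATURE_TYPES if d1.get(t) and d2.get(t)]
    return sum(sims) / len(sims) if sims else 0.0

def agglomerative(docs, threshold=0.3):
    """Average-linkage agglomerative clustering over document indices.

    Merging stops once the best pair falls below `threshold`
    (a hypothetical value, not taken from the paper).
    """
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        total = sum(doc_similarity(docs[i], docs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of clusters (a < b by construction).
        (a, b), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Invented pages that all mention the same ambiguous surname.
docs = [
    {"words": Counter("john smith professor physics university".split()),
     "names": Counter(["Alice Brown"]), "urls": Counter(["physics.example.edu"]),
     "numbers": Counter()},
    {"words": Counter("physics lecture professor smith".split()),
     "names": Counter(["Alice Brown"]), "urls": Counter(["physics.example.edu"]),
     "numbers": Counter()},
    {"words": Counter("smith guitar band tour tickets".split()),
     "names": Counter(["Bob Green"]), "urls": Counter(["music.example.com"]),
     "numbers": Counter(["2007"])},
]

clusters = agglomerative(docs)
print(clusters)
```

Each resulting cluster is meant to group the pages that refer to one individual. Keeping the feature types in separate bags means a shared URL or co-occurring name is not drowned out by the much larger bag of ordinary words.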
Related resources
Improving the Accuracy of Author Name Disambiguation Using Agglomerative Clustering
Today, digital libraries are important academic resources containing millions of citations and essential bibliographic information such as titles, author names, and locations of publication. From the perspective of knowledge accumulation management, the ability to find the desired content quickly and accurately is of great importance. The complexity and similarity in these resources cause many challenges and...
The Comparison of Typed and Handwritten Essays of Iranian EFL Students in terms of Length, Spelling, and Grammar
This study attempted to compare typed and handwritten essays of Iranian EFL students in terms of length, spelling, and grammar. To administer the study, the researchers utilized Alice Touch Typing Tutor software to select 15 upper intermediate students with higher ability to write two essays: one typed and the other handwritten. The students were both males and females between the ages of 22 to...
Investigating the Role of Different Types of Homograph Context in Determining Similarity Between Documents
Aim: Automatic information retrieval is based on the assumption that texts contain content or structural elements that can be used in word sense disambiguation, thereby improving the effectiveness of the retrieved results. Homographs are among the words requiring sense disambiguation. Depending on their roles and positions in texts, homograph contexts could be divided into different types, wit...
Explore Chinese Encyclopedic Knowledge to Disambiguate Person Names
This paper presents the HITSZ-PolyU system in the CIPS-SIGHAN bakeoff 2012 Task 3, Chinese Personal Name Disambiguation. This system leveraged the Chinese encyclopedia Baidu Baike (Baike) as the external knowledge to disambiguate the person names. Three kinds of features are extracted from Baike. They are the entities’ texts in Baike, the entities’ work-of-art words and titles in the Baike. Wit...
Word Sense Disambiguation based on term to term similarity in a context space
This paper describes the exemplar based approach presented by UNED at Senseval-3. Instead of representing contexts as bags of terms and defining a similarity measure between contexts, we propose to represent terms as bags of contexts and define a similarity measure between terms. Thus, words, lemmas and senses are represented in the same space (the context space), and similarity measures can be...