Exploring Wikipedia's Category Graph for Query Classification

Authors

  • Milad Alemzadeh
  • Richard Khoury
  • Fakhri Karray
Abstract

Wikipedia’s category graph is a network of 400,000 interconnected category labels and can be a powerful resource for many classification tasks. However, its size and lack of order can make it difficult to navigate. In this paper, we present a new algorithm to efficiently explore this graph and discover accurate classification labels. We implement our algorithm as the core of a query classification system and demonstrate its reliability using the KDD CUP 2005 competition as a benchmark.

Similar resources

Extending a multilingual Lexical Resource by bootstrapping Named Entity Classification using Wikipedia's Category System

Named Entity Recognition and Classification (NERC) is a well-studied NLP task which is typically approached using machine learning algorithms that rely on training data whose creation usually is expensive. The high costs result in the lack of NERC training data for many languages. An approach to create a multilingual NE corpus was presented in Wentland et al. (2008). The resulting resource call...

Wikipedia as an Ontology for Describing Documents

Identifying topics and concepts associated with a set of documents is a task common to many applications. It can help in the annotation and categorization of documents and be used to model a person's current interests for improving search results, business intelligence or selecting appropriate advertisements. One approach is to associate a document with a set of topics selected from a fixed ont...

QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches

A major problem in information retrieval is the difficulty of defining the user's information needs; on the other hand, when the user submits a query, there is a vast amount of information to retrieve. Different methods, therefore, have been suggested for query expansion, which is concerned with reconfiguring the query to increase efficiency and improve the criterion accuracy in the information...

Exploring Large RDF Datasets using a Faceted Search

We propose a facet-based RDF data exploration mechanism that lets the user visualize large RDF datasets by successively refining a query. The novel aspects of our work are: i) the SPARQL query pattern is visualized as a query graph, ii) the successive refinements are visualized in a query refinement graph, and iii) the result triples are visualized as a result RDF graph. The scheme is scalable ...

Coupling Materialized View Selection to Multi Query Optimization: Hyper Graph Approach

Materialized views are queries whose results are stored and maintained in order to facilitate access to data in their underlying base tables of extremely large databases. Selecting the best materialized views for a given query workload is a hard problem. Studies on view selection have considered sharing common sub expressions and other multi-query optimization techniques. Multi-Query Optimizati...

Publication date: 2011