Semantic Suffix Tree Clustering

Authors

  • Jongkol Janruang
  • Sumanta Guha
Abstract

This paper proposes a new algorithm, called Semantic Suffix Tree Clustering (SSTC), to cluster web search results containing semantic similarities. The distinctive methodology of the SSTC algorithm is that it simultaneously constructs the semantic suffix tree through an on-depth and on-breadth pass by using semantic similarity and string matching. The semantic similarity is derived from the WordNet lexical database for the English language. SSTC uses only subject-verb-object classification to generate clusters and readable labels. The algorithm also implements directed pruning to reduce the sub-tree sizes and to separate semantic clusters. Experimental results show that the proposed algorithm has better performance than conventional Suffix Tree Clustering (STC).
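
The combination of string matching and semantic similarity described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses a plain word-level suffix trie rather than a compact suffix tree, a tiny hand-written `CONCEPTS` table as a hypothetical stand-in for WordNet lookups, and it omits the subject-verb-object classification and directed pruning steps. All names (`CONCEPTS`, `concept`, `Node`, `build_semantic_suffix_trie`, `base_clusters`) are assumptions made for illustration.

```python
# Hypothetical stand-in for WordNet: maps a word to a shared concept label.
CONCEPTS = {"car": "automobile", "auto": "automobile", "buy": "purchase"}


def concept(word):
    """Return the concept label for a word, or the lowercased word itself."""
    word = word.lower()
    return CONCEPTS.get(word, word)


class Node:
    def __init__(self):
        self.children = {}  # concept label -> Node
        self.docs = set()   # ids of snippets with a suffix passing through here


def build_semantic_suffix_trie(snippets):
    """Insert every word-level suffix of every snippet, keyed by concept."""
    root = Node()
    for doc_id, text in enumerate(snippets):
        concepts = [concept(w) for w in text.split()]
        for start in range(len(concepts)):
            node = root
            for label in concepts[start:]:
                node = node.children.setdefault(label, Node())
                node.docs.add(doc_id)
    return root


def base_clusters(node, phrase=()):
    """Yield (phrase, snippet ids) for phrases shared by two or more snippets."""
    for label, child in node.children.items():
        if len(child.docs) > 1:
            path = phrase + (label,)
            yield path, child.docs
            # Descendants can only be shared if this node is shared.
            yield from base_clusters(child, path)
```

With the snippets `"buy car"` and `"purchase automobile online"`, the two texts share no literal word, yet both reach the node for the phrase `("purchase", "automobile")` because matching happens at the concept level. Exact string matching alone would place them in separate clusters.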

Similar resources

Semantic Suffix Net Clustering for Search Results

Suffix Tree Clustering (STC) uses the suffix tree structure to find a set of snippets that share a common phrase and uses this information to propose clusters. As a result, STC is a fast incremental algorithm for automatic clustering and labeling but it cannot cluster semantically similar snippets. However, the meaning of the words is indeed an important property that relates them to other word...

Improving Web Search Results Using Semantic Clustering

This paper considers the problem of search engines that cannot retrieve appropriate results for a given query. Most users are unable to formulate a query that retrieves exactly what they want. So the search engine retrieves a massive list of data, ranked by a page-rank algorithm, a relevancy algorithm, or a human-judgment algorithm. If the relevant resul...

Clustering of Web Search Results Using Semantic

Clustering is related to data mining for information retrieval. Clustering documents allows relevant information to be retrieved quickly. It organizes the documents into groups; each group contains documents with similar content. Different algorithms are used for clustering documents, such as partitional clustering (K-means Clustering) and Hierarchical Clustering (...

Suffix Tree Based Incremental Web Services Clustering Method

How to discover, quickly and accurately, the Web Services that fit requesters' needs in an increasingly large-scale services registry is a key problem in Web Services research. Applying a clustering method to improve the service index structure in the service registry is a feasible approach to this problem. For the existing service clustering methods are time-consuming, only produce static and s...

A semantics-based method for clustering of Chinese web search results

Information explosion is a critical challenge to the development of modern information systems. In particular, when an information system operates over the Internet, the amount of information on the web has been increasing exponentially. Search engines, such as Google and Baidu, are essential tools for finding information on the Internet. Valuable informa...

Publication date: 2010