A Reasonable Rough Approximation for Clustering Web Users

Authors

  • Duoqian Miao
  • Min Chen
  • Zhihua Wei
  • Qiguo Duan
Abstract

Because web page access is inherently uncertain, the analysis of web logs is challenging. Several rough k-means clustering algorithms have been proposed and successfully applied to web usage mining. However, their authors did not explain why the rough approximations used in these algorithms were introduced. This paper analyzes the characteristics of the data in the boundary areas of clusters and then proposes a rough k-means clustering algorithm based on a reasonable rough approximation (RKMrra). Finally, RKMrra is applied to web access logs. In the experiments, RKMrra is compared with the Lingras and West algorithm and the Peters algorithm with respect to five characteristics. The results show that RKMrra discovers meaningful clusters of web users and that its rough approximation is more reasonable.
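The abstract does not give the details of RKMrra, so the following is only a minimal sketch of a generic rough k-means iteration, the kind of scheme the abstract refers to, and not the authors' method. It shows how each object is placed either in the lower approximation of one cluster or in the boundary (upper approximation only) of several, and how centroids might then be updated. The function names, the relative distance threshold of 1.25, and the lower-approximation weight of 0.7 are all assumptions chosen for illustration.

```python
import math


def rough_assign(points, centroids, threshold=1.25):
    """Return (lower, upper) lists of point indices per cluster.

    A point goes into the lower (and upper) approximation of its nearest
    cluster unless another centroid is almost as close (distance no more
    than `threshold` times the nearest distance). In that case it goes
    only into the upper approximations of all such clusters.
    """
    k = len(centroids)
    lower = [[] for _ in range(k)]
    upper = [[] for _ in range(k)]
    for idx, p in enumerate(points):
        dists = [math.dist(p, c) for c in centroids]
        nearest = min(range(k), key=dists.__getitem__)
        # Other clusters that are nearly as close as the nearest one.
        close = [j for j in range(k)
                 if j != nearest and dists[j] <= threshold * dists[nearest]]
        if close:
            for j in [nearest] + close:
                upper[j].append(idx)
        else:
            lower[nearest].append(idx)
            upper[nearest].append(idx)
    return lower, upper


def _mean(points, ids):
    dim = len(points[0])
    return [sum(points[i][d] for i in ids) / len(ids) for d in range(dim)]


def update_centroids(points, centroids, lower, upper, w_lower=0.7):
    """Weighted mean of lower-approximation and boundary members.

    Clusters with no members keep their previous centroid (an assumption
    made here to keep the sketch total).
    """
    new = []
    for old, lo, up in zip(centroids, lower, upper):
        boundary = [i for i in up if i not in lo]
        if lo and boundary:
            m_lo, m_bd = _mean(points, lo), _mean(points, boundary)
            new.append([w_lower * a + (1 - w_lower) * b
                        for a, b in zip(m_lo, m_bd)])
        elif lo:
            new.append(_mean(points, lo))
        elif boundary:
            new.append(_mean(points, boundary))
        else:
            new.append(list(old))
    return new
```

In a web usage setting, each point would be a numeric feature vector for one user (for example, visit counts per page category), and the two functions would be alternated until the assignments stop changing. How the boundary is defined is exactly the design choice the paper questions, so the threshold rule above should be read as a placeholder.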


Similar resources

Neighborhood Clustering of Web Users With Rough K-Means

Data collection and analysis in web mining face certain unique challenges. For a variety of reasons inherent in web browsing and web logging, the likelihood of bad or incomplete data is higher than in conventional applications. The analytical techniques in web mining need to accommodate such data. Fuzzy and rough sets provide the ability to deal with incomplete and approximate information. Fuz...


A density based clustering approach to distinguish between web robot and human requests to a web server

Today, the world's dependence on the Internet and the emergence of Web 2.0 applications are significantly increasing the need for web robots that crawl sites to support services and technologies. Despite their advantages, robots may consume bandwidth and reduce the performance of web servers. Despite a considerable body of research, there is no accurate method for classifying huge data ...


Rough clustering of sequential data

This paper presents a new indiscernibility-based rough agglomerative hierarchical clustering algorithm for sequential data. In this approach, the indiscernibility relation has been extended to a tolerance relation with the transitivity property being relaxed. Initial clusters are formed using a similarity upper approximation. Subsequent clusters are formed using the concept of constrained-simil...


A Framework of Rough Clustering for Web Transactions

Grouping web transactions into clusters is important for obtaining a better understanding of users' behavior. Currently, the rough approximation-based clustering technique has been used to group web transactions into clusters. However, the processing time is still an issue due to the high complexity for finding the similarity of upper approximations of a transaction which used to merge betwe...


RoCeT: Rough Clustering for web Transactions

Grouping web transactions into clusters is important for obtaining a better understanding of users' behavior. Currently, the rough approximation-based clustering technique has been used to group web transactions into clusters. However, the processing time is still an issue due to the high complexity for finding the similarity of upper approximations of a transaction which used to merge betwe...




Publication date: 2006