A new outlier detection approach to discover low hit web pages using sequential frequent pattern mining to improve website’s design
Abstract
The Internet offers users a huge volume of data and grows rapidly every day. Web servers create log files that record the page requested, the user's IP address, the browser and operating system used, and a date/time stamp for each request; web usage mining extracts useful information about browsing patterns from this data. The primary objective of this paper is to find the low-hit pages of a website from its log files by detecting outliers in sequential pattern mining. To meet this objective, a new algorithm named “Detect Anomaly in Sequential Pattern Algorithm (DASPAT)” is proposed. The proposed algorithm generates candidates using an Apriori-like approach and discovers the unusual browsing behavior (UBB) of users; the detected UBB are treated as outliers. The approach finds low-hit web pages and, in tandem, helps designers understand how users browse the site so that they can redesign it.
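The general idea of Apriori-like candidate generation over browsing sessions can be illustrated with a minimal Python sketch. This is not the authors' DASPAT algorithm, whose details the abstract does not give. The names `contains` and `mine_sequences`, the `min_support` and `max_len` parameters, the restriction to contiguous page sequences, and the sample sessions are all assumptions made for illustration. Sequences that are observed but fall below the support threshold are reported as candidate outliers, and infrequent single pages are taken as low-hit pages.

```python
def contains(session, pattern):
    """True if `pattern` occurs in `session` as a contiguous run of page visits."""
    n = len(pattern)
    return any(tuple(session[i:i + n]) == pattern
               for i in range(len(session) - n + 1))


def mine_sequences(sessions, min_support, max_len=3):
    """Level-wise (Apriori-like) search over page sequences.

    Returns two dicts mapping a sequence (tuple of pages) to its support:
    `frequent` (support >= min_support) and `infrequent` (observed at least
    once, but below min_support). Hypothetical helper, not DASPAT itself.
    """
    frequent, infrequent = {}, {}
    # Level 1: every distinct page is a candidate sequence of length 1.
    candidates = sorted({(page,) for s in sessions for page in s})
    for _ in range(max_len):
        if not candidates:
            break
        support = {c: sum(contains(s, c) for s in sessions) for c in candidates}
        level_frequent = [c for c in candidates if support[c] >= min_support]
        for c in candidates:
            if support[c] >= min_support:
                frequent[c] = support[c]
            elif support[c] > 0:
                infrequent[c] = support[c]
        # Join step: extend a frequent sequence with the last page of another
        # frequent sequence whose prefix matches the first one's suffix.
        candidates = sorted({a + b[-1:]
                             for a in level_frequent
                             for b in level_frequent
                             if a[1:] == b[:-1]})
    return frequent, infrequent


# Illustrative sessions (made-up data): each list is one user's page sequence.
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "cart"],
    ["home", "products", "faq"],
    ["home", "about"],
    ["cart", "home"],
]

frequent, infrequent = mine_sequences(sessions, min_support=2)

# Infrequent single pages are treated as low-hit pages in this sketch.
low_hit_pages = sorted(seq[0] for seq in infrequent if len(seq) == 1)
print("Low-hit pages:", low_hit_pages)
print("Unusual sequences:", {s: n for s, n in infrequent.items() if len(s) > 1})
```

In this sketch the threshold `min_support` alone decides what counts as unusual; a real log would first need to be split into per-user sessions, a step the sketch leaves out.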
Similar resources
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
Nowadays, high fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
Outlier Analysis Using Frequent Pattern Mining – A Review
An outlier in a dataset is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data. Detection of such outliers is important for many applications and has recently attracted much attention in the data mining research community. In this paper, we present a new method to detect outliers by discovering frequent patterns (or frequent item sets) fro...
FP-outlier: Frequent pattern based outlier detection
An outlier in a dataset is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data. Detection of such outliers is important for many applications and has recently attracted much attention in the data mining research community. In this paper, we present a new method to detect outliers by discovering frequent patterns (or frequent itemsets) from...
Utility Pattern Approach for Mining High Utility Log Items from Web Log Data
Mining frequent log items is an active area in data mining that aims at searching interesting relationships between items in databases. It can be used to address a wide variety of problems such as discovering association rules, sequential patterns, correlations and much more. Weblog that analyzes a Web site's access log and reports the number of visitors, views, hits, most frequently visited ...
Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...