Pre-processing of Web Logs for Mining World Wide Web Browsing Patterns
Abstract
Web usage mining is a type of web mining that applies data mining techniques to extract information from the navigational behaviour of WWW users. Before these techniques can be applied to discover user access patterns from web logs, the data must be pre-processed to improve the efficiency and ease of the mining process. The main task of data pre-processing is to remove noisy and irrelevant data and to reduce the data size for the pattern discovery phase. This paper focuses mainly on the first phase of web usage mining, i.e., data pre-processing, with activities such as field extraction and data cleaning. The field extraction algorithm separates each single line of the web log file into fields. The data cleaning algorithm eliminates inconsistent and unnecessary items from the analyzed data.
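The two activities named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not specify. It assumes the log uses the NCSA Common Log Format, and the names `LOG_PATTERN`, `extract_fields`, `is_relevant`, `preprocess` and the cleaning criteria (drop embedded resources, non-2xx responses and non-GET requests) are hypothetical choices made only for illustration.

```python
import re
from typing import Iterable, List, Optional

# Assumption: NCSA Common Log Format,
#   host ident user [time] "method url protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)(?: (?P<protocol>[^"]*))?" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

# Assumption: requests for embedded resources are treated as irrelevant.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def extract_fields(line: str) -> Optional[dict]:
    """Field extraction: split one log line into named fields."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_relevant(entry: dict) -> bool:
    """Data cleaning rule: keep successful GET requests for pages only."""
    path = entry["url"].split("?", 1)[0].lower()
    if path.endswith(IRRELEVANT_SUFFIXES):
        return False
    if not entry["status"].startswith("2"):
        return False
    return entry["method"] == "GET"


def preprocess(lines: Iterable[str]) -> List[dict]:
    """Extract fields from every line, then drop malformed and irrelevant entries."""
    entries = (extract_fields(line) for line in lines)
    return [e for e in entries if e is not None and is_relevant(e)]


# Made-up sample lines, not taken from the paper.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2000:13:55:37 -0700] "GET /images/logo.gif HTTP/1.0" 200 512',
    '192.0.2.7 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 -',
    'not a log line',
]

cleaned = preprocess(SAMPLE_LOG)
```

With these sample lines, only the request for `/index.html` survives cleaning: the image request, the failed request and the malformed line are discarded, which reduces the data passed on to pattern discovery.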
Similar resources
Extraction of Frequent Patterns from Web Logs using Web Log Mining Techniques
The World Wide Web is a huge repository of web pages and links. It provides a profusion of information for Internet users. The growth of the web is tremendous, as approximately one million pages are added daily. Users' accesses are recorded in web logs. Because of the incredible usage of the web, web log files are growing at a faster rate and their size is becoming huge. Web data mining is the applicati...
An Efficient Algorithm for Data Cleaning of Web Logs with Spider Navigation Removal
The World Wide Web is growing massively with the exponential growth of websites, providing the user with heaps of information. Text files called web logs are used to store the clicks of a user whenever the user visits a website. Web usage mining is a stream of web mining that involves applying mining techniques to the server logs containing the user clickstreams....
An Effective System for Mining Web Log
The WWW provides a simple yet effective medium for users to search, browse, and retrieve information on the Web. Web log mining is a promising tool to study user behaviors, which could further benefit web-site designers with better organization and services. Although there are many existing systems that can be used to analyze the traversal path of web-site visitors, their performance is still fa...
Data Preparation for Mining World Wide Web Browsing
The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of traffic and the size and complexity of Web sites. The complexity of tasks such as Web site design, Web server design, and of simply navigating through a Web site has increased along with this growth. An important input to these design tasks is the analysis of how a Web site is being used. Usage analysis ...
Self-organizing map based web pages clustering using web logs
A Web-based business always wants to have the ability to track users’ browsing behavior history. This ability can be achieved by using Web log mining technologies. In this paper, we introduce a Self-Organizing Map (SOM) based approach to mining Web log data. The SOM network maps the web pages into a two-dimensional map based on the users’ browsing history. Web pages with the similar browsing pa...