Pre-processing of Web Logs for Mining World Wide Web Browsing Patterns

Author

  • Yogish H K
Abstract

Web usage mining is a type of web mining that applies data mining techniques to extract information from the navigational behaviour of WWW users. Web logs must be pre-processed before data mining techniques are applied to discover user access patterns, because pre-processing makes the mining process both easier and more efficient. The main task of data pre-processing is to remove noisy and irrelevant data and to reduce the data size for the pattern discovery phase. This paper focuses mainly on the first phase of web usage mining, i.e., data pre-processing, with activities such as field extraction and data cleaning. The field extraction algorithm separates each line of the web log file into fields. The data cleaning algorithm eliminates inconsistent and unnecessary items from the analyzed data.
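The abstract does not give the algorithms themselves, so the following is only a minimal sketch of what field extraction and data cleaning could look like. It assumes the log is in Common Log Format, and it assumes cleaning rules that are common in practice (dropping image, stylesheet, and script requests, non-GET requests, and unsuccessful responses). The names `LOG_PATTERN`, `extract_fields`, `is_relevant`, and `preprocess` are hypothetical and are not taken from the paper.

```python
import re
from typing import Iterable, List, Optional

# Assumption: Common Log Format, i.e.
# host ident user [time] "method url protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# Assumption: these resource types count as irrelevant for usage mining.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def extract_fields(line: str) -> Optional[dict]:
    """Field extraction: split one log line into named fields.

    Returns None when the line does not match the assumed format.
    """
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_relevant(entry: dict) -> bool:
    """Data cleaning rule: keep successful GET requests for pages only."""
    path = entry["url"].split("?", 1)[0].lower()  # ignore any query string
    if path.endswith(IRRELEVANT_SUFFIXES):
        return False
    if entry["method"] != "GET":
        return False
    return entry["status"].startswith("2")


def preprocess(lines: Iterable[str]) -> List[dict]:
    """Run field extraction, then cleaning, over every log line."""
    cleaned = []
    for line in lines:
        entry = extract_fields(line)
        if entry is not None and is_relevant(entry):
            cleaned.append(entry)
    return cleaned


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '192.168.1.5 - - [10/Oct/2013:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326',
    '192.168.1.5 - - [10/Oct/2013:13:55:37 +0530] "GET /images/logo.gif HTTP/1.1" 200 512',
    '192.168.1.9 - - [10/Oct/2013:13:56:01 +0530] "GET /missing.html HTTP/1.1" 404 210',
    'malformed line',
]

if __name__ == "__main__":
    for entry in preprocess(SAMPLE_LOG):
        print(entry["host"], entry["time"], entry["url"])
```

Under these assumed rules, only the request for `/index.html` survives: the image request, the 404 response, and the malformed line are all discarded, which shrinks the input to the pattern discovery phase.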


Similar resources

Extraction of Frequent Patterns from Web Logs using Web Log Mining Techniques

World Wide Web is a huge repository of web pages and links. It provides profusion of information for the Internet users. The growth of web is tremendous as approximately one million pages are added daily. User’s accesses are recorded in web logs. Because of the incredible usage of web, the web log files are growing at a faster rate and the size is becoming huge. Web data mining is the applicati...


An Efficient Algorithm for Data Cleaning of Web Logs with Spider Navigation Removal

The World Wide Web is growing massively larger with the exponential growth of websites providing the user with heaps of information. Text files called as web logs are used to store the clicks of a user whenever a user visits a website. Web usage mining is a stream of web mining that involves the applications of mining techniques to be applied on the server logs containing the user clickstreams....


An Effective System for Mining Web Log

The WWW provides a simple yet effective media for users to search, browse, and retrieve information in the Web. Web log mining is a promising tool to study user behaviors, which could further benefit web-site designers with better organization and services. Although there are many existing systems that can be used to analyze the traversal path of web-site visitors, their performance is still fa...


Data Preparation for Mining World Wide Web Browsing

The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of traffic and the size and complexity of Web sites. The complexity of tasks such as Web site design, Web server design, and of simply navigating through a Web site have increased along with this growth. An important input to these design tasks is the analysis of how a Web site is being used. Usage analysis ...


Self-organizing map based web pages clustering using web logs

A Web-based business always wants to have the ability to track users’ browsing behavior history. This ability can be achieved by using Web log mining technologies. In this paper, we introduce a Self-Organizing Map (SOM) based approach to mining Web log data. The SOM network maps the web pages into a two-dimensional map based on the users’ browsing history. Web pages with the similar browsing pa...




Publication date: 2013