A Model of Website Usage Visualization Estimated on Clickstream Data with Apache Flume Using Improved Markov Chain Approximation

Author

  • AMJAD JUMAAH FRHAN
Abstract

Visualizing website clickstream data is a pivotal process because it helps define user preferences. It comprises gathering, investigating, and reporting on the web pages that users view. Clickstream visualization is primarily employed by organizations that focus on learning user preferences and improving their products or services to maximize user satisfaction. Most existing visualization tools fall short in helping organizations achieve this goal. The Markov chain model is the method commonly used to develop data visualization tools. However, issues such as occlusion and the inability to provide a clear visualization display make these tools volatile. This paper aims to develop a visualization tool, named WebClickviz, that resolves these issues by improving Markov chain modelling. A heuristic method based on the Kolmogorov-Smirnov distance and the maximum likelihood estimator is introduced to improve the clarity of the visualization display. These concepts are applied between the underlying distribution states to minimize the Markov distribution. The proposed model, WebClickviz, is implemented on Hadoop with Apache Flume. Experiments on an evaluation dataset show that the proposed model outperforms existing models, with higher visualization accuracy.

Keywords: clickstream data, data visualization, Hadoop, WebClickviz, Apache Flume, Markov chain, Kolmogorov-Smirnov distance, maximum likelihood estimator, heuristic approximation.
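The abstract names two standard building blocks: a maximum likelihood estimator for a Markov chain and the Kolmogorov-Smirnov distance between distributions. The following is a minimal sketch of both under stated assumptions, not the paper's WebClickviz method: the session data, the page names, the function names `mle_transition_matrix` and `ks_distance`, and the ordering of states used for the cumulative sums are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def mle_transition_matrix(sessions):
    """Maximum likelihood estimate of first-order transition probabilities.

    For a first-order Markov chain the MLE is the normalized count:
    P(a -> b) = count(a -> b) / count(a -> any page).
    """
    counts = defaultdict(Counter)
    for pages in sessions:
        # Count each consecutive pair of page views in the session.
        for src, dst in zip(pages, pages[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()
    }


def ks_distance(p, q, states):
    """Kolmogorov-Smirnov distance between two discrete distributions.

    Computed as the largest absolute gap between the two cumulative
    distributions, accumulated in the order given by `states`.
    The ordering of page states is an assumption of this sketch.
    """
    cdf_p = cdf_q = 0.0
    gap = 0.0
    for state in states:
        cdf_p += p.get(state, 0.0)
        cdf_q += q.get(state, 0.0)
        gap = max(gap, abs(cdf_p - cdf_q))
    return gap


# Hypothetical clickstream sessions: each list is one visitor's page sequence.
sessions = [
    ["home", "search", "product"],
    ["home", "product", "cart"],
    ["home", "search", "search", "product"],
]

P = mle_transition_matrix(sessions)
order = ["search", "product", "cart"]
distance = ks_distance(P["home"], P["search"], order)
print(P["home"])
print(distance)
```

Here `P["home"]` is the estimated distribution over the next page after `home`, and `distance` compares the outgoing distributions of two states. How the paper combines these quantities in its heuristic is not specified in the abstract.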


Similar resources

Analysis of the Spell of Rainy Days in Lake Urmia Basin using Markov Chain Model

In this study, the frequency and the spell of rainy days were analyzed in the Lake Urmia Basin using a Markov chain model. For this purpose, the daily precipitation data of 7 synoptic stations in the Lake Urmia Basin were used for the period 1995-2014. The daily precipitation data at each station were classified into wet and dry states, and the fitness of a first-order Markov chain on the data series was e...


Variable Length Markov Chains for Web Usage Mining

Web usage mining is usually defined as the discipline that concentrates on developing techniques that model and study users’ Web navigation behavior by means of analyzing data obtained from user interactions with Web resources; see (Mobasher, 2006; Liu, 2007) for recent reviews on web usage mining. When users access Web resources they leave a trace behind that is stored in log files, such trace...


Discovery of Significant Usage Patterns from Clusters of Clickstream Data

Discovery of usage patterns from Web data is one of the primary purposes for Web Usage Mining. In this paper, a variation of “user preferred navigational trail” called Significant Usage Pattern (SUP) is proposed. SUPs are patterns that are extracted from clustered abstracted clickstream data, with a higher normalized probability of occurrence and may begin/end with specific Web page(s). The nov...


Optimizing Red Blood Cells Consumption Using Markov Decision Process

In healthcare systems, one of the important tasks concerns perishable products such as red blood cell (RBC) units, whose consumption management across different periods can contribute greatly to the optimality of the system. In this paper, the main goal is to enhance the ability of the medical community to organize the consumption of RBC units so as to deliver the unit order in a timely manner, with a focus ...


Analysis of Users’ Web Navigation Behavior using GRPA with Variable Length Markov Chains

With the never-ending growth of Web services and Web-based information systems, the volume of clickstream and user data collected by Web-based organizations in their daily operations has reached enormous proportions. Analyzing such huge data can help to evaluate the effectiveness of promotional campaigns, optimize the functionality of Web-based applications, and provide more personalized cont...



Journal title:

Volume   Issue 

Pages  -

Publication date: 2017