Probabilistic Relational Models of On-line User Behavior: Early Explorations
Authors
Abstract
We explore the usefulness of probabilistic relational methods for modeling user behavior at web sites. Web logs (aka "click streams"), server logs, and other data sources, when treated as datasets for traditional machine learning algorithms, violate the i.i.d. assumption that most of those algorithms make. Requests ("clicks") are not independent within a session, sessions for a visitor are not independent of one another, and page types, in their interaction with behavioral profile, are highly correlated, both by static link structure and by dynamic navigation sequence. We introduce probabilistic relational modeling and compare a series of increasingly sophisticated models. The series ranges from a simple, non-relational Bayesian network model of a click, through traditional Hidden Markov Models (HMMs), to a new representation: a fully relational extension of HMMs that includes visitors, sessions, clicks, and pages as participating entities. We measure the performance of each model in the series on the task of predicting whether or not the current request is the last in a session. Results show a significant increase in performance, as measured by the area under the ROC curve (ROC AUC).
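The evaluation task in the abstract (score each click by how likely it is to be the last in its session, then summarize with ROC AUC) can be illustrated with a minimal sketch. This is not the relational HMM the authors propose; it substitutes a much simpler first-order Markov chain over page types as a stand-in. The page-type names, the `END` marker, the smoothing parameter `alpha`, and the toy sessions are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

END = "<end>"  # hypothetical marker appended after the last click of a session


def fit_end_probability(sessions, alpha=1.0):
    """Fit a first-order Markov chain and return p_end(page_type).

    p_end(page_type) is the add-alpha smoothed estimate of
    P(next state is END | current page type).
    """
    counts = defaultdict(Counter)
    states = {END}
    for session in sessions:
        states.update(session)
        # Pair each click with its successor; the last click is followed by END.
        for cur, nxt in zip(session, session[1:] + [END]):
            counts[cur][nxt] += 1

    def p_end(page_type):
        row = counts[page_type]
        total = sum(row.values())
        return (row[END] + alpha) / (total + alpha * len(states))

    return p_end


def label_clicks(sessions):
    """Return (page_type, is_last_click) for every click in every session."""
    return [(page, i == len(s) - 1) for s in sessions for i, page in enumerate(s)]


def roc_auc(scores, labels):
    """Rank-based ROC AUC: chance that a positive outscores a negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for this sketch: each session is a list of page types.
train_sessions = [
    ["home", "search", "product", "checkout"],
    ["home", "product", "checkout"],
    ["home", "search", "search", "product"],
    ["home", "search"],
]
test_sessions = [
    ["home", "product", "checkout"],
    ["home", "search", "product"],
]

p_end = fit_end_probability(train_sessions)
clicks = label_clicks(test_sessions)
scores = [p_end(page) for page, _ in clicks]
labels = [is_last for _, is_last in clicks]
print("ROC AUC:", roc_auc(scores, labels))
```

Because this baseline scores a click from its page type alone, it ignores exactly the dependencies the abstract highlights (earlier clicks in the session, other sessions of the same visitor, page link structure), which is the gap the relational models are meant to close.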
Similar resources
Relational Bayesian models of on-line user behavior
We examine the utility of relational probabilistic methods for modeling user behavior at web sites. Web logs (aka "click streams"), taken as datasets for traditional machine learning algorithms, violate the iid assumption of most algorithms. Requests ("clicks") are not independent within a session, sessions for a visitor are not independent of one another, and page types, in their interaction w...
A Probabilistic Approach to Modeling Socio-Behavioral Interactions by Arti Ramesh
Title of dissertation: A Probabilistic Approach to Modeling Socio-Behavioral Interactions Arti Ramesh, Doctor of Philosophy, 2016 Dissertation directed by: Professor Lise Getoor Department of Computer Science In our ever-increasingly connected world, it is essential to build computational models that represent, reason, and model the underlying characteristics of real-world networks. Data genera...
HINRec: Scalable Recommendation in Heterogeneous Information Networks
We develop HINRec, a new recommendation model which is capable of incorporating extra relational information present in heterogeneous information networks (HINs) to improve recommendation quality. HINRec models sparse node behaviors in HINs, scaling with the total number of edges of all relations which participate in inference. HINRec explicitly models correlations in node behavior between diff...
Compiling Relational Database Schemata into Probabilistic Graphical Models
A majority of scientific and commercial data is stored in relational databases. Probabilistic models over such datasets would allow probabilistic queries, error checking, and inference of missing values, but to this day machine learning expertise is required to construct accurate models. Fortunately, current probabilistic programming tools ease the task of constructing such models [1, 2, 3, 4, ...
Ecosystem Analysis Using Probabilistic Relational Modeling
In this paper, we present the results of initial explorations into the application of relational model discovery methods to building comprehensive ecosystem models from data. Working with collaborators at the USGS Biological Resources Discipline and at the Environmental Protection Agency, we are engaged in two projects that apply relational probabilistic model discovery to building “community-l...