Score As You Lift (SAYL): A Statistical Relational Learning Approach to Uplift Modeling

Authors

  • Houssam Nassif
  • Finn Kuusisto
  • Elizabeth S. Burnside
  • David Page
  • Jude W. Shavlik
  • Vítor Santos Costa
Abstract

We introduce Score As You Lift (SAYL), a novel Statistical Relational Learning (SRL) algorithm, and apply it to an important task in the diagnosis of breast cancer. SAYL combines SRL with the marketing concept of uplift modeling, uses the area under the uplift curve to direct clause construction and final theory evaluation, integrates rule learning and probability assignment, and conditions the addition of each new theory rule on the existing ones. Breast cancer, the most common type of cancer among women, is categorized into two subtypes: an earlier in situ stage, in which cancer cells are still confined, and a subsequent invasive stage. Currently, older women with in situ cancer are treated to prevent cancer progression, even though treatment may cause undesirable side effects and the woman may die of other causes. Younger women tend to have more aggressive cancers, while older women tend to have more indolent tumors. Therefore, older women whose in situ tumors differ significantly from in situ cancer in younger women are less likely to progress and can be considered for watchful waiting. Motivated by this important problem, this work makes two main contributions. First, we present the first multi-relational uplift modeling system, and introduce, implement, and evaluate a novel method to guide search in an SRL framework. Second, we compare our algorithm to previous approaches and demonstrate that the system can indeed obtain differential rules of interest to an expert on real data, while significantly improving the data uplift.
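
The abstract relies on the area under the uplift curve as its guiding score. As a rough illustration only, the sketch below assumes the definition commonly used in this line of work: the lift curve plots the number of positives found among the top-scored fraction of a set, and the uplift score is the area under the subject set's lift curve minus the area under the control set's. The function names (`lift_curve`, `area_under_curve`, `uplift_auc`) and the toy data are hypothetical and are not taken from the SAYL system.

```python
from typing import List, Sequence, Tuple


def lift_curve(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (fraction of examples selected, cumulative positives) points.

    Examples are ranked by descending score; ties keep their input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    points = [(0.0, 0.0)]
    positives = 0
    for i, (_, label) in enumerate(ranked, start=1):
        positives += label
        points.append((i / n, float(positives)))
    return points


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a piecewise-linear curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def uplift_auc(
    subj_scores: Sequence[float],
    subj_labels: Sequence[int],
    ctrl_scores: Sequence[float],
    ctrl_labels: Sequence[int],
) -> float:
    """Assumed definition: subject lift area minus control lift area."""
    subj_area = area_under_curve(lift_curve(subj_scores, subj_labels))
    ctrl_area = area_under_curve(lift_curve(ctrl_scores, ctrl_labels))
    return subj_area - ctrl_area


if __name__ == "__main__":
    # Toy data (made up): the model ranks subject positives first
    # and control positives last, so the uplift score is positive.
    scores = [0.9, 0.8, 0.2, 0.1]
    print(uplift_auc(scores, [1, 1, 0, 0], scores, [0, 0, 1, 1]))
```

A score above zero means the ranking concentrates positives near the top for the subject set more than for the control set. In the paper's setting, the subject set would presumably be the older patients and the control set the younger ones.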

Similar resources

Uplift Modeling with ROC: An SRL Case Study

Uplift modeling is a classification method that determines the incremental impact of an action on a given population. Uplift modeling aims at maximizing the area under the uplift curve, which is the difference between the subject and control sets’ area under the lift curve. Lift and uplift curves are seldom used outside of the marketing domain, whereas the related ROC curve is frequently used i...

Pessimistic Uplift Modeling

Uplift modeling is a machine learning technique that aims to model treatment effect heterogeneity. It has been used in the business and health sectors to predict the effect of a specific action on a given individual. Despite its advantages, uplift models show high sensitivity to noise and disturbance, which leads to unreliable results. In this paper we show different approaches to address the prob...

Model Selection Scores for Multi-Relational Bayesian Networks

Many organizations maintain their data in a relational database, which contains information about entities, their attributes, relationships among the entities, and attributes of the relationships. Statistical-relational learning (SRL) aims to generalize traditional single-table machine learning methods for multi-relational data. Many SRL models are defined using a combination of graphs and firs...

Causal Inference and Uplift Modeling: A Review of the Literature

Uplift modeling refers to the set of techniques used to model the incremental impact of an action or treatment on a customer outcome. Uplift modeling is therefore both a causal inference problem and a machine learning one. The literature on uplift is split into three main approaches: the Two-Model approach, the Class Transformation approach, and modeling uplift directly. Unfortunately, in the absence...

Uplift Modeling in Direct Marketing

Marketing campaigns directed at randomly selected customers often generate huge costs and a weak response. Moreover, such campaigns tend to annoy customers unnecessarily and make them less likely to respond to future communications. Precise targeting of marketing actions can potentially result in a greater return on investment. Usually, response models are used to select good targets. They aim ...

Journal title:
  • Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD

Volume 8190

Publication date: 2013