Producing Scores for Customers via Ensembling SVM

Authors

  • Xinjian Guo
  • Guangtong Zhou
  • Cailing Dong
  • Yilong Yin
Supervised by Dr. Yilong Yin, School of Computer Science and Technology, Shandong University, Jinan 250100, China.

Abstract

This report presents our solution to the PAKDD Competition 2007. After briefly describing the data mining task, we discuss four difficulties it poses and then describe the data pre-processing. To reduce the class imbalance of the modeling dataset externally, we combine under-sampling and over-sampling techniques. In addition, we adjust the parameters of each support vector machine (SVM) internally to handle cost-sensitivity. We then build an ensemble of SVMs to achieve higher accuracy. Finally, we present the essence of the model and offer some pointers for the consumer finance company.
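The abstract names three ingredients: external rebalancing by combined under- and over-sampling, internal cost-sensitivity in each SVM, and an ensemble whose output serves as a customer score. The following is a minimal sketch of how these pieces could fit together, not the authors' implementation. It assumes a linear SVM trained by stochastic subgradient descent as a stand-in for whatever SVM variant the report used, and every name and parameter value (`rebalance`, `train_linear_svm`, `build_ensemble`, `ensemble_score`, `target`, `cost_pos`, `cost_neg`, `lam`, `lr`) is hypothetical.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wj * xj for wj, xj in zip(w, x))


def rebalance(majority, minority, target, rng):
    """Return up to `target` majority rows (under-sampled without replacement)
    and exactly `target` minority rows (over-sampled with replacement)."""
    under = rng.sample(majority, min(target, len(majority)))
    over = [rng.choice(minority) for _ in range(target)]
    return under, over


def train_linear_svm(X, y, cost_pos, cost_neg, rng, lam=0.01, lr=0.05, epochs=20):
    """Linear SVM fitted by stochastic subgradient descent on a class-weighted
    hinge loss. Labels in `y` must be +1 or -1. All defaults are illustrative."""
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (dot(w, X[i]) + b)
            # Assumption: cost-sensitivity is modeled as a per-class error cost.
            cost = cost_pos if y[i] == 1 else cost_neg
            w = [wj * (1.0 - lr * lam) for wj in w]  # L2 regularization shrink
            if margin < 1.0:  # hinge loss active: step toward the correct side
                w = [wj + lr * cost * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * cost * y[i]
    return w, b


def build_ensemble(majority, minority, n_models=5, target=40,
                   cost_pos=2.0, cost_neg=1.0, seed=0):
    """Train each member SVM on its own rebalanced sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        neg, pos = rebalance(majority, minority, target, rng)
        X = neg + pos
        y = [-1] * len(neg) + [1] * len(pos)
        models.append(train_linear_svm(X, y, cost_pos, cost_neg, rng))
    return models


def ensemble_score(models, x):
    """Customer score: the mean decision value across all member SVMs."""
    return sum(dot(w, x) + b for w, b in models) / len(models)
```

Averaging raw decision values gives a continuous score that can rank customers, rather than a hard class label. Whether the report averaged, voted, or combined its SVMs in some other way is not stated in the abstract.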

Similar resources

Predicting Future Customers via Ensembling Gradually Expanded Trees

This report presents our solution to the PAKDD’06 Data Mining Competition. Following a brief description of the task, we discuss the difficulties of the task and explain the motivation of our solution. Then, we propose the GetEnsemble (Gradually Expanded Tree Ensemble) method, which handles the difficulties via ensembling expanded trees. We evaluated the proposed method and several other methods us...

FDiBC: A Novel Fraud Detection Method in Bank Club based on Sliding Time and Scores Window

One recent strategy for increasing customer loyalty in the banking industry is the use of a customers’ club system. In this system, customers receive scores based on the financial and club activities they perform, and they receive credits from the bank according to the points earned. In addition, with the advent of new technologies, fraud is growing in the banking domain as well. Therefor...

Mental Distress Detection and Triage in Forum Posts: The LT3 CLPsych 2016 Shared Task System

This paper describes the contribution of LT3 for the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVM), cascaded binary SVMs and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all feat...

Ensembling Predictions of Student Post-Test Scores for an Intelligent Tutoring System

Over the last few decades, there has been a rich variety of approaches towards modeling student knowledge and skill within interactive learning environments. There have recently been several empirical comparisons as to which types of student models are better at predicting future performance, both within and outside of the interactive learning environment. A recent paper (Baker et al., in pres...

The Sum is Greater than the Parts: Ensembling Student Knowledge Models in ASSISTments

Recent research has had inconsistent results as to the utility of ensembling different approaches towards modeling student knowledge and skill within interactive learning environments. While work in the 2010 KDD Cup data set has shown benefits from ensembling, work in the Genetics Tutor has failed to show benefits. We hypothesize that the key factor has been data set size. We explore the potent...

Publication date: 2007