Classification with Strategically Withheld Data

Abstract

Machine learning techniques can be useful in applications such as credit approval and college admission. However, to be classified more favorably in such contexts, an agent may decide to strategically withhold some of her features, such as bad test scores. This is a missing data problem with a twist: which data is missing depends on the chosen classifier, because the specific classifier is what may create the incentive to withhold certain feature values. We address the problem of training classifiers that are robust to this behavior. We design three classification methods: MINCUT, Hill-Climbing (HC) and Incentive-Compatible Logistic Regression (IC-LR). We show that MINCUT is optimal when the true distribution of data is fully known. However, it can produce complex decision boundaries, and hence be prone to overfitting in some cases. Based on a characterization of truthful classifiers (i.e., those that give no incentive to strategically hide features), we devise a simpler alternative called HC, which consists of a hierarchical ensemble of out-of-the-box classifiers, trained using a specialized hill-climbing procedure which we show to be convergent. For several reasons, MINCUT and HC are not effective in utilizing a large number of complementarily informative features. To this end, we present IC-LR, a modification of Logistic Regression that removes the incentive to strategically drop features. We also show that our algorithms perform well in experiments on real-world data sets, and present insights into their relative performance in different settings.
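The notion of a truthful classifier can be made concrete with a small brute-force check. The sketch below is only a toy illustration and is not MINCUT, HC, or IC-LR: the function names (`reports`, `best_response`, `is_truthful`), the two example classifiers, and the 0.6 acceptance threshold are all assumptions introduced here. An agent may replace any subset of her features with `None` (withheld), and a classifier is treated as truthful if no agent can obtain a strictly better label by doing so.

```python
from itertools import combinations

def reports(x):
    """All reports available to an agent with true features x (None = withheld).
    Ordered from fewest to most hidden features, so the truthful report is first."""
    idx = range(len(x))
    return [tuple(None if i in hidden else v for i, v in enumerate(x))
            for r in range(len(x) + 1)
            for hidden in combinations(idx, r)]

def best_response(clf, x):
    """Report that maximizes the label (1 = accepted); ties favor hiding less,
    because max() returns the first maximal element."""
    return max(reports(x), key=clf)

def is_truthful(clf, points):
    """True if no agent in `points` gains by withholding features."""
    return all(clf(x) >= clf(r) for x in points for r in reports(x))

def naive(x):
    """Hypothetical classifier: averages only the revealed features."""
    shown = [v for v in x if v is not None]
    return int(bool(shown) and sum(shown) / len(shown) >= 0.6)

def robust(x):
    """Hypothetical classifier: counts a withheld feature as 0,
    so hiding a feature can never raise the score."""
    return int(sum(v or 0 for v in x) / len(x) >= 0.6)

# Toy population: two test scores, each either low (0.2) or high (0.9).
grid = (0.2, 0.9)
points = [(a, b) for a in grid for b in grid]
```

Under these assumptions, an agent with scores `(0.9, 0.2)` is rejected by `naive` when reporting truthfully but accepted after hiding the low score, so `naive` is not truthful, while `robust` gives no agent in this population a reason to hide anything.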

Similar resources

Negative Selection Based Data Classification with Flexible Boundaries

One of the most important artificial immune algorithms is the negative selection algorithm, an anomaly detection and pattern recognition technique; recent research has also shown that it can be applied successfully to data classification. Most negative selection methods use deterministic boundaries to distinguish between self and non-self spaces. In this paper, two...

Cheap talk with multiple strategically

We consider a cheap-talk setting that mimics the situation where an incumbent firm (the sender) is endowed with incentives to understate the true size of the market demand to two potential entrants (the receivers). Although our experimental data reveals that senders’ messages convey truthful information and this is picked up by the receivers, this overcommunication (relative to standard theoret...

Fuzzy Data Envelopment Analysis for Classification of Streaming Data

The classification of fuzzy uncertain data is considered one of the most challenging issues in data analysis. In spite of the significance of fuzzy data in mathematical programming, the development of the analytical methods of fuzzy data is slow. Therefore, the current study proposes a new fuzzy data classification method based on fuzzy data envelopment analysis (DEA) which can handle strea...

The Clustering and Classification Data Mining Techniques in Insurance Fraud Detection: The Case of Iranian Car Insurance

Given the growing spread of fraud in the insurance sector, especially in automobile insurance, and its negative consequences for insurance companies, applying suitable and efficient methods to identify and detect fraud in this area is essential. Understanding the patterns in data on previously reported claims can help determine whether a damage claim is genuine. One of the most common and widely used ways of discovering patterns in data is the use of ...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i6.16694