Multi-Stage Malicious Click Detection on Large Scale Web Advertising Data
Abstract
The healthy development of the Internet depends largely on online advertising, which provides its financial support. Click fraud, however, poses a serious threat to the Internet ecosystem. It not only harms advertisers but also damages the mutual trust between advertisers and ad agencies. Click fraud prediction is a typical big data application in that we normally need to identify malicious clicks from massive click logs; efficient detection methods within a big data framework are therefore much desired to combat this fraudulent behavior. In this paper, we propose a three-stage filtering system to attack click fraud. The serialized filters effectively detect malicious clicks with decreasing confidence, which can satisfy both advertisers and content providers.
Similar resources
Feature-based Malicious URL and Attack Type Detection Using Multi-class Classification
Nowadays, malicious URLs are a common threat to businesses, social networks, net-banking, etc. Existing approaches have focused on binary detection, i.e., whether a URL is malicious or benign. Very little literature addresses the detection of malicious URLs together with their attack types. Hence, it becomes necessary to know the attack type and adopt an effective countermeasure. This pa...
Analyzing new features of infected web content in detection of malicious web pages
Recent improvements in web standards and technologies enable attackers to hide and obfuscate malicious code with new methods and thus escape security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...
Multi-agent System for Web Advertising
The main aim of a personalized advertising system is to provide the advertisements that are most suitable for a given anonymous user navigating the web site. To achieve this goal, many sources of data are processed in one coherent vector space: the advertisers’ and publisher’s web site content, sessions of former users from the past, the history of clicks on banners and the current user behavio...
Addressing Malicious Noise in Clickthrough Data
Clickthrough logs are becoming an increasingly used source of training data for learning ranking functions. Due to the large impact that the position in search results has on commercial websites, malicious noise is bound to appear in search engine click logs. We present preliminary work in addressing this form of noise, which we term click-spam. We analyze click-spam from a utility standpoint, a...
A New Memory Efficient Technique for Fraud Detection in Web Advertising Networks
The advertising network is considered the middleman in web advertising between advertisers and publishers. This paper presents an intelligent and memory-efficient fraud detection technique with an intelligent classification engine, to be used by advertising networks to scan offline streams of clicks and impressions occurring on the publisher side, for the purpose of detecting click fraud and impression ...