Feature-based Malicious URL and Attack Type Detection Using Multi-class Classification

Authors

  • D. Patil Department of Computer Engineering, R.C. Patel Institute of Technology, Shirpur-25405, India
  • J. Patil Department of Computer Engineering, R.C. Patel Institute of Technology, Shirpur-25405, India
Abstract:

Malicious URLs are now a common threat to businesses, social networks, net-banking, etc. Existing approaches have focused on binary detection, i.e., deciding whether a URL is malicious or benign. Very little literature addresses the detection of malicious URLs together with their attack types, yet knowing the attack type is necessary for adopting an effective countermeasure. This paper proposes a methodology to detect malicious URLs and the type of attack based on multi-class classification. In this work, we propose 42 new features of spam, phishing, and malware URLs. These features have not been considered in earlier studies of malicious URL detection and attack type identification. Binary and multi-class datasets are constructed using 49935 malicious and benign URLs: 26041 benign and 23894 malicious URLs, the latter comprising 11297 malware, 8976 phishing, and 3621 spam URLs. To evaluate the proposed approach, state-of-the-art supervised batch and online machine learning classifiers are used, and experiments are performed on both the binary and multi-class datasets. Using our proposed URL features, the confidence weighted learning classifier achieved the best results: 98.44% average detection accuracy with a 1.56% error rate in the multi-class setting, and 99.86% detection accuracy with a negligible error rate of 0.14% in the binary setting.
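The dataset composition and the binary versus multi-class framing described in the abstract can be illustrated with a minimal Python sketch. The class counts and accuracy figures are taken from the abstract; everything else is an assumption made for illustration. In particular, the function names and the handful of lexical features below are hypothetical and are not the paper's 42 proposed features, which the abstract does not enumerate.

```python
from urllib.parse import urlparse

# Class counts as reported in the abstract.
CLASS_COUNTS = {"benign": 26041, "malware": 11297, "phishing": 8976, "spam": 3621}


def binary_label(multiclass_label: str) -> str:
    """Collapse an attack-type label into the binary benign/malicious setting."""
    return "benign" if multiclass_label == "benign" else "malicious"


def binary_counts(counts: dict) -> dict:
    """Aggregate multi-class counts into benign/malicious counts."""
    totals = {"benign": 0, "malicious": 0}
    for label, n in counts.items():
        totals[binary_label(label)] += n
    return totals


def lexical_features(url: str) -> dict:
    """Hypothetical lexical features for illustration only.

    These are NOT the 42 features proposed in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
    }


def error_rate(accuracy_percent: float) -> float:
    """Error rate as the complement of accuracy, in percent."""
    return round(100.0 - accuracy_percent, 2)
```

Under these assumptions, `binary_counts(CLASS_COUNTS)` reproduces the 26041 benign and 23894 malicious totals stated in the abstract, and `error_rate` reproduces the reported error rates from the reported accuracies. A real pipeline would feed feature vectors like those from `lexical_features` to a batch or online classifier.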

Similar resources

Efficient Malicious URL based on Feature Classification

Deceitful and malicious web sites pose a significant danger to desktop security, integrity and privacy. Malicious web pages that use drive-by download attacks or social engineering techniques to install unwanted software on a user's computer have become the main avenue for the proliferation of malicious code. Detection of malicious URLs has become difficult because of the phishing campaig...

Detection of Malicious Url Redirection and Distribution

Web-based malicious software (malware) has been increasing over the Internet. It poses threats to computer users through web sites. Computers are infected with Web-based malware by drive-by-download attacks. Drive-by-download attacks force users to download and install the Web-based malware without being aware of it. These attacks evade detection by using automatic redirections to various websi...

Malicious URL Detection using Machine Learning: A Survey

Malicious URL, a.k.a. malicious website, is a common and serious threat to cybersecurity. Malicious URLs host unsolicited content (spam, phishing, drive-by exploits, etc.) and lure unsuspecting users to become victims of scams (monetary loss, theft of private information, and malware installation), and cause losses of billions of dollars every year. It is imperative to detect and act on such th...

Feature Based Data Stream Classification (FBDC) and Novel Class Detection

Data stream classification poses many challenges to the data mining community. This paper addresses these challenges, namely infinite length, concept-drift, concept-evolution, and feature-evolution. Since a data stream is theoretically infinite in length, it is impractical to store and use all the historical data for training. Concept-drift is a common phenomenon in data streams, which occu...

Multi-class SVMs for Image Classification using Feature Tracking

In this paper a novel representation for image classification is proposed which exploits the temporal information inherent in natural visual input. Image sequences are rep...

Direct Sparsity Optimization Based Feature Selection for Multi-Class Classification

A novel sparsity optimization method is proposed to select features for multi-class classification problems by directly optimizing an l2,p-norm (0 < p ≤ 1) based sparsity function subject to data-fitting inequality constraints to obtain large between-class margins. The direct sparse optimization method circumvents the empirical tuning of regularization parameters in existing feature selection...

Journal title

volume 10, issue 2

pages 141-162

publication date 2018-03-20
