Embedded Malware Detection Using Markov n-Grams

Authors

  • Muhammad Zubair Shafiq
  • Syed Ali Khayam
  • Muddassar Farooq
Abstract

Embedded malware is a recently discovered security threat that allows malcode to be hidden inside a benign file. It has been shown that embedded malware is not detected by commercial antivirus software even when the malware signature is present in the antivirus database. In this paper, we present a novel anomaly detection scheme to detect embedded malware. We first analyze byte sequences in benign files to show that benign files' data generally exhibit a first-order dependence structure. Consequently, conditional n-grams provide a more meaningful representation of a file's statistical properties than traditional n-grams. To capture and leverage this correlation structure for embedded malware detection, we model the conditional distributions as Markov n-grams. For embedded malware detection, we use an information-theoretic measure, called entropy rate, to quantify changes in the Markov n-gram distributions observed in a file. We show that the entropy rate of Markov n-grams is significantly perturbed at malcode embedding locations, and can therefore act as a robust feature for embedded malware detection. We evaluate the proposed Markov n-gram detector on a comprehensive malware dataset consisting of more than 37,000 malware samples and 1,800 benign samples of six well-known file types. We show that the Markov n-gram detector provides better detection and false positive rates than the only existing embedded malware detection scheme.
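The idea of tracking the entropy rate of a first-order byte model along a file can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the fixed non-overlapping block size of 1000 bytes, and the use of empirical transition frequencies within each block are all assumptions made for illustration. For each block it estimates the conditional distribution of a byte given the previous byte and computes the entropy rate H = -Σ P(a, b) log2 P(b | a).

```python
import math
from collections import Counter


def entropy_rate(block: bytes) -> float:
    """Entropy rate (bits per byte) of a first-order Markov model fit to `block`.

    Illustrative assumption: probabilities are plain empirical frequencies of
    byte pairs within the block, with no smoothing.
    """
    if len(block) < 2:
        return 0.0
    pair_counts = Counter(zip(block, block[1:]))  # counts of transitions (a -> b)
    prev_counts = Counter(block[:-1])             # counts of the conditioning byte a
    total = len(block) - 1                        # number of transitions in the block
    h = 0.0
    for (a, _b), n_ab in pair_counts.items():
        p_ab = n_ab / total                       # joint probability P(a, b)
        p_b_given_a = n_ab / prev_counts[a]       # conditional probability P(b | a)
        h -= p_ab * math.log2(p_b_given_a)
    return h


def windowed_entropy_rates(data: bytes, block_size: int = 1000) -> list[float]:
    """Entropy rate of each consecutive, non-overlapping block of `data`.

    `block_size` is a hypothetical parameter chosen for this sketch.
    """
    return [
        entropy_rate(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
```

Calling `windowed_entropy_rates` on a file's bytes yields one value per block. Under the assumptions above, a block whose byte transitions differ sharply from its neighbours shows up as a jump in this sequence, which is the kind of perturbation the abstract describes at embedding locations. How to threshold such jumps is left open here.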


Similar resources

Deep Learning for Classification of Malware System Call Sequences

The increase in the number and variety of malware samples amplifies the need for improvement in automatic detection and classification of malware variants. Machine learning is a natural choice to cope with this increase, because it addresses the need to discover underlying patterns in large-scale datasets. Nowadays, neural network methodology has grown to a state where it can surpass limi...


Malware Detection using Windows API Sequence and Machine Learning

Monitoring the behavior of program execution at run-time is widely used to differentiate benign and malicious processes executing in the host computer. Most of the existing run-time malware detection methods use the information available in Windows Application Programming Interface (API) calls. The proposed malware detection system uses the Windows API call sequence. A 3rd order Markov chain (i...


N-grams-based File Signatures for Malware Detection

Malware is any malicious code that has the potential to harm a computer or network. The amount of malware is increasing faster every year and poses a serious security threat. Thus, malware detection is a critical topic in computer security. Currently, signature-based detection is the most widespread method for detecting malware. Although this method is still used on most popular commercial comp...


Malware Analysis using Multiple API Sequence Mining Control Flow Graph

Malware is becoming persistent through the creation of full-fledged variants of the same or different families. Malware belonging to the same family shares the same characteristics in how it spreads infections to the victim computer. These similar characteristics among malware families can be taken as a measure for creating a solution that can help in the detection of the malware belonging to part...


Detección de malware con modelo de lenguaje y su clasificación mediante SVM

Malware analysis is becoming a more difficult task over time, because malware developers take care to evade detection techniques, with measures ranging from obfuscation to destroying the hard drive (if the malware detects that it is being analyzed). In this paper we present a dynamic malware analysis of six types of malware: Trojans, Worms, Viruses, Trojan-Spys, Backdoors and Rootkits, as well as a set of Whiteware, using an n-gra...



Publication date: 2008