Towards Classification of Malware on the Basis Their Characteristics and Importance Mining of Features

Abstract

Users visit many websites, applications, and other resources every day. Some of these resources carry malicious threats and harmful entities, so it is important to identify them in advance to protect our systems and maintain our privacy. Such malicious software is called malware, and it takes many forms, such as viruses, Trojan horses, worms, spam, and adware. This study develops four classification models to identify such threats. The algorithms used are support vector machine (SVM), decision tree, k-nearest neighbors (KNN), and Naïve Bayesian classification. The derived models are tested for accuracy using precision, recall, F1 score, and ROC curves. The models are trained and tested on data recorded in a Linux virtual machine. The dataset consists of 100,000 records with 35 attributes, and the ratio of malware to benign samples is 1:1. The study found that the decision tree and KNN classifiers are the most accurate and second most accurate models, respectively. During preprocessing, attributes were removed up to 17. The study also finds that static priority, system time, free cache area, and reserved area in the virtual machine are the factors that significantly affect the classification. Static priority is the most important factor, with an importance value of 0.52. The study will be helpful for security experts and general internet users in identifying whether a resource contains malicious threats.
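The workflow summarized in the abstract (four classifiers, evaluation by precision, recall, F1 score and ROC, then feature importance from the decision tree) can be illustrated with a minimal scikit-learn sketch. This is not the study's code: the data here is a synthetic, balanced stand-in generated with `make_classification`, and the sample size, feature count, hyperparameters, and train/test split are all assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in data, balanced 1:1 like the study's dataset,
# but with an arbitrary size and feature count (not the study's records).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# ASSUMPTION: default-ish hyperparameters; the abstract does not state any.
# SVM and KNN are distance-based, so they are wrapped with feature scaling.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Probability of the positive class, used for the area under the ROC curve.
    y_score = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})

# Impurity-based feature importances from the fitted decision tree.
# Feature indices are generic placeholders, not the study's attribute names.
importances = models["Decision tree"].feature_importances_
ranked = sorted(enumerate(importances), key=lambda p: p[1], reverse=True)
for index, value in ranked[:4]:
    print(f"feature_{index}: {value:.3f}")
```

The importance values printed at the end are normalized to sum to 1, which is the scale on which a single attribute's score, such as the 0.52 reported for static priority, would be read if the study used a comparable impurity-based measure.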

Keywords
Malware, decision tree, KNN classification, SVM model, Naïve Bayesian model, static priority, classification, machine learning, analysis