Nathan Ross, Oluwafemi Olukoya, Jesus Martinez del Rincon, and Domhnall Carlin

PEVuln: A Benchmark Dataset for Using Machine Learning to Detect Vulnerabilities in PE Malware

In this paper, we present a benchmark dataset for training and evaluating machine learning models that statically analyse PE malware, specifically for detecting known vulnerabilities in malware. Our goal is to enable further research into defending against malware by exploiting its bugs or weaknesses. Current malware datasets offer limited coverage of exploitable malware. Our dataset addresses this gap by drawing on the malware vulnerability database Malvuln and the software vulnerability database ExploitDB to create a new malware dataset with 864 vulnerable malware samples, 35,241 non-vulnerable malware samples, 1,425 vulnerable benign samples, and 7,905 non-vulnerable benign samples, each annotated with timestamps, families, threat mapping, vulnerability mapping, and obfuscation analysis. This 4-class dataset lays the foundation for future research on analysing and exploiting vulnerabilities in malware using machine learning. We also provide baseline results using state-of-the-art malware classification models to benchmark performance on the dataset: the binary tasks achieve F1 scores above 0.90, and the multi-class task attains an F1 score of 0.958.
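To make the class balance concrete, the following minimal sketch tabulates the four class counts stated in the abstract and collapses them into two binary views. The abstract does not say which binary tasks were evaluated, so the two splits used here (malware vs. benign, and vulnerable vs. non-vulnerable) are assumptions for illustration only, and the names `CLASS_COUNTS`, `class_shares`, and `collapse` are hypothetical and not part of the dataset's tooling.

```python
from collections import Counter

# Sample counts exactly as stated in the abstract, keyed by
# (software type, vulnerability status).
CLASS_COUNTS = {
    ("malware", "vulnerable"): 864,
    ("malware", "non-vulnerable"): 35_241,
    ("benign", "vulnerable"): 1_425,
    ("benign", "non-vulnerable"): 7_905,
}


def class_shares(counts):
    """Return each class's fraction of the total sample count."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def collapse(counts, axis):
    """Collapse the 4-class counts into a binary view.

    axis=0 groups by software type (malware / benign);
    axis=1 groups by vulnerability status.
    These binary splits are assumed, not taken from the paper.
    """
    out = Counter()
    for label, n in counts.items():
        out[label[axis]] += n
    return dict(out)


if __name__ == "__main__":
    print("total samples:", sum(CLASS_COUNTS.values()))
    for label, share in class_shares(CLASS_COUNTS).items():
        print(f"{label[0]:8s} {label[1]:15s} {share:.3%}")
    print("by type:  ", collapse(CLASS_COUNTS, axis=0))
    print("by status:", collapse(CLASS_COUNTS, axis=1))
```

The counts show a strong imbalance: vulnerable malware is the smallest of the four classes, which is one reason F1 is a more informative metric here than plain accuracy.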