Malachi Jones

MITRE

Automated in-memory malware/rootkit detection via binary analysis and machine learning (pdf, video)

A prominent technique for detecting sophisticated malware is to monitor the execution behavior of each binary to identify anomalies or malicious intent. Hooking and emulation are the two primary mechanisms used for this monitoring. Although these behavioral monitoring mechanisms are a substantial improvement over classic signature detection, skilled malware authors have developed reliable techniques to defeat them. For example, sophisticated malware can evade hook-based monitoring either by calling alternative (e.g., lower-level) unhooked APIs or by removing the hooks at run-time. Malware can also check whether it is executing in an emulator or VM and modify its behavior accordingly.
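To make the hook-removal scenario concrete, the following is a minimal sketch of how a monitor might check whether an inline hook is still present at a function's entry point. It assumes the common x86 pattern of a 5-byte `JMP rel32` (opcode `0xE9`) written over the function prologue. The function names, the byte strings, and the comparison against an on-disk copy are illustrative assumptions, not something the talk describes, and a real check would also have to account for relocations and other hook styles.

```python
import struct

JMP_REL32 = 0xE9  # x86 opcode for a near jump with a 32-bit relative offset


def hook_target(prologue: bytes, address: int):
    """Return the absolute jump target if the prologue starts with JMP rel32.

    `address` is the (hypothetical) virtual address of the function entry.
    Returns None when no such jump is present.
    """
    if len(prologue) >= 5 and prologue[0] == JMP_REL32:
        (rel,) = struct.unpack_from("<i", prologue, 1)
        # The offset is relative to the end of the 5-byte instruction.
        return (address + 5 + rel) & 0xFFFFFFFFFFFFFFFF
    return None


def classify_prologue(in_memory: bytes, on_disk: bytes, address: int) -> str:
    """Compare the in-memory prologue with the on-disk bytes (assumed available)."""
    if in_memory == on_disk:
        return "unmodified"
    if hook_target(in_memory, address) is not None:
        return "hooked"
    return "modified"
```

If a monitor installed a hook and a later check reports "unmodified", the hook has been removed, which is the evasion described above.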

In this talk, we will demonstrate an approach that pairs Memory Forensics with Binary Analysis and Machine Learning to analyze the behavior of binaries across a set of hosts and detect advanced persistent threats (APTs) that may evade detection by hooking and traditional emulation. In particular, we will discuss how an approximate clustering algorithm with linear run-time performance can be leveraged to identify outliers (i.e., potential APTs) among sets of clustered memory artifacts (i.e., processes, shared libraries, drivers, and kernel modules). These memory artifacts are collected from live, networked hosts and clustered in real time in a scalable manner. We will also discuss and demonstrate how dynamic binary analysis can be combined with Machine Learning techniques to differentiate between benign anomalous code and malware, improving detection accuracy.
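The outlier idea can be illustrated with a small sketch. The abstract does not name the clustering algorithm, so the example below substitutes a simple MinHash bucketing scheme as one possible approximate, single-pass approach: each artifact is reduced to a short signature, artifacts with the same signature share a bucket, and artifacts in very small buckets are reported as outliers. The feature sets (here, imported function names), the function names, and the thresholds are all hypothetical.

```python
import hashlib
from collections import defaultdict


def minhash_signature(features, num_hashes=4):
    """Return a MinHash signature: the minimum salted hash per hash function."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt per hash function
        signature.append(min(
            (hashlib.blake2b(f.encode("utf-8"), digest_size=8, salt=salt).digest()
             for f in features),
            default=b"",  # an empty feature set gets an empty signature entry
        ))
    return tuple(signature)


def find_outliers(artifacts, max_outlier_size=1, num_hashes=4):
    """Bucket artifacts by signature in one pass; small buckets are outliers.

    `artifacts` maps an identifier (e.g. a host name) to a set of string
    features extracted from the same memory artifact on that host.
    """
    buckets = defaultdict(list)
    for name, features in artifacts.items():
        buckets[minhash_signature(features, num_hashes)].append(name)
    return sorted(
        name
        for members in buckets.values()
        if len(members) <= max_outlier_size
        for name in members
    )
```

For instance, if the same process yields an identical feature set on five hosts and an unrelated one on a sixth, only the sixth host is returned. Using the whole signature as the bucket key is strict: similar but non-identical artifacts land together only with some probability, so a practical system would tune the number of hash functions or use several bands.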