David Elkind

CrowdStrike

Mitigating Adversarial Attacks against Machine Learning for Static Analysis (pdf, blog, video)

Computer security increasingly leverages machine learning to detect malware. This is not without risks. Machine learning methods have weaknesses that can be exploited by a savvy attacker. In the case of malware, adversaries have an enormous amount of control over how to accomplish their malicious goals in code; this flexibility allows malware authors to engineer PE files that can evade detection by machine learning.

In this presentation, we outline the high-level pipeline for identifying malware using machine learning and demonstrate an elementary strategy for evading detection by machine learning models. Even simple modifications to a PE file can make the file evade naive machine learning models. As a notional example, we append ASCII bytes to the overlay of a PE file; because appending bytes to the overlay is unlikely to change the operation of the executable, the malicious functionality is likely left intact by this modification. Moreover, we show that such evasion can be mitigated using a novel regularization technique.
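The overlay modification described above can be sketched in a few lines of Python. This is only an illustration, not the presenters' tooling: the function name `append_ascii_overlay`, the choice of alphabet, the padding length, and the seed are all assumptions, and the "file" below is a placeholder byte string rather than a real executable.

```python
import random
import string


def append_ascii_overlay(pe_bytes: bytes, n_bytes: int = 1024, seed: int = 0) -> bytes:
    """Return a copy of pe_bytes with n_bytes of printable ASCII appended.

    Hypothetical helper: data appended after the end of the file contents
    lands in the overlay, so the original bytes are left untouched.
    """
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    alphabet = string.ascii_letters + string.digits + " "
    padding = "".join(rng.choice(alphabet) for _ in range(n_bytes))
    return pe_bytes + padding.encode("ascii")


# Placeholder standing in for the raw bytes of a PE file (assumption).
original = b"MZ" + bytes(62)
twin = append_ascii_overlay(original, n_bytes=16)
```

The original file and its padded copy form the kind of pair that the mitigation discussed next relies on: same leading bytes, different length.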

Our novel strategy for mitigating evasions leverages the internal structure of a deep neural network for malware classification. Specifically, we add a penalty to the deep network’s objective function in proportion to the magnitude of the discrepancy between the hidden representations of a PE file and its corresponding modified version. This penalty encourages pairs of files (each original file is paired with a copy of itself that has ASCII bytes appended) to be given similar learned representations within the hidden layers of the network. We know that the “twins” must have the same functionality, so the network should give them a similar representation. We show that this regularization strategy results in a model that is much more robust to targeted file perturbations such as the ASCII bytes evasion strategy. Furthermore, we analyze the trade-offs researchers need to make between adversarial hardening and detection efficacy.
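One way to picture the penalty is as an extra term added to the usual classification loss. The sketch below is an assumption-laden illustration, not the presenters' formulation: the abstract does not say how the discrepancy is measured, so the squared Euclidean distance, the summation over layers, the binary cross-entropy loss, and the weight `lam` are all choices made here for concreteness, and every function name is hypothetical.

```python
import math


def binary_cross_entropy(p: float, y: int, eps: float = 1e-12) -> float:
    """Standard binary cross-entropy for one predicted probability p and label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def squared_distance(h_a, h_b) -> float:
    """Squared Euclidean distance between two activation vectors (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(h_a, h_b))


def regularized_loss(p_orig, p_twin, label, hidden_orig, hidden_twin, lam=0.1):
    """Classification loss on both twins plus lam times the hidden-layer discrepancy.

    hidden_orig and hidden_twin are lists of per-layer activation vectors for
    the original file and its padded twin; lam is a hypothetical weight.
    """
    classification = (binary_cross_entropy(p_orig, label)
                      + binary_cross_entropy(p_twin, label))
    discrepancy = sum(squared_distance(ho, ht)
                      for ho, ht in zip(hidden_orig, hidden_twin))
    return classification + lam * discrepancy


# Twins with identical hidden activations incur no penalty ...
same = regularized_loss(0.9, 0.9, 1, [[1.0, 2.0]], [[1.0, 2.0]], lam=0.5)
# ... while twins whose activations drift apart are penalized.
apart = regularized_loss(0.9, 0.9, 1, [[1.0, 2.0]], [[1.0, 4.0]], lam=0.5)
```

Under these assumptions, a larger `lam` pushes the twins' representations closer together at the cost of weighting the classification term less, which is one concrete form the hardening-versus-efficacy trade-off could take.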