Cheng Wang, Riddam Rishu, Akshay Kakkar, Abdul Rahman, Christopher Redino, Dhruv Nandakumar, Tyler Cody, Ryan Clark, Dan Radke and Edward Bowen

Enhancing Exfiltration Path Analysis Using Reinforcement Learning (pdf, video)

Building on previous work that used reinforcement learning (RL) to identify exfiltration paths, this work expands the methodology to include protocol and payload considerations. The earlier approach to exfiltration path discovery, in which reward and state are tied specifically to determining optimal paths, is extended with these additional realistic characteristics to account for nuances in adversarial behavior. The generated paths are enhanced by incorporating communication payload and protocol into the Markov decision process (MDP) so that they more realistically emulate attributes of network-based exfiltration events. The proposed method helps emulate complex adversarial considerations such as the size of a payload exported over time or the protocol over which the export occurs, as when threat actors steal data over long periods using system-native ports or protocols to avoid detection. As a result, practitioners will be able to identify expected adversary behavior more comprehensively under various payload and protocol assumptions.
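
To make the idea of folding payload and protocol into the MDP concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy four-node network, two illustrative protocols (`dns` as slow but quiet, `https` as fast but conspicuous), and made-up risk and rate numbers; every name and value here is hypothetical. For determinism it solves the MDP with plain value iteration rather than the RL agent the abstract describes. The state is the pair (current node, payload units remaining) and each action is the pair (next hop, protocol).

```python
EXIT = "exit"
# Hypothetical topology: node -> {neighbour: extra detection risk on that link}
LINKS = {
    "host": {"srv": 0.0, "dmz": 3.0},  # the direct host->dmz link is monitored
    "srv": {"dmz": 0.0},
    "dmz": {EXIT: 0.0},
}
# Hypothetical protocols: (payload units moved per step, detection risk per step)
PROTOCOLS = {"dns": (1, 0.2), "https": (4, 2.0)}


def step(state, action, step_cost):
    """Deterministic transition: returns (next_state, reward)."""
    node, remaining = state
    target, proto = action
    rate, risk = PROTOCOLS[proto]
    reward = -(step_cost + risk + LINKS[node][target])
    if target == EXIT:
        # Sending to the exit drains the payload; the agent stays where it is.
        return (node, max(0, remaining - rate)), reward
    # Lateral move: the payload travels with the agent.
    return (target, remaining), reward


def actions(node):
    """All (next hop, protocol) pairs available at a node."""
    return [(t, p) for t in LINKS[node] for p in PROTOCOLS]


def plan(payload, step_cost, start="host", sweeps=50):
    """Value iteration without discounting, then a greedy rollout from `start`."""
    states = [(n, r) for n in LINKS for r in range(payload + 1)]
    value = {s: 0.0 for s in states}

    def q(s, a):
        nxt, reward = step(s, a, step_cost)
        return reward + value[nxt]

    for _ in range(sweeps):
        for s in states:
            if s[1] > 0:  # remaining == 0 is terminal, so its value stays 0
                value[s] = max(q(s, a) for a in actions(s[0]))

    trace, state = [], (start, payload)
    while state[1] > 0:
        best = max(actions(state[0]), key=lambda a: q(state, a))
        trace.append((state[0], *best))  # (node, next hop, protocol)
        state, _ = step(state, best, step_cost)
    return trace


if __name__ == "__main__":
    for cost in (0.5, 0.1):
        print(cost, plan(payload=4, step_cost=cost))
```

Under these assumed numbers, a high per-step time cost (`step_cost=0.5`) makes the plan route around the monitored link and push the whole payload out in one `https` transfer, while a low time cost (`step_cost=0.1`) yields the "low and slow" pattern of repeated small `dns` transfers. The same path through the network thus carries different payload and protocol behavior depending on the assumptions encoded in the reward.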