Katie Paxton-Fear

Manchester Metropolitan University,

Duncan Hodges

Cranfield University,

and

Oliver Buckley

University of East Anglia

Visualising an insider threat incident from witness reports using natural language processing

Insider threats are security incidents committed not by outsiders such as APT groups, but by an organization's own employees or other trusted individuals. These attacks can often be more impactful than incidents committed by outsiders, as insiders may have valid security credentials, knowledge of business details, and knowledge of the security controls in place and, potentially, how to bypass them. This activity could be unintentional, such as an employee leaving a laptop on public transport, or malicious, when an insider purposefully chooses to attack for some gain, such as selling IP to a competitor. When an outsider attacks, they often leave digital breadcrumbs as they perform reconnaissance activities, which can make it easier to detect and respond to an incident. By comparison, an insider may be able to continue their attack for years before being caught by their employer. This is because insider threat activity is co-spatial and co-temporal with legitimate activity: an insider conducts their attack during the course of, or very soon after, their job. Insider threat related activity, such as accessing high-value files, can also be very similar to legitimate activity; a change in file access patterns could reflect a change in tasks and be benign rather than an insider threat. Finally, insiders, especially technical ones, have knowledge of security controls that allows them to go undetected. This can also be true of non-technical insiders, where a security control may be bypassed for convenience, such as leaving a laptop unlocked in public.

There are three key approaches to controlling the risk of malicious insider threat: organizational approaches, where the risk of an insider attack is mitigated by managers in an organization; technical approaches, which aim to highlight insider threat activity, usually by identifying it on the network using anomaly detection techniques; and psychological and social approaches, which aim to understand the insider by asking questions about the motivation behind the attack and any behavioural changes in the insider. As all insider threat activity has links to each of these approaches, insider threat models attempt to combine them into a single framework or model. Instead of attempting to supplant existing practices, this work supports them, providing a new tool for exploring an insider threat attack to better understand insider threat through the lens of strategic and tactical decision making. This work uses a large corpus of publicly available news articles relating to insider threat incidents. These documents are used to create a custom insider threat model, which can be adapted as new documents are added, making the model dynamic. This model can then be applied to a corpus of witness reports to visualize the model and give an overview of an incident. The custom insider threat model is created using topic modeling, which identifies key themes across documents by examining word association. By identifying themes across many different insider threat incidents, the core attributes of insider threat should be recognized, such as methodologies, motivation, information about the insider's role in the organization, or information about the organization's weaknesses. By combining this information with temporal, causal, and narrative clues, an incident can be visualized and placed on a graph, similar to existing insider threat models.
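The abstract does not say which topic modeling algorithm or library was used. As a minimal sketch of how themes can be found from word association, the example below assumes latent Dirichlet allocation fitted by collapsed Gibbs sampling, written with only the Python standard library. The function names (`fit_lda`, `top_words`), the hyperparameters, and the four-document toy corpus are all made up for illustration and are not taken from the work itself.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenised documents with collapsed Gibbs sampling (illustrative only)."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Give every token a random starting topic.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assign.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Take the token out of the counts before resampling it.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1

                # A topic is likely if it is common in this document
                # and if this word is common in that topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


def top_words(topic_word, n=4):
    """Return up to n of the most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical toy corpus standing in for tokenised news articles.
docs = [
    "employee copied files usb competitor".split(),
    "engineer sold designs competitor money".split(),
    "laptop left train unlocked accident".split(),
    "manager lost laptop taxi accident".split(),
]

doc_topic, topic_word = fit_lda(docs, num_topics=2)
themes = top_words(topic_word)
for topic_id, words in enumerate(themes):
    print(topic_id, words)
```

Words that tend to appear in the same documents are drawn toward the same topic, which is the sense in which themes emerge from word association. On a corpus this small the topics are not meaningful; a realistic pipeline would also need stop-word removal, a far larger corpus, and a principled choice of the number of topics.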

This system supports an investigator as they ask key questions about an incident, such as "What was the motivation of the insider?", "What assets did they target and how?", "Were there any security controls in place?", and "Did they bypass those?", and allows the investigator to visualize and explore the attack. Using the answers to these key questions, informed organizational changes can be made, for example limiting access to certain systems or recommending that new ones be deployed. This work has many implications for incident response and supports reflection on an individual incident. To extend this impact to practitioners, the approach could be implemented as a piece of software; at present, however, it is a proof of concept for using NLP in this way. Additionally, further tools could be introduced or developed to improve the creation of graphs at a macro level, such as co-reference resolution or improved merging. Aside from the direct impact in the insider threat domain, the methods developed and designed during this work also have an impact on cyber security and, more widely, on interdisciplinary research in the social sciences, particularly in the ability to leverage organic narratives and map these to an existing framework, rather than asking a witness to adapt their narrative to a framework directly. This enables reports to be collected on a large scale and analyzed as a whole, and a participant who does not wish to disclose something does not feel pressured to. This provides a holistic view of an attack, considering many aspects of an insider threat attack by using reports already collected after an incident to create a better understanding of insider threat as a whole. This, in turn, leads to more techniques for prevention and detection.