Machine learning and anomaly detection for insider threat detection
Abstract
With the ever-growing online presence of businesses, technology is ubiquitous in our
daily lives. Whether through social media, online shops, remote work, or talking to
a smart speaker to order dinner, cyber security will only grow in importance. While
external intrusions into information systems still constitute the vast majority of cyber
hazards, one type of threat has been growing in recent years and drawing more
attention from both the industrial and academic worlds: the threat from
within.
An insider threat arises when an employee, or anyone else with security access to an
organisation, harms the company: through fraud, theft, or spreading misinformation
and chaos among other employees. Insiders are already past the security perimeter of the
organisation, which gives them an advantage and often easy access to sensitive files. That
makes the insider threat a particularly dangerous and costly type of cyber crime.
Insider threat detection is a growing niche of cyber security. It faces many interesting
challenges, including the scarcity of real data for experiments, the inherent difficulty of
point anomaly detection, and the difficulties encountered when trying to model attacks.
These problems have played a large role in shaping the current landscape of insider
threat detection.
In this thesis we describe our work on various machine learning and anomaly detection
models for insider threat detection. We review the state of the art and the background
of insider threat detection. We describe the datasets we use, both real and synthetic,
and the attacks we generate and insert into benign data to further verify our models.
We outline three guidelines of good conduct regarding experiment protocol, motivated
by the literature and by experience gathered during this research: use multiple and
varied quality metrics to describe results; design and build models without prior
knowledge of the data; and report all known limitations and assumptions.
Of the models we developed, the ensembles and the local outlier factor (LOF) consistently
achieved the best results of the entire suite, comparable to the state of the art. Our
focus on multiple metrics helped us pinpoint situations where models performed worse
than the most popular metric, the area under the receiver operating characteristic curve
(AUC), would suggest. Our method for generating and inserting attacks proved a valuable
tool for verifying model results. Finally, for future research in the field we suggest
improving heterogeneous ensembles, using deep learning, and especially autoencoders,
for anomaly detection, and further work on data generation.