Deep Learning for Insider Threat Detection: Review, Challenges and Opportunities

Authors: Shuhan Yuan, Xintao Wu

Abstract: Insider threats, one of the most challenging types of threats in cyberspace, usually cause significant losses to organizations. Although insider threat detection has long been studied in both the security and data mining communities, traditional machine learning based detection approaches rely heavily on feature engineering and struggle to accurately capture the behavioral differences between insiders and normal users. The difficulty stems from characteristics of the underlying data, such as high dimensionality, complexity, heterogeneity, sparsity, and the lack of labeled insider threats, as well as from the subtle and adaptive nature of insider threats. Advanced deep learning techniques provide a new paradigm for learning end-to-end models from complex data. In this brief survey, we first introduce a commonly used dataset for insider threat detection and review the recent literature on deep learning for this task. Existing studies show that, compared with traditional machine learning algorithms, deep learning models can improve the performance of insider threat detection. However, applying deep learning to further advance insider threat detection still faces several limitations, such as the lack of labeled data and adaptive attacks. We then discuss these challenges and suggest future research directions that have the potential to address them and further boost the performance of deep learning for insider threat detection.

Submitted to arXiv on 25 May 2020
