Deep Anomaly Detection under Labeling Budget Constraints
Aodong Li, Chen Qiu, Marius Kloft, Padhraic Smyth, Stephan Mandt, Maja Rudolph
Selecting informative data points for expert feedback can significantly improve the performance of anomaly detection (AD) in various contexts, such as medical diagnostics or fraud detection. In this paper, we determine a set of theoretical conditions under which anomaly scores generalize from labeled queries to unlabeled data. Motivated by these results, we propose a data labeling strategy with optimal data coverage under labeling budget constraints. In addition, we propose a new learning framework for semi-supervised AD. Extensive experiments on image, tabular, and video data sets show that our approach results in state-of-the-art semi-supervised AD performance under labeling budget constraints.


