Kaixi Yang, Paul Miller and Jesus Martinez del Rincon
DIP-ECOD: Improving Anomaly Detection in Multimodal Distributions
Anomaly detection algorithms identify unusual events and outliers in large datasets where manual inspection is impractical. Most prior anomaly detection methods assume simple unimodal Gaussian data distributions and therefore produce suboptimal results on complex multimodal distributions. To address this problem, we propose DIP-ECOD, a novel unsupervised anomaly detection algorithm that generalises to both multimodal and unimodal distributions. DIP-ECOD integrates a dip test within the ECOD framework: SkinnyDip first splits a probability distribution into separate modes, and ECOD is then applied to each mode. In this way, difficult-to-find outliers that lie between modes or are hidden in the tails of each mode are also detected. Experiments on nine benchmark datasets across a range of domains, such as healthcare and imagery, demonstrate DIP-ECOD’s improved performance over ECOD in detecting outliers in both multimodal and unimodal distributions, with DIP-ECOD achieving an average AUC score of 0.791 compared to ECOD’s 0.761. Further, using a proprietary enterprise dataset, we show that DIP-ECOD effectively identifies anomalous GitHub commits, indicating its applicability to information security and software vulnerability detection, where multimodal distributions are expected.