Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

DADS (Distributed Detection of Sequential Anomalies in Univariate Time Series)

We built a scalable time series anomaly detector.

Authors

Johannes Schneider, Phillip Wenig, and Thorsten Papenbrock

Abstract

The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies as fast as possible on potentially very large time series. In other words, the detection needs to be effective, efficient, and scalable w.r.t. the input size. Series2Graph (S2G) is an effective solution based on graph embeddings that are robust against re-occurring anomalies; it can discover sequential anomalies of arbitrary length and works without training data. Yet, Series2Graph is not scalable due to its single-threaded approach; in particular, it cannot process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, DADS for short, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time series, the intermediate state, and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than Series2Graph, scales almost linearly with the number of processors in the cluster, and can process much larger input sequences due to its scale-out property.
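To make the idea of distributing an input series across processors more concrete, here is a minimal Python sketch. It is not the DADS implementation and does not use actors: it only illustrates one plausible way to cut a series into overlapping chunks so that every subsequence of a given length lies entirely within exactly one chunk and can be processed independently. The names `split_with_overlap`, `subsequence_sums`, and `distributed_sums` are hypothetical, a thread pool stands in for cluster nodes, and the per-subsequence sum is a placeholder for the real embedding step.

```python
from concurrent.futures import ThreadPoolExecutor


def split_with_overlap(n, window, parts):
    """Return (start, end) slices over a series of length n such that each
    length-`window` subsequence lies entirely inside exactly one slice."""
    n_sub = n - window + 1  # number of subsequences in the whole series
    if n_sub <= 0:
        return []
    parts = min(parts, n_sub)
    base, extra = divmod(n_sub, parts)
    ranges, first = [], 0
    for i in range(parts):
        count = base + (1 if i < extra else 0)  # subsequences owned by chunk i
        # The chunk is extended by window - 1 points so that its last
        # subsequence is complete; this is the overlap with the next chunk.
        ranges.append((first, first + count + window - 1))
        first += count
    return ranges


def subsequence_sums(chunk, window):
    """Placeholder per-worker computation (assumption): one value per subsequence."""
    return [sum(chunk[i:i + window]) for i in range(len(chunk) - window + 1)]


def distributed_sums(series, window, parts):
    """Process each chunk independently and concatenate the results in order."""
    ranges = split_with_overlap(len(series), window, parts)
    with ThreadPoolExecutor(max_workers=max(1, len(ranges))) as pool:
        results = pool.map(
            lambda r: subsequence_sums(series[r[0]:r[1]], window), ranges
        )
        return [value for part in results for value in part]
```

Because neighbouring chunks share `window - 1` points, the chunked result matches the single-process result exactly; the price is a small amount of duplicated input per worker.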

Links

Here you can find the Published Paper and the Source Code.

 

Datasets

We used the following datasets:

Name | Source
SED | Abdul-Aziz, Ali & Woike, Mark & Oza, Nikunj & Matthews, Bryan & Lekki, John. (2011). Rotor health monitoring combining spin tests and data-driven anomaly detection methods. Structural Health Monitoring, 11, 3-12. 10.1177/1475921710395811.
MBA | Goldberger, Ary & Amaral, Luís & Glass, L. & Hausdorff, J. & Ivanov, Plamen & Mark, R. & Mietus, J. & Moody, G. & Peng, Chung-Kang & Stanley, H. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101. Moody, G.B. & Mark, R.G. (2001). The impact of the MIT-BIH arrhythmia database. IEEE Engineering in Medicine and Biology Magazine, 20, 45-50. 10.1109/51.932724.
MV | Keogh, E. & Lin, J. & Fu, Ada. (2005). HOT SAX: Efficiently finding the most unusual time series subsequence. Proceedings - IEEE International Conference on Data Mining, ICDM. 8 pp. 10.1109/ICDM.2005.79.
DPD | Senin, Pavel & Lin, Jessica & Wang, Xing & Oates, Tim & Gandhi, Sunil & Boedihardjo, Arnold & Chen, Crystal & Frankenstein, Susan. (2015). Time series anomaly discovery with grammar-based compression. Wijk, Jarke & Selow, E. (1999). Cluster and calendar based visualization of time series data. 4-9, 140. 10.1109/INFVIS.1999.801851.
Synth | Boniol, Paul & Palpanas, Themis. (2020). Series2Graph: Graph-based Subsequence Anomaly Detection for Time Series. Proceedings of the International Conference on Very Large Databases.

 

References

Boniol, Paul & Palpanas, Themis. (2020). Series2Graph: Graph-based Subsequence Anomaly Detection for Time Series. Proceedings of the International Conference on Very Large Databases.