ABSTRACT
Many outlier detection methods do not merely provide a binary decision on whether a single data object is an outlier; they also give an outlier score or "outlier factor" signaling "how much" the respective data object is an outlier. A major problem for any user not well acquainted with the outlier detection method in question is how to interpret this "factor" in order to decide, from the numeric score, whether or not the data object is indeed an outlier. Here, we formulate a local density-based outlier detection method that provides an outlier "score" in the range [0, 1] that is directly interpretable as the probability of a data object being an outlier.
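The abstract does not spell out the formulation, so the following is only a minimal sketch of how a local density-based score can be squashed into [0, 1]. It assumes a brute-force Euclidean k-nearest-neighbour search, a scaling parameter `lam`, and an error-function normalization; the function names (`knn`, `local_outlier_probabilities`) and default values are hypothetical and chosen for illustration, not taken from the text above.

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i] (brute force, Euclidean)."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return [j for _, j in dists[:k]]


def local_outlier_probabilities(points, k=3, lam=3.0):
    """Return one score in [0, 1] per point (illustrative sketch, see assumptions above)."""
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]

    # Scaled root-mean-square distance from each point to its neighbours:
    # a rough estimate of how sparse the point's neighbourhood is.
    pdist = [
        lam * math.sqrt(sum(math.dist(points[i], points[j]) ** 2 for j in neigh[i]) / k)
        for i in range(n)
    ]

    # Compare each point's sparsity with the average sparsity of its neighbours.
    # Values near 0 mean "as dense as the neighbours"; large values mean "sparser".
    ratio = []
    for i in range(n):
        mean_neigh = sum(pdist[j] for j in neigh[i]) / k
        ratio.append(pdist[i] / mean_neigh - 1.0 if mean_neigh > 0 else 0.0)

    # Normalize by the typical magnitude of the ratios over the whole data set.
    scale = lam * math.sqrt(sum(v * v for v in ratio) / n)
    if scale == 0:
        return [0.0] * n  # all points look alike: nothing stands out

    # erf maps the normalized value into (-1, 1); negatives are clipped to 0.
    return [max(0.0, math.erf(v / (scale * math.sqrt(2.0)))) for v in ratio]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
    for point, score in zip(data, local_outlier_probabilities(data)):
        print(point, round(score, 3))
```

On the small example in the `__main__` block, the isolated point (10, 10) receives the highest score while the clustered points stay close to 0, which is the kind of directly readable output the abstract argues for.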
Index Terms
LoOP

Erich Schubert
Arthur Zimek

