Anomaly Detection with R

Overview

AnomalyDetection R package

AnomalyDetection is an open-source R package for detecting anomalies that is statistically robust in the presence of seasonality and an underlying trend. The package can be used in a wide variety of contexts: for example, detecting anomalies in system metrics after a new software release or in user engagement after an A/B test, or for problems in econometrics, financial engineering, and the political and social sciences.

How the package works

The underlying algorithm, referred to as Seasonal Hybrid ESD (S-H-ESD), builds upon the Generalized ESD test for detecting anomalies. S-H-ESD can detect both global and local anomalies. This is achieved by employing time series decomposition and using robust statistical metrics, viz., the median together with ESD. In addition, for long time series (say, 6 months of minutely data), the algorithm employs piecewise approximation, because trend extraction in the presence of anomalies is non-trivial.
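As a rough illustration of the idea described above (seasonal decomposition followed by an ESD test built on the median and the median absolute deviation), here is a minimal sketch in base R. It is not the package's implementation: the helper name robust_esd, the synthetic series, and the injected anomalies are all assumptions made for this example.

```r
# Sketch only: a generalized ESD test that uses median/MAD instead of mean/sd.
# `robust_esd` is a hypothetical helper, not a function of the package.
robust_esd <- function(resid, max_anoms = 0.02, alpha = 0.05) {
  n <- length(resid)
  k <- max(1L, floor(max_anoms * n))   # upper bound on the number of anomalies
  idx <- seq_len(n)                    # original positions of remaining points
  x <- resid
  removed <- integer(k)
  n_anoms <- 0L
  for (i in seq_len(k)) {
    spread <- mad(x)
    if (spread == 0) break
    ares <- abs(x - median(x)) / spread
    j <- which.max(ares)               # most extreme remaining point
    m <- n - i + 1                     # points remaining before this removal
    p <- 1 - alpha / (2 * m)
    tcrit <- qt(p, df = m - 2)
    lambda <- tcrit * (m - 1) / sqrt((m - 2 + tcrit^2) * m)
    removed[i] <- idx[j]
    if (ares[j] > lambda) n_anoms <- i # keep the largest i that passes the test
    x <- x[-j]
    idx <- idx[-j]
  }
  sort(removed[seq_len(n_anoms)])
}

# Synthetic hourly series with a daily cycle and two injected anomalies.
set.seed(42)
period <- 24
n <- period * 14
time_idx <- seq_len(n)
y <- 10 + 5 * sin(2 * pi * time_idx / period) + rnorm(n, sd = 0.5)
y[100] <- y[100] + 6
y[250] <- y[250] - 6

# Remove the seasonal component and the median, then test the remainder.
dec <- stl(ts(y, frequency = period), s.window = "periodic", robust = TRUE)
seasonal <- as.numeric(dec$time.series[, "seasonal"])
resid <- y - seasonal - median(y)
anoms <- robust_esd(resid, max_anoms = 0.02)
print(anoms)
```

Both injected points sit inside the overall range of the series, so a plain global threshold on y would miss them; they stand out only once the seasonal component has been removed.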

Besides time series, the package can also be used to detect anomalies in a vector of numerical values. We have found this very useful, as the corresponding timestamps are often not available. The package provides rich visualization support. The user can specify the direction of anomalies and the window of interest (such as the last day or the last hour), and can enable or disable piecewise approximation; additionally, the x- and y-axes are annotated in a way that assists visual data analysis.

How to get started

Install the R package using the following commands on the R console:

install.packages("devtools")
devtools::install_github("twitter/AnomalyDetection")
library(AnomalyDetection)

The function AnomalyDetectionTs detects one or more statistically significant anomalies in the input time series. Its documentation, which can be viewed with the following command, details the function's input arguments and output.

help(AnomalyDetectionTs)

The function AnomalyDetectionVec detects one or more statistically significant anomalies in a vector of observations. Its documentation, which can be viewed with the following command, details the function's input arguments and output.

help(AnomalyDetectionVec)

A simple example

To get started, we recommend using the example dataset that comes with the package. Execute the following commands:

data(raw_data)
res = AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', plot=TRUE)
res$plot

Fig 1

From the plot, we observe that the input time series experiences both positive and negative anomalies. Furthermore, many of the anomalies in the time series are local anomalies within the bounds of the time series' seasonality (hence, they cannot be detected using traditional approaches). The anomalies detected using the proposed technique are annotated on the plot. If the timestamps for the plot above were not available, anomaly detection could then be carried out using the AnomalyDetectionVec function; specifically, one can use the following command:

AnomalyDetectionVec(raw_data[,2], max_anoms=0.02, period=1440, direction='both', only_last=FALSE, plot=TRUE)
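Because a plain vector carries no timestamps, the seasonal period has to be supplied as a number of observations per cycle. The small sketch below shows how a value such as 1440 can arise; it assumes one observation per minute and a daily seasonal cycle, which is what the value used above suggests. The helper obs_per_cycle is hypothetical and not part of the package.

```r
# Hypothetical helper: number of observations in one seasonal cycle.
obs_per_cycle <- function(cycle_seconds, sampling_seconds) {
  cycle_seconds / sampling_seconds
}

# Assumption: minutely sampling (60 s) and a daily cycle (24 * 60 * 60 s).
period <- obs_per_cycle(cycle_seconds = 24 * 60 * 60, sampling_seconds = 60)
print(period)
```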

Often, anomaly detection is carried out on a periodic basis. For instance, at times, one may be interested in determining whether there was any anomaly yesterday. To this end, we support a flag only_last whereby one can subset the anomalies that occurred during the last day or last hour. Execute the following command:

res = AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', only_last="day", plot=TRUE)
res$plot

Fig 2

From the plot, we observe that only the anomalies that occurred during the last day have been annotated. Further, the prior six days are included to expose the seasonal nature of the time series but are put in the background as the window of prime interest is the last day.

Anomaly detection for long-duration time series can be carried out by setting the longterm argument to TRUE.
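To give a feel for what a piecewise approximation means here, the following base R sketch replaces a single global median with one median per fixed-length window, so that a level shift in a long series does not distort the baseline. This is only an illustration under assumed names and data: piecewise_median is a hypothetical helper, and the package's own windowing logic may differ.

```r
# Hypothetical helper, not part of the AnomalyDetection package.
piecewise_median <- function(y, window) {
  # Assign each observation to a consecutive, non-overlapping window.
  piece <- ceiling(seq_along(y) / window)
  # ave() returns, for every observation, the median of its own window.
  ave(y, piece, FUN = median)
}

set.seed(1)
y <- c(rnorm(100, mean = 0), rnorm(100, mean = 5))  # level shift halfway
y[50] <- 20                                          # one gross outlier
baseline <- piecewise_median(y, window = 100)
detrended <- y - baseline
print(unique(baseline))
```

With the package installed, the corresponding call on the example dataset would be AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', longterm=TRUE, plot=TRUE); the example dataset is short, so this only illustrates the argument.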

Copyright and License

Copyright 2015 Twitter, Inc and other contributors

Licensed under the GPLv3

Comments
  • Anomaly Detection from Data vs Image

    I was assigned a project to do anomaly detection on all our company KPIs. I googled and found AnomalyDetection by Twitter. A colleague suggested doing the anomaly detection on the graph images (comparing them with the previous week's images to identify anomalous points) instead of using the raw time-series data.

    I am not familiar with anomaly detection. Can anyone experienced here advise which approach is better (anomaly detection from data or from images) in terms of accuracy, storage, and processing time?

    opened by hscj87 0
  • ad_ts does not work with data.table

    I'm using a data set with several time series, stored as a data.table. In every iteration I filter by some condition:

    DT[var1 == x, c("date", "var2")]

    Error in rbindlist(l, use.names, fill, idcol) : Class attribute on column 1 of item 2 does not match with column 1 of item 1.

    This happens because the date column is stored as numeric(0), i.e.:

    all_anoms <- data.frame(timestamp = numeric(0), count = numeric(0))

    whereas the date column is required to be POSIXct/POSIXlt.

    opened by fedemolina 0
  • Cannot remove prior installation of package ‘Rcpp’?

    Error: Failed to install 'AnomalyDetection' from GitHub: (converted from warning) cannot remove prior installation of package ‘Rcpp’

    Which version of R is supported?

    opened by esride-jts 1
  • Definition of period in AnomalyDetectionVec !!!

    The data I have is monthly, from January 2010 to December 2019. I want to use AnomalyDetectionVec to find anomalies in the data. Should I set period = 12, or something else? Can someone explain in more detail how the period parameter works in AnomalyDetectionVec?

    opened by dbsxo2995 2
Releases(v1.0.0)
  • v1.0.0(Jan 6, 2015)

    Today, we’re announcing AnomalyDetection, our open-source R package that automatically detects anomalies like these in big data in a practical and robust way.

    https://blog.twitter.com/2015/introducing-practical-and-robust-anomaly-detection-in-a-time-series
