Industrial kNN-based anomaly detection for images. Visit the Streamlit link to check out the demo.

Overview

Industrial KNN-based Anomaly Detection

Now has streamlit support! Run $ streamlit run streamlit_app.py

This repo aims to reproduce the results of the following KNN-based anomaly detection methods:

  1. SPADE (Cohen et al. 2021) - kNN in z-space plus distance to feature maps
  2. PaDiM* (Defard et al. 2020) - distance to a multivariate Gaussian of feature maps
  3. PatchCore (Roth et al. 2021) - kNN distance to average-pooled feature maps

* PaDiM does not actually use a kNN mechanism, but shares much of the implementation with the other methods.
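
All three methods score anomalies by comparing test-time features against features extracted from healthy training images. A generic, minimal sketch of the image-level kNN idea (not the repo's code) looks like this:

import torch

def knn_anomaly_score(z_lib: torch.Tensor, z: torch.Tensor, k: int = 5) -> torch.Tensor:
    """z_lib: (N, D) embeddings of healthy images; z: (1, D) embedding of a test image."""
    distances = torch.linalg.norm(z_lib - z, dim=1)      # distance to every healthy embedding
    values, _ = torch.topk(distances, k, largest=False)  # keep the k nearest neighbours
    return values.mean()                                 # higher mean distance = more anomalous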


Install

$ pipenv install -r requirements.txt

Note: I used torch cu11 wheels.

Usage

CLI:

$ python indad/run.py METHOD [--dataset DATASET]

Results can be found under ./results/.

Code example:

from indad.model import SPADE

model = SPADE(k=5, backbone_name="resnet18")

# feed healthy dataset
model.fit(...)

# get predictions
img_lvl_anom_score, pxl_lvl_anom_score = model.predict(...)
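
A fuller end-to-end sketch, assuming the StreamingDataset helper and its add_pil_image method behave as in the issue snippets further down this page (the import path is a guess):

import os
from PIL import Image
from indad.model import SPADE
from indad.data import StreamingDataset  # assumed import path

model = SPADE(k=5, backbone_name="resnet18")

# feed healthy (defect-free) training images
train_dataset = StreamingDataset()
for root, _, files in os.walk("datasets/your_custom_dataset/train/good"):
    for file in files:
        train_dataset.add_pil_image(Image.open(os.path.join(root, file)))
model.fit(train_dataset)

# test_tensor: a preprocessed (1, C, H, W) image tensor
img_lvl_anom_score, pxl_lvl_anom_score = model.predict(test_tensor)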

Custom datasets


Check out one of the downloaded MVTec datasets for the expected layout. Image names should correspond across folders. Datasets without ground-truth pixel masks are not supported yet.

📂datasets
 ┗ 📂your_custom_dataset
  ┣ 📂 ground_truth/defective
  ┃ ┣ 📂 defect_type_1
  ┃ ┗ 📂 defect_type_2
  ┣ 📂 test
  ┃ ┣ 📂 defect_type_1
  ┃ ┣ 📂 defect_type_2
  ┃ ┗ 📂 good
  ┗ 📂 train/good
$ python indad/run.py METHOD --dataset your_custom_dataset

Results

📝 = paper, 👇 = this repo

Image-level

| class      | SPADE 📝 | SPADE 👇 | PaDiM 📝 | PaDiM 👇 | PatchCore 📝 | PatchCore 👇 |
|------------|---------|---------|---------|---------|-------------|-------------|
| bottle     | -       | 98.3    | 98.3    | 99.9    | 100.0       | 100.0       |
| cable      | -       | 88.1    | 96.7    | 87.8    | 99.5        | 96.2        |
| capsule    | -       | 80.4    | 98.5    | 87.6    | 98.1        | 95.3        |
| carpet     | -       | 62.5    | 99.1    | 99.5    | 98.7        | 98.7        |
| grid       | -       | 25.6    | 97.3    | 95.5    | 98.2        | 93.0        |
| hazelnut   | -       | 92.8    | 98.2    | 86.1    | 100.0       | 100.0       |
| leather    | -       | 85.6    | 99.2    | 100.0   | 100.0       | 100.0       |
| metal_nut  | -       | 78.6    | 97.2    | 97.6    | 100.0       | 98.3        |
| pill       | -       | 78.8    | 95.7    | 92.7    | 96.6        | 92.8        |
| screw      | -       | 66.1    | 98.5    | 79.6    | 98.1        | 96.7        |
| tile       | -       | 96.4    | 94.1    | 99.5    | 98.7        | 99.0        |
| toothbrush | -       | 83.9    | 98.8    | 94.7    | 100.0       | 98.1        |
| transistor | -       | 89.4    | 97.5    | 95.0    | 100.0       | 99.7        |
| wood       | -       | 85.3    | 94.7    | 99.4    | 99.2        | 98.8        |
| zipper     | -       | 97.1    | 98.5    | 93.8    | 99.4        | 98.4        |
| averages   | 85.5    | 80.6    | 97.5    | 93.9    | 99.1        | 97.7        |

Pixel-level

| class      | SPADE 📝 | SPADE 👇 | PaDiM 📝 | PaDiM 👇 | PatchCore 📝 | PatchCore 👇 |
|------------|---------|---------|---------|---------|-------------|-------------|
| bottle     | 97.5    | 97.7    | 94.8    | 97.6    | 98.6        | 97.8        |
| cable      | 93.7    | 94.4    | 88.8    | 95.5    | 98.5        | 97.4        |
| capsule    | 97.6    | 98.7    | 93.5    | 98.1    | 98.9        | 98.3        |
| carpet     | 87.4    | 99.0    | 96.2    | 98.7    | 99.1        | 98.3        |
| grid       | 88.5    | 96.4    | 94.6    | 96.4    | 98.7        | 96.7        |
| hazelnut   | 98.4    | 98.4    | 92.6    | 97.3    | 98.7        | 98.1        |
| leather    | 97.2    | 99.1    | 97.8    | 98.6    | 99.3        | 98.4        |
| metal_nut  | 99.0    | 96.1    | 85.6    | 95.8    | 98.4        | 96.2        |
| pill       | 99.1    | 93.5    | 92.7    | 94.4    | 97.6        | 98.7        |
| screw      | 98.1    | 98.9    | 94.4    | 97.5    | 99.4        | 98.4        |
| tile       | 96.5    | 93.1    | 86.0    | 92.6    | 95.9        | 94.0        |
| toothbrush | 98.9    | 98.9    | 93.1    | 98.5    | 98.7        | 98.1        |
| transistor | 97.9    | 95.8    | 84.5    | 96.9    | 96.4        | 97.5        |
| wood       | 94.1    | 94.5    | 91.1    | 92.9    | 95.1        | 91.9        |
| zipper     | 96.5    | 98.3    | 95.9    | 97.0    | 98.9        | 97.6        |
| averages   | 96.9    | 96.6    | 92.1    | 96.5    | 98.1        | 97.2        |

PatchCore-10 (10% coreset subsampling, f_coreset: 0.1) was used.

Hyperparams

The following parameters were used to calculate the results. They more or less correspond to the parameters used in the papers.

spade:
  backbone: wide_resnet50_2
  k: 50
padim:
  backbone: wide_resnet50_2
  d_reduced: 250
  epsilon: 0.04
patchcore:
  backbone: wide_resnet50_2
  f_coreset: 0.1
  n_reweight: 3

Progress

  • Datasets
  • Code skeleton
  • Config files
  • CLI
  • Logging
  • SPADE
  • PADIM
  • PatchCore
  • Add custom dataset option
  • Add dataset progress bar
  • Add schematics
  • Unit tests

Design considerations

  • Images are processed one at a time to avoid interference from batch statistics.
  • I decided to implement greedy k-center from scratch; there is room for improvement.
  • torch.nn.AdaptiveAvgPool2d is used for feature map resizing and torch.nn.functional.interpolate for score map resizing (see the sketch after this list).
  • GPU is used for backbones and coreset selection. GPU coreset selection currently runs at:
    • 400-500 it/s @ float32 (RTX3080)
    • 1000+ it/s @ float16 (RTX3080)
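
A minimal illustration of that resizing step (shapes and layers are illustrative, not the repo's exact values):

import torch
import torch.nn.functional as F

fmap_a = torch.rand(1, 256, 28, 28)  # hypothetical layer-2 feature map
fmap_b = torch.rand(1, 512, 14, 14)  # hypothetical layer-3 feature map

resize = torch.nn.AdaptiveAvgPool2d((28, 28))              # align feature maps to one grid
aligned = torch.cat([resize(fmap_a), resize(fmap_b)], 1)   # (1, 768, 28, 28)

score_map = torch.linalg.norm(aligned, dim=1, keepdim=True)             # stand-in anomaly map
score_map = F.interpolate(score_map, size=(224, 224), mode="bilinear")  # back to image size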

Acknowledgements

  • hcw-00 for the tip about sklearn.random_projection.SparseRandomProjection

References

SPADE:

@misc{cohen2021subimage,
      title={Sub-Image Anomaly Detection with Deep Pyramid Correspondences}, 
      author={Niv Cohen and Yedid Hoshen},
      year={2021},
      eprint={2005.02357},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

PaDiM:

@misc{defard2020padim,
      title={PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization}, 
      author={Thomas Defard and Aleksandr Setkov and Angelique Loesch and Romaric Audigier},
      year={2020},
      eprint={2011.08785},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

PatchCore:

@misc{roth2021total,
      title={Towards Total Recall in Industrial Anomaly Detection}, 
      author={Karsten Roth and Latha Pemula and Joaquin Zepeda and Bernhard Schölkopf and Thomas Brox and Peter Gehler},
      year={2021},
      eprint={2106.08265},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Comments
  • Fix for ValueError: continuous format is not supported

    In indad/model.py, line 78, the mask must be converted to an int type; otherwise roc_auc_score raises ValueError: continuous format is not supported.

    Here is the solution: I changed the mask type from float to int:

    pixel_labels.extend(mask.flatten().numpy().astype(int))

    The original function looks like this:

    	def evaluate(self, test_ds: VisionDataset) -> Tuple[float, float]:
    		"""Calls predict step for each test sample."""
    		image_preds = []
    		image_labels = []
    		pixel_preds = []
    		pixel_labels = []
    
    		for sample, mask, label in tqdm(test_ds, **TQDM_PARAMS):
    			z_score, fmap = self.predict(sample.unsqueeze(0))
    			
    			image_preds.append(z_score.numpy())
    			image_labels.append(label)
    			
    			pixel_preds.extend(fmap.flatten().numpy())
    			pixel_labels.extend(mask.flatten().numpy())
    			
    		image_preds = np.stack(image_preds)
    
    		image_rocauc = roc_auc_score(image_labels, image_preds)
    
    		pixel_rocauc = roc_auc_score(pixel_labels, pixel_preds)
    
    		return image_rocauc, pixel_rocauc
    
    bug 
    opened by nixczhou 9
  • What's the neighbourhood size in the PatchCore implementation?

    Hello, and thanks for your contribution. Really good job!

    Trying your code, I am wondering whether the n_reweight attribute is the neighbourhood size described in the paper.

    opened by mjack3 5
  • PADIM method consumes too much memory

    Hi there,

    I am using the pipeline with the PaDiM method and backbone_name="wide_resnet50_2". However, I see memory usage rise sharply when I check it with htop in another terminal. I don't have the same experience with the SPADE method. [htop screenshot]

    As the screenshot shows, it consumes up to 121 GB of memory. I even tried to reduce the 'd_reduced' parameter, but it is still around 100 GB. May I ask your opinion on how to improve this? Did you also face the same issue?

    Best

    opened by Mshz2 5
  • Converting a PyTorch model to TorchScript

    I'm trying to convert the PyTorch model to TorchScript, and I want to know whether the detection speed of the converted model changes. Neither method in the documentation works for me: https://pytorch.org/tutorials/advanced/cpp_export.html

    model = SPADE(k=42)  # , backbone_name="hypernet")
    train_dataset = StreamingDataset()
    test_dataset = StreamingDataset()
    app_custom_train_images =  "D:\\cwge\\Dataset\\bottle\\train\\good"
    # train images
    for root, dirs, files in os.walk(app_custom_train_images):
        for file in files:
            train_dataset.add_pil_image(Image.open(os.path.join(root, file)))
    model.fit(train_dataset)
    PATH = "test.pth"
    torch.save(model.state_dict(), PATH)
    # model.load_state_dict(torch.load(PATH))
    #example = torch.rand(1, 3, 224, 224).cuda()
    #traced_script_module = torch.jit.trace(model, example)
    script_net = torch.jit.script(model)
    script_net.save('model_script.pt')
    #traced_script_module.save("traced_resnet_model.pt")
    

    My C++ code reports an error when loading the model.

    #include <iostream>
    #include "torch/script.h"
    #include <vector>

    int main()
    {
        torch::jit::script::Module module = torch::jit::load("D:\\model_script.pt");
    }
    

    Unhandled exception at 0x00007FFB76BA4B89 in ConsoleApplication1.exe: Microsoft C++ exception: c10::NotImplementedError at memory location 0x00000095C3EFDD50.
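
    One hedged direction (untested here, not the repo's method): torch.jit.script tends to fail on the full SPADE wrapper because its predict logic is plain Python, but the expensive part, the CNN backbone, can usually be traced on its own and loaded from C++. The timm usage below is an assumption based on the backbone names in this repo.

    import timm
    import torch

    # Assumption: the feature extractor is a timm backbone built with features_only=True.
    backbone = timm.create_model("resnet18", pretrained=True, features_only=True).eval()
    example = torch.rand(1, 3, 224, 224)
    # strict=False because the backbone returns a list of feature maps.
    traced = torch.jit.trace(backbone, example, strict=False)
    traced.save("backbone_traced.pt")  # loadable from C++ via torch::jit::load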

    opened by x12901 4
  • PatchCore: the value of k in the nearest neighbour search

    Thanks for your efforts. I've noticed that in the PatchCore code, when doing the k-nearest-neighbour search, https://github.com/rvorias/ind_knn_ad/blob/5510f0f4bc1fc9688e57d20023985087f90f0f4d/indad/models.py#L286 sets k to 3. Can I ask how the value of k was determined? Can I set k=1?

    opened by ghost 3
  • Defining new threshold for new dataset

    Hi there, thanks for sharing your code publicly. I am trying to follow the custom dataset workflow. My question is about defining my own threshold based on the required precision/recall, as you mentioned here. I am quite a newbie and would like to use your repo for my thesis. May I ask for a quick clue about where in the code, and how, I can set a new threshold?

    After training with the SPADE method, I tested the model on an anomalous test image and got an image anomaly score of tensor(8.3064), but I obtained a similar value for a good test image.

    By the way, as you mentioned, I defined a custom dataset folder, but do we also need to define a new dataloader? The one already used in run.py is for MVTecDataset:

    train_ds, test_ds = MVTecDataset(cls).get_dataloaders()

    My dataset folder structure is:

    -custom_dataset
    --train
    ---good
    --test
    ---good
    ---bad
    --ground_truth
    ---bad

    I really appreciate your help and effort.
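
    For reference, a hedged sketch of reusing the existing loader for a custom folder; the import path and the constructor argument are assumptions based on run.py and the --dataset flag in the README, not a confirmed API.

    from indad.data import MVTecDataset  # assumed import path

    # Assumes MVTecDataset resolves the name under ./datasets/, as the
    # `--dataset your_custom_dataset` CLI option implies.
    train_ds, test_ds = MVTecDataset("your_custom_dataset").get_dataloaders()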

    opened by Mshz2 3
  • New performance and improvement in PatchCore

    Hello.

    I tested your PatchCore implementation, and I need to point out a discrepancy with the table of results. I tested the implementation with a 1% coreset and different B values, taking the mean as the final result.

    [results screenshot]

    You can see that the performance is better than what is reported in your table.

    Moreover, if we use AvgPool2d(3, 1, 1) instead of AvgPool2d(3, 2, 0) with Lanczos interpolation during training, the results improve further. In fact, the paper says that increasing the stride decreases performance.

    [screenshot]

    Here are the results using this new avgpooling:

    [results screenshot]

    opened by mjack3 2
  • Heatmap Manual adjustment option on Streamlit.

    Thank you for sharing your streamlit code.

    It was very easy to check my data. [screenshot]

    But I get a red pattern even on a normal image. [screenshot]

    This is because, in the Streamlit app, the heatmap colour range is decided per image. I think it is better to optionally fix the range.

    So I implemented it. If you don't check the box, the behaviour is the same as before. [screenshot]

    If you like it, please merge it.
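
    For reference, a hypothetical minimal version of that behaviour (not the actual PR): pin the colour range with a Streamlit checkbox so every score map is coloured on the same scale. pxl_lvl_anom_score is the pixel-level map returned by model.predict, and the bounds are illustrative.

    import matplotlib.pyplot as plt
    import streamlit as st

    # pxl_lvl_anom_score comes from: _, pxl_lvl_anom_score = model.predict(tensor)
    use_fixed_range = st.checkbox("Fix heatmap range")
    vmin, vmax = (0.0, 30.0) if use_fixed_range else (None, None)  # illustrative bounds

    fig, ax = plt.subplots()
    ax.imshow(pxl_lvl_anom_score.squeeze().numpy(), cmap="jet", vmin=vmin, vmax=vmax)
    ax.axis("off")
    st.pyplot(fig)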

    opened by h1day 2
  • Some images will be lost due to detection

    My images are 1280*1024 and I use the command streamlit run streamlit_app.py. The result is very good, but part of my image is missing: the displayed result is not the complete picture. Can the cropping be changed? I tried to modify the code, but the result was not good. Can the detection speed be improved? Can I just load the model without training it every time?

    class SPADE(KNNExtractor):
        def __init__(
                self,
                k: int = 5,
                backbone_name: str = "resnet50",
        ):
            super().__init__(
                backbone_name=backbone_name,
                out_indices=(1, 2, 3),
                pool=True,
            )
            self.k = k
            self.image_size_x = 1280
            self.image_size_y = 1024
            self.z_lib = []
            self.feature_maps = []
            self.threshold_z = None
            self.threshold_fmaps = None
            self.blur = GaussianBlur(4)
    
        def predict(self, sample):
            feature_maps, z = self(sample)
    
            distances = torch.linalg.norm(self.z_lib - z, dim=1)
            values, indices = torch.topk(distances.squeeze(), self.k, largest=False)
    
            z_score = values.mean()
    
            # Build the feature gallery out of the k nearest neighbours.
        # The authors might have concatenated all feature maps first, then checked the minimum norm per pixel.
        # Here, we check for the minimum norm first, then concatenate (sum) in the final layer.
            scaled_s_map = torch.zeros(1, 1, self.image_size_y, self.image_size_x)
            for idx, fmap in enumerate(feature_maps):
                nearest_fmaps = torch.index_select(self.feature_maps[idx], 0, indices)
                # min() because kappa=1 in the paper
                s_map, _ = torch.min(torch.linalg.norm(nearest_fmaps - fmap, dim=1), 0, keepdims=True)
                scaled_s_map += torch.nn.functional.interpolate(
                    s_map.unsqueeze(0), size=(self.image_size_y, self.image_size_x), mode='bilinear'
                )
    
            scaled_s_map = self.blur(scaled_s_map)
    
            return z_score, scaled_s_map
    
    opened by x12901 2
  • Too many training images, memory overflow

    Hi, great project! I have 8000 images, and I found that memory use increased a lot during training. My computer has 60 GB of RAM, but it is still not enough.

    model = SPADE(k=42)  # , backbone_name="hypernet")
    train_dataset = StreamingDataset()
    app_custom_train_images = "D:\\train\\good"
    # train images
    for root, dirs, files in os.walk(app_custom_train_images):
        for file in files:
            train_dataset.add_pil_image(Image.open(os.path.join(root, file)))
    model.fit(train_dataset)
    PATH = "test.pth"
    torch.save(model.state_dict(), PATH)
    
    pil_img = Image.open("20210115_raw.png")  # was plt.open, which does not exist
    img_pil_1 = np.array(pil_img)
    tensor_pil = torch.from_numpy(np.transpose(img_pil_1, (2, 0, 1)))
    img = tensor_pil.type(torch.float32)
    img = img.unsqueeze(0)
    pil_to_tensor = img
    img_lvl_anom_score, pxl_lvl_anom_score = model.predict(pil_to_tensor)
    

    The test picture does not seem to need to be transformed. Does it support pictures in other formats?

    good first issue 
    opened by x12901 2
  • Overwrite yml file but it does not exist.

    Test results: zipper - image_rocauc: 0.94, pixel_rocauc: 0.98

    Traceback (most recent call last):
      File "D:\Code\ind_knn_ad-master\indad\run.py", line 82, in <module>
        cli_interface()
      File "D:\ProgramData\Anaconda3\envs\env1\lib\site-packages\click\core.py", line 1137, in __call__
        return self.main(*args, **kwargs)
      File "D:\ProgramData\Anaconda3\envs\env1\lib\site-packages\click\core.py", line 1062, in main
        rv = self.invoke(ctx)
      File "D:\ProgramData\Anaconda3\envs\env1\lib\site-packages\click\core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "D:\ProgramData\Anaconda3\envs\env1\lib\site-packages\click\core.py", line 763, in invoke
        return __callback(*args, **kwargs)
      File "D:\Code\ind_knn_ad-master\indad\run.py", line 79, in cli_interface
        write_results(total_results, method)
      File "D:\Code\ind_knn_ad-master\indad\utils.py", line 65, in write_results
        with open(f"./results/{name}.yml", "w") as outfile:
    FileNotFoundError: [Errno 2] No such file or directory: './results/spade_26_07_2021_22_47_37.yml'

    good first issue 
    opened by leolin65 2
  • Over resource limits on Streamlit Cloud

    Hey, the app is currently not responding.

    By the way, do you have a tutorial on how to make the app run on a Windows machine? I'm new to machine learning and want to test anomaly detection on pictures of industrial parts.

    opened by fIomingo 0
  • Over resource limits on Streamlit Cloud

    opened by vrojkova 0
  • Over resource limits on Streamlit Cloud

    opened by kotai2003 1
  • When creating the PatchCore patch_lib variable, can we calculate it in batch units?

    Hi! The more image samples there are, the larger 28 (Fmap_H) * 28 (Fmap_W) * num_samples grows, so the projection becomes more and more expensive. Is there any way to reduce it? (See the sketch after the code below.)

    def fit(self, train_dl):
        for sample, _ in tqdm(train_dl, **get_tqdm_params()):
            feature_maps = self(sample)

            if self.resize is None:
                largest_fmap_size = feature_maps[0].shape[-2:]
                self.resize = torch.nn.AdaptiveAvgPool2d(largest_fmap_size)
            resized_maps = [self.resize(self.average(fmap)) for fmap in feature_maps]
            patch = torch.cat(resized_maps, 1)
            patch = patch.reshape(patch.shape[1], -1).T

            self.patch_lib.append(patch)

        self.patch_lib = torch.cat(self.patch_lib, 0)

        if self.f_coreset < 1:
            self.coreset_idx = get_coreset_idx_randomp(
                self.patch_lib,
                n=int(self.f_coreset * self.patch_lib.shape[0]),
                eps=self.coreset_eps,
            )
            self.patch_lib = self.patch_lib[self.coreset_idx]
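
    One memory-saving direction, sketched under assumptions (patch_lib is an (N, D) CPU float tensor, and sklearn's SparseRandomProjection, credited in the acknowledgements above, is acceptable as a stand-in): fit the projection on a subsample, then transform the library chunk by chunk so a full projected copy never has to be materialised at once. This is not the repo's code.

    import torch
    from sklearn.random_projection import SparseRandomProjection

    def project_in_chunks(patch_lib: torch.Tensor, chunk_size: int = 10_000) -> torch.Tensor:
        """Project an (N, D) patch library chunk by chunk to limit peak memory."""
        rp = SparseRandomProjection(eps=0.9, dense_output=True)
        rp.fit(patch_lib[:chunk_size].cpu().numpy())  # fit on a subsample to pick n_components
        chunks = [
            torch.from_numpy(rp.transform(patch_lib[i:i + chunk_size].cpu().numpy()))
            for i in range(0, patch_lib.shape[0], chunk_size)
        ]
        return torch.cat(chunks, 0)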
    
    opened by ingbeeedd 1
  • Model Save&Load&Test

    How can I save the model, load it, and test it on one image?

    PATH = '/home/agteks/Desktop/ind_knn_ad/model.pt'
    model = PatchCore(f_coreset=.01, backbone_name="efficientnet_b0", coreset_eps=.95)

    model.load_state_dict(torch.load('/home/agteks/Desktop/ind_knn_ad/model.pt'))
    model.eval()
    model = model.to('cuda')

    # sample, *_ = test_dataset[st.session_state.test_idx]
    pil_img = Image.open("/home/agteks/Desktop/ind_knn_ad/datasets/fabric_4/test/broken_large/Cropped_2_276_46_768.png")
    pil_img = pil_img.resize((224, 224))
    pil_to_tensor = transforms.ToTensor()(pil_img).unsqueeze_(0)
    img_lvl_anom_score, pxl_lvl_anom_score = model.predict(pil_to_tensor)
    score_range = pxl_lvl_anom_score.min(), pxl_lvl_anom_score.max()
    show_pred(pil_to_tensor, img_lvl_anom_score, pxl_lvl_anom_score)

    This code is not working. Please help me :(
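
    A hedged workaround sketch (not an official API): the memory banks such as z_lib and patch_lib appear as plain attributes in the snippets above, so state_dict() may not include them; pickling the whole fitted object with torch.save side-steps that. The paths, the earlier train_dataset, and the pil_to_tensor preprocessing are reused from the snippets above.

    import torch
    from indad.model import PatchCore  # import path as in the README example

    model = PatchCore(f_coreset=.01, backbone_name="efficientnet_b0", coreset_eps=.95)
    model.fit(train_dataset)                # train_dataset built as in the snippets above
    torch.save(model, "patchcore_full.pt")  # pickles the entire fitted model

    model = torch.load("patchcore_full.pt")
    img_lvl_anom_score, pxl_lvl_anom_score = model.predict(pil_to_tensor)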

    opened by caglarerennn 4