Official PyTorch implementation for "Mixed supervision for surface-defect detection: from weakly to fully supervised learning"

Overview

Mixed supervision for surface-defect detection: from weakly to fully supervised learning [Computers in Industry 2021]

Official PyTorch implementation of "Mixed supervision for surface-defect detection: from weakly to fully supervised learning", published in the journal Computers in Industry, 2021.

The same code is also an official implementation of the method used in "End-to-end training of a two-stage neural network for defect detection", published at the International Conference on Pattern Recognition 2020.

Citation

Please cite our Computers in Industry 2021 paper when using this code:

@article{Bozic2021COMIND,
  author = {Bo{\v{z}}i{\v{c}}, Jakob and Tabernik, Domen and 
  Sko{\v{c}}aj, Danijel},
  journal = {Computers in Industry},
  title = {{Mixed supervision for surface-defect detection: from weakly to fully supervised learning}},
  year = {2021}
}

How to run:

Requirements

Code has been tested to work on:

  • Python 3.8
  • PyTorch 1.6, 1.8
  • CUDA 10.0, 10.1
  • using additional packages as listed in requirements.txt

Datasets

You will need to download the datasets yourself. For DAGM and Severstal Steel Defect Dataset you will also need a Kaggle account.

  • DAGM available here.
  • KolektorSDD available here.
  • KolektorSDD2 available here.
  • Severstal Steel Defect Dataset available here.

For details about the data structure, refer to README.md in the datasets folder.

Cross-validation splits, train/test splits and weakly/fully labeled splits for all datasets are located in the splits directory of this repository, alongside instructions on how to use them.

Using on other data

Refer to README.md in datasets for instructions on how to use the method on other datasets.

Demo - fully supervised learning

To run fully supervised learning and evaluation on all four datasets run:

./DEMO.sh
# or by specifying multiple GPU ids 
./DEMO.sh 0 1 2

Results will be written to ./results folder.

Replicating paper results

To replicate the results published in the paper run:

./EXPERIMENTS_COMIND.sh
# or by specifying multiple GPU ids 
./EXPERIMENTS_COMIND.sh 0 1 2

To replicate the results from ICPR 2020 paper:

@misc{Bozic2020ICPR,
    title={End-to-end training of a two-stage neural network for defect detection},
    author={Jakob Božič and Domen Tabernik and Danijel Skočaj},
    year={2020},
    eprint={2007.07676},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

run:

./EXPERIMENTS_ICPR.sh
# or by specifying multiple GPU ids 
./EXPERIMENTS_ICPR.sh 0 1 2

Results will be written to ./results-comind and ./results-icpr folders.

Usage of training/evaluation code

The following python files are used to train/evaluate the model:

  • train_net.py Main entry for training and evaluation
  • models.py Model file for network
  • data/dataset_catalog.py Contains currently supported datasets

In order to train and evaluate a network you can also use EXPERIMENTS_ROOT.sh, which contains several functions that will make training and evaluation easier for you. For more details see the file EXPERIMENTS_ROOT.sh.

Running code

The simplest way to train and evaluate a network is to use EXPERIMENTS_ROOT.sh; see EXPERIMENTS_ICPR.sh and EXPERIMENTS_COMIND.sh for examples of its use.

Alternatively, you can run train_net.py directly and pass the parameters as keyword arguments. Below is an example of how to train a model for a single fold of the KSDD dataset.

python -u train_net.py  \
    --GPU=0 \
    --DATASET=KSDD \
    --RUN_NAME=RUN_NAME \
    --DATASET_PATH=/path/to/dataset \
    --RESULTS_PATH=/path/to/save/results \
    --SAVE_IMAGES=True \
    --DILATE=7 \
    --EPOCHS=50 \
    --LEARNING_RATE=1.0 \
    --DELTA_CLS_LOSS=0.01 \
    --BATCH_SIZE=1 \
    --WEIGHTED_SEG_LOSS=True \
    --WEIGHTED_SEG_LOSS_P=2 \
    --WEIGHTED_SEG_LOSS_MAX=1 \
    --DYN_BALANCED_LOSS=True \
    --GRADIENT_ADJUSTMENT=True \
    --FREQUENCY_SAMPLING=True \
    --TRAIN_NUM=33 \
    --NUM_SEGMENTED=33 \
    --FOLD=0

Some of the datasets do not require you to specify --TRAIN_NUM or --FOLD. After training, each model is also evaluated.

For KSDD you need to combine the evaluation results from all three folds. You can do this with join_folds_results.py:

python -u join_folds_results.py \
    --RUN_NAME=SAMPLE_RUN \
    --RESULTS_PATH=/path/to/save/results \
    --DATASET=KSDD 
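
The per-fold training and the final join can also be scripted. The following is a minimal sketch, not part of the repository: the helper names build_train_command and build_join_command are hypothetical, the flags are copied from the examples above, and fold indices 0 to 2 are an assumption based on the example using --FOLD=0 and the mention of three folds.

```python
import shlex


def build_train_command(fold, run_name, dataset_path, results_path, gpu=0):
    """Return the train_net.py argument list for one KSDD fold (hypothetical helper)."""
    return [
        "python", "-u", "train_net.py",
        f"--GPU={gpu}",
        "--DATASET=KSDD",
        f"--RUN_NAME={run_name}",
        f"--DATASET_PATH={dataset_path}",
        f"--RESULTS_PATH={results_path}",
        "--SAVE_IMAGES=True",
        "--DILATE=7",
        "--EPOCHS=50",
        "--LEARNING_RATE=1.0",
        "--DELTA_CLS_LOSS=0.01",
        "--BATCH_SIZE=1",
        "--WEIGHTED_SEG_LOSS=True",
        "--WEIGHTED_SEG_LOSS_P=2",
        "--WEIGHTED_SEG_LOSS_MAX=1",
        "--DYN_BALANCED_LOSS=True",
        "--GRADIENT_ADJUSTMENT=True",
        "--FREQUENCY_SAMPLING=True",
        "--TRAIN_NUM=33",
        "--NUM_SEGMENTED=33",
        f"--FOLD={fold}",
    ]


def build_join_command(run_name, results_path):
    """Return the join_folds_results.py argument list (hypothetical helper)."""
    return [
        "python", "-u", "join_folds_results.py",
        f"--RUN_NAME={run_name}",
        f"--RESULTS_PATH={results_path}",
        "--DATASET=KSDD",
    ]


if __name__ == "__main__":
    # Assumed fold indices: 0, 1, 2.
    for fold in range(3):
        cmd = build_train_command(fold, "SAMPLE_RUN", "/path/to/dataset", "/path/to/save/results")
        print(shlex.join(cmd))
    print(shlex.join(build_join_command("SAMPLE_RUN", "/path/to/save/results")))
```

The script only prints the commands. To execute them, each list can be passed to subprocess.run(cmd, check=True) from the repository root.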

You can use read_results.py to generate a table of results for all runs for a selected dataset.

Note: The model is sensitive to random initialization and to data shuffling during training, so performance will differ between runs unless --REPRODUCIBLE_RUN is set.

Comments
  • torch.jit.trace error

    Thanks for your great project. I need to use torch.jit.trace to convert the model to a libtorch (C++) model, with the code below:

    model.to(device)
    model.cuda()
    model.eval()
    example = torch.rand(1, 3, INPUT_HEIGHT, INPUT_WIDTH).cuda()
    traced_script_module = torch.jit.trace(model, example)
    traced_script_module.save("model_for_libtorch.pt")

    RuntimeError: Could not export Python function call 'GradientMultiplyLayer'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to constants:

    opened by jjqcat 5
  • It seems there is no evaluation of segmentation

    First, thanks for making this nice code public!

    I wonder whether the AP metric in the paper refers to classification rather than semantic segmentation. In your "end2end.py", L246, it seems that only "predictions", which is the classification result, is evaluated.

    By the way, I think "weak supervision" is not a precise term in your mixed-supervision setting: if the final evaluation covers only classification, then class-level and pixel-level labels are both full supervision for the classification task.

    opened by HustHB 3
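
For readers following this question: an image-level average precision can be computed from per-image classification scores alone, which is the kind of evaluation the question describes. The sketch below is a generic, non-interpolated AP written in plain Python. The function name and the sample numbers are illustrative and are not taken from the repository, whose own evaluation code may differ.

```python
def average_precision(labels, scores):
    """Non-interpolated AP: mean of the precision at the rank of each positive."""
    # Rank images from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            # Precision among the top `rank` images.
            precisions.append(hits / rank)
    # With no positive images the metric is undefined; return 0.0 by convention here.
    return sum(precisions) / len(precisions) if precisions else 0.0


# Four images: two defective (label 1), two defect-free (label 0).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(round(average_precision(labels, scores), 4))  # 0.8333
```

Tied scores are ranked in input order in this sketch, so results with many ties depend on the ordering of the inputs.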
  • KSDD2 splits Question

    Hello, the number of positive samples is 246 in all five splits. Is something wrong? I assumed that the number in the split name represents the number of positive samples. Did I misunderstand something?

    opened by smiler96 3
  • generate models by training on STEEL data

    Hi, can you please tell me which parameters to use to train the model to reach the AP of 99.99% mentioned in the paper? I trained on the STEEL dataset with the parameters below but reached an AP of only 75%. Could you also tell me how the class information is given to the model for this dataset?

    python -u train_net.py  \
            --GPU=0 \
            --DATASET=STEEL \
            --RUN_NAME=steel_server_1 \
            --DATASET_PATH=datasets/STEEL \
            --RESULTS_PATH=datasets/steel_result_1 \
            --SAVE_IMAGES=True \
            --DILATE=7 \
            --EPOCHS=50 \
            --VALIDATION_N_EPOCHS=5 \
            --LEARNING_RATE=0.1 \
            --DELTA_CLS_LOSS=0.1 \
            --BATCH_SIZE=500 \
            --WEIGHTED_SEG_LOSS=True \
            --WEIGHTED_SEG_LOSS_P=2 \
            --WEIGHTED_SEG_LOSS_MAX=1 \
            --DYN_BALANCED_LOSS=True \
            --GRADIENT_ADJUSTMENT=True \
            --FREQUENCY_SAMPLING=True \
            --TRAIN_NUM=3000 \
            --NUM_SEGMENTED=3000 \
            --FOLD=0
    

    Thanks

    opened by rashi-b 1
  • Question about End2End.py

    Thank you so much for sharing the scripts.

    In line 129 of "End2End.py", "training_iteration()" returns "total_loss_seg + total_loss_dec" instead of "total_loss". Is it supposed to mean something?

    opened by smiura-ai 1
  • Question about Total_correct

    end2end.py, line 113:

    total_correct += (decision > 0.5).item() == is_pos_.item()

    decision is the output of the model (fc layer), so it is not a probability. Is a call to sigmoid() missing?

    opened by Youskrpig 1
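
If decision is indeed a raw logit, as the question assumes, the relationship between the two thresholds can be checked numerically. The sketch below uses only the standard library; the helper names sigmoid and logit are illustrative. It shows that a probability threshold of 0.5 corresponds to a logit threshold of 0, while comparing a logit against 0.5 corresponds to a probability threshold of roughly 0.62.

```python
import math


def sigmoid(x):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of sigmoid: map a probability to a logit."""
    return math.log(p / (1.0 - p))


print(sigmoid(0.0))            # 0.5
print(logit(0.5))              # 0.0
print(round(sigmoid(0.5), 4))  # 0.6225
```

Under that assumption, either sigmoid(decision) > 0.5 or decision > 0 gives the conventional decision boundary. Whether the repository applies a sigmoid elsewhere before this line is not visible from the question.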
  • Modified the distance transform function to fix the problem of ignoring small defects.

    Code

    import cv2
    import numpy as np
    from scipy.ndimage import distance_transform_edt
    import matplotlib.pyplot as plt
    
    def distance_transform(mask: np.ndarray, max_val: float, p: float) -> np.ndarray:
        # Original version: distances are normalized by the global maximum,
        # so a small defect next to a large one receives very small weights.
        dst_trf = distance_transform_edt(mask)
        if dst_trf.max() > 0:
            dst_trf = dst_trf / dst_trf.max()
            dst_trf = (dst_trf ** p) * max_val
        dst_trf[mask == 0] = 1
        return np.array(dst_trf, dtype=np.float32)
    
    def distance_transform_new(mask: np.ndarray, max_val: float, p: float) -> np.ndarray:
        # Modified version: each connected component is normalized separately,
        # so every defect reaches max_val at its own center.
        h, w = mask.shape[:2]
        dst_trf = np.zeros((h, w))
        # cv2.connectedComponents expects an 8-bit single-channel mask.
        num_labels, labels = cv2.connectedComponents(mask, connectivity=8)
        for idx in range(1, num_labels):  # label 0 is the background
            mask_roi = np.zeros((h, w))
            mask_roi[labels == idx] = 255
            dst_trf_roi = distance_transform_edt(mask_roi)
            if dst_trf_roi.max() > 0:
                dst_trf_roi = dst_trf_roi / dst_trf_roi.max()
                dst_trf_roi = (dst_trf_roi ** p) * max_val
            dst_trf += dst_trf_roi
    
        dst_trf[mask == 0] = 1
        return np.array(dst_trf, dtype=np.float32)
    
    image_name = './KSDD2/train/20922.png'
    mask_name = './KSDD2/train/20922_GT.png'
    
    img = cv2.imread(image_name)
    img = cv2.resize(img, dsize=(224, 600))
    
    lbl = cv2.imread(mask_name, cv2.IMREAD_GRAYSCALE)
    lbl = cv2.resize(lbl, dsize=(224, 600))
    
    dilate_lbl = cv2.dilate(lbl, np.ones((7, 7)))
    
    distance_transform_lbl = distance_transform(dilate_lbl, max_val=3.0, p=2)
    distance_transform_lbl_new = distance_transform_new(dilate_lbl, max_val=3.0, p=2)
    
    plt.figure(figsize=(10, 6))
    plt.suptitle("KSDD2 train 20922.png and 20922_GT.png")
    plt.subplot(151)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.title("img")
    plt.subplot(152)
    plt.imshow(lbl, cmap='gray')
    plt.title("lbl")
    plt.subplot(153)
    plt.imshow(dilate_lbl, cmap='gray')
    plt.title("dilate7_lbl")
    plt.subplot(154)
    plt.imshow(distance_transform_lbl, cmap='gray')
    plt.title("DT_lbl")
    plt.subplot(155)
    plt.imshow(distance_transform_lbl_new, cmap='gray')
    plt.title("DT_lbl_new")
    plt.tight_layout()
    plt.show()
    
    
    opened by Yangliuly1 0
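
The effect described in this comment can be illustrated on a one-dimensional toy mask without OpenCV or SciPy. This is a simplified analog, not the repository's code: all function names are hypothetical, distances are measured along a single axis, and the mask is assumed to contain at least one background cell. With global normalization the one-pixel defect gets a weight of 0.1875, while per-component normalization gives it the full max_val of 3.0.

```python
def dist_to_background_1d(mask):
    """Distance from each foreground cell to the nearest background (0) cell."""
    zeros = [i for i, v in enumerate(mask) if v == 0]
    return [0 if v == 0 else min(abs(i - z) for z in zeros) for i, v in enumerate(mask)]


def components_1d(mask):
    """Return index lists for each run of consecutive foreground cells."""
    runs, current = [], []
    for i, v in enumerate(mask):
        if v:
            current.append(i)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def weights_global(mask, max_val, p):
    """Normalize by the largest distance in the whole mask."""
    d = dist_to_background_1d(mask)
    peak = max(d)
    # Background cells keep weight 1, as in the functions above.
    return [1.0 if m == 0 else ((v / peak) ** p) * max_val for m, v in zip(mask, d)]


def weights_per_component(mask, max_val, p):
    """Normalize each foreground run by its own largest distance."""
    d = dist_to_background_1d(mask)
    w = [1.0] * len(mask)  # background weight
    for run in components_1d(mask):
        peak = max(d[i] for i in run)
        for i in run:
            w[i] = ((d[i] / peak) ** p) * max_val
    return w


# One wide defect (indices 1-7) and one single-pixel defect (index 10).
mask = [0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0]
print(weights_global(mask, 3.0, 2)[10])         # 0.1875
print(weights_per_component(mask, 3.0, 2)[10])  # 3.0
```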
  • Modified config.py to fix variable name.

    # if args.ON_DEMAND_READ is not None: self.TRAIN_NUM = args.ON_DEMAND_READ
    if args.ON_DEMAND_READ is not None: self.ON_DEMAND_READ = args.ON_DEMAND_READ
    # if args.REPRODUCIBLE_RUN is not None: self.TRAIN_NUM= args.REPRODUCIBLE_RUN
    if args.REPRODUCIBLE_RUN is not None: self.REPRODUCIBLE_RUN = args.REPRODUCIBLE_RUN
    # if args.MEMORY_FIT is not None: self.TRAIN_NUM = args.MEMORY_FIT
    if args.MEMORY_FIT is not None: self.MEMORY_FIT = args.MEMORY_FIT

    opened by Yangliuly1 0
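
Copy-paste slips of this kind, where several arguments end up assigned to the same attribute, can be avoided by copying arguments by name in a loop. The sketch below is one possible pattern, not the repository's config.py: the Config class and merge_from_args method are hypothetical stand-ins, and the default values are taken from the parameter dump shown in another comment on this page.

```python
from argparse import Namespace


class Config:
    """Hypothetical stand-in for the repository's config object."""
    TRAIN_NUM = None
    ON_DEMAND_READ = False
    REPRODUCIBLE_RUN = False
    MEMORY_FIT = 1

    def merge_from_args(self, args, names):
        # Copy each argument onto the attribute of the same name,
        # skipping arguments that were not supplied (None).
        for name in names:
            value = getattr(args, name, None)
            if value is not None:
                setattr(self, name, value)


cfg = Config()
args = Namespace(ON_DEMAND_READ=True, REPRODUCIBLE_RUN=None, MEMORY_FIT=4)
cfg.merge_from_args(args, ["ON_DEMAND_READ", "REPRODUCIBLE_RUN", "MEMORY_FIT"])
print(cfg.TRAIN_NUM, cfg.ON_DEMAND_READ, cfg.REPRODUCIBLE_RUN, cfg.MEMORY_FIT)
# prints: None True False 4
```

Because the attribute name and the argument name come from the same string, TRAIN_NUM can no longer be overwritten by an unrelated argument.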
  • How to change the binary classification to multi-classification?

    The KSDD datasets are binary classification datasets. If every image in my dataset shows an object with two or more defects, how can I modify the algorithm for multi-class classification? I'd appreciate your help. Thanks!

    opened by wang-gang 2
  • how do I get image results by running demo_eval_single_image.py

    Thanks for your great work!

    I was wondering whether it is possible to get image results from running demo_eval_single_image.py. It seems that only a score is available as a result. I need the input image with the defects localized.

    opened by Jihyun0510 1
  • index 266 is out of bounds for axis 0 with size 264

    I ran into problems when using my own dataset. How can I annotate multi-label images? Is it possible to use pseudo-color annotation like Paddle? https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.5/docs/data/marker/marker.md

    Directory structure

    COCO

    --IMAGE

    --LABEL

    D:\mixed-segdec-net-comind2021-master\mixed-segdec-net-comind2021-master>python -u train_net.py  --GPU=1 --DATASET=COCO --RUN_NAME=COCO --DATASET_PATH=datasets/COCO --RESULTS_PATH=save/results --SAVE_IMAGES=True --DILATE=7 --EPOCHS=50 --LEARNING_RATE=1.0 --DELTA_CLS_LOSS=0.01 --BATCH_SIZE=1 --WEIGHTED_SEG_LOSS=True --WEIGHTED_SEG_LOSS_P=2 --WEIGHTED_SEG_LOSS_MAX=1 --DYN_BALANCED_LOSS=True --GRADIENT_ADJUSTMENT=True --FREQUENCY_SAMPLING=True  --NUM_SEGMENTED=0 --FOLD=2
    COCO Executing run with path save/results\COCO\COCO
    COCO BATCH_SIZE                : 1
    COCO DATASET                   : COCO
    COCO DATASET_PATH              : datasets/COCO
    COCO DELTA_CLS_LOSS            : 0.01
    COCO DILATE                    : 7
    COCO DYN_BALANCED_LOSS         : True
    COCO EPOCHS                    : 50
    COCO FOLD                      : 2
    COCO FREQUENCY_SAMPLING        : True
    COCO GPU                       : 1
    COCO GRADIENT_ADJUSTMENT       : True
    COCO INPUT_CHANNELS            : 1
    COCO INPUT_HEIGHT              : 400
    COCO INPUT_WIDTH               : 2448
    COCO LEARNING_RATE             : 1.0
    COCO MEMORY_FIT                : 1
    COCO NUM_SEGMENTED             : 0
    COCO ON_DEMAND_READ            : False
    COCO REPRODUCIBLE_RUN          : False
    COCO RESULTS_PATH              : save/results
    COCO SAVE_IMAGES               : True
    COCO TRAIN_NUM                 : None
    COCO USE_BEST_MODEL            : False
    COCO VALIDATE                  : True
    COCO VALIDATE_ON_TEST          : True
    COCO VALIDATION_N_EPOCHS       : 5
    COCO WEIGHTED_SEG_LOSS         : True
    COCO WEIGHTED_SEG_LOSS_MAX     : 1.0
    COCO WEIGHTED_SEG_LOSS_P       : 2.0
    datasets/COCO
    268
    264
    536
    datasets/COCO
    268
    264
    532
    536
    536
    COCO Saving current model state to save/results\COCO\COCO\models\ep_00.pth
    COCO Returning seg_loss_weight 1.0 and dec_loss_weight 0.0
    COCO Returning dec_gradient_multiplier 0
    Traceback (most recent call last):
      File "train_net.py", line 63, in <module>
        end2end.train()
      File "D:\mixed-segdec-net-comind2021-master\mixed-segdec-net-comind2021-master\end2end.py", line 60, in train
        train_results = self._train_model(device, model, train_loader, loss_seg, loss_dec, optimizer, validation_loader, tensorboard_writer)
      File "D:\mixed-segdec-net-comind2021-master\mixed-segdec-net-comind2021-master\end2end.py", line 160, in _train_model
        for iter_index, (data) in enumerate(train_loader):
      File "C:\ProgramData\Anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
        data = self._next_data()
      File "C:\ProgramData\Anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 557, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "C:\ProgramData\Anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\ProgramData\Anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "D:\mixed-segdec-net-comind2021-master\mixed-segdec-net-comind2021-master\data\dataset.py", line 49, in __getitem__
        ix = self.neg_imgs_permutation[ix]
    IndexError: index 266 is out of bounds for axis 0 with size 264
    
    

    input_coco.py

    class COCODataset(Dataset):
        def __init__(self, kind: str, cfg):
            super(COCODataset, self).__init__(cfg.DATASET_PATH, cfg, kind)
            self.read_contents()
    
        def read_contents(self):
            pos_samples, neg_samples = [], []
            print(self.path)
            img_num = 0
            for image_name in glob.glob(self.path + '/IMAGE/*.jpg'):
                img_num += 1
                image_path = image_name
                image = self.read_img_resize(image_path, self.grayscale, self.image_size)
                img_name_short = image_name[:-4]
                jpg_name = img_name_short.split('\\')[-1]
                seg_mask_path = os.path.join(self.path, "LABEL", f"{jpg_name}.jpg")
                if os.path.exists(seg_mask_path):
                    seg_mask, _ = self.read_label_resize(seg_mask_path, self.image_size, dilate=self.cfg.DILATE)
                    image = self.to_tensor(image)
                    seg_loss_mask = self.distance_transform(seg_mask, self.cfg.WEIGHTED_SEG_LOSS_MAX,
                                                            self.cfg.WEIGHTED_SEG_LOSS_P)
                    seg_mask = self.to_tensor(self.downsize(seg_mask))
                    seg_loss_mask = self.to_tensor(self.downsize(seg_loss_mask))
                    pos_samples.append((image, seg_mask, seg_loss_mask, True, image_path, None, img_name_short))
                else:
                    seg_mask = np.zeros_like(image)
                    image = self.to_tensor(image)
                    seg_loss_mask = self.to_tensor(self.downsize(np.ones_like(seg_mask)))
                    seg_mask = self.to_tensor(self.downsize(seg_mask))
                    neg_samples.append((image, seg_mask, seg_loss_mask, True, image_path, seg_mask_path, img_name_short))
            self.pos_samples = pos_samples
            self.neg_samples = neg_samples
            self.num_pos = len(pos_samples)
            print(len(pos_samples))
            self.num_neg = len(neg_samples)
            print(len(neg_samples))
            self.len = 2 * len(pos_samples) if self.kind in ['TRAIN'] else len(pos_samples) + len(neg_samples)
            print(self.len)
            self.init_extra()
    
    opened by x12901 0
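
One plausible reading of the log above: the dataset reports 268 positive and 264 negative samples, and the training length is set to 2 * 268 = 536. If the base class derives the negative index from a value in the range [0, num_pos), which is an assumption since dataset.py is not shown here, then any value of 264 or more falls outside the negative permutation of size 264. The sketch below reproduces that situation with hypothetical helper names and shows a wrap-around guard as one possible workaround.

```python
def negative_lookup(index, num_pos, neg_permutation):
    """Assumed mapping from a training index to a negative-sample id."""
    ix = index % num_pos        # value in [0, num_pos)
    return neg_permutation[ix]  # raises IndexError when ix >= len(neg_permutation)


def safe_negative_lookup(index, num_pos, neg_permutation):
    """Same mapping, wrapped so it stays within the available negatives."""
    ix = (index % num_pos) % len(neg_permutation)
    return neg_permutation[ix]


num_pos, num_neg = 268, 264            # counts printed in the log
neg_permutation = list(range(num_neg)) # identity permutation, for a deterministic demo

try:
    negative_lookup(num_pos + 266, num_pos, neg_permutation)
except IndexError as err:
    print("IndexError:", err)

print(safe_negative_lookup(num_pos + 266, num_pos, neg_permutation))  # 2
```

Wrapping the index means a few negatives are drawn more often than others. Providing at least as many negative as positive training images would avoid the problem without changing the sampling.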
Owner
ViCoS Lab