Dogs classification with Deep Metric Learning using some popular losses

Overview

Tsinghua Dogs classification with
Deep Metric Learning

1. Introduction

Tsinghua Dogs dataset

Tsinghua Dogs is a fine-grained classification dataset for dogs, over 65% of whose images are collected from people's real life. Each dog breed in the dataset contains at least 200 images and a maximum of 7,449 images. For more info, see dataset's homepage.

Following is the brief information about the dataset:

  • Number of categories: 130
  • Number of training images: 65228
  • Number of validating images: 5200

Variation in Tsinghua Dogs dataset. (a) Great Danes exhibit large variations in appearance, while (b) Norwich terriers and (c) Australian terriers are quite similar to each other. (Source)

Deep metric learning

Deep metric learning (DML) aims to measure the similarity among samples by training a deep neural network and a distance metric such as Euclidean distance or Cosine distance. For fine-grained data, in which the intra-class variances are larger than inter-class variances, DML proves to be useful in classification tasks.

Goal

In this projects, I use deep metric learning to classify dog images in Tsinghua Dogs dataset. Those loss functions are implemented:

  1. Triplet loss
  2. Proxy-NCA loss
  3. Proxy-anchor loss: In progress
  4. Soft-triple loss: In progress

I also evaluate models' performance on some common metrics:

  1. Precision at k (P@K)
  2. Mean average precision (MAP)
  3. Top-k accuracy
  4. Normalized mutual information (NMI)


2. Benchmarks

  • Architecture: Resnet-50 for feature extractions.
  • Embedding size: 128.
  • Batch size: 48.
  • Number of epochs: 100.
  • Online hard negatives mining.
  • Augmentations:
    • Random horizontal flip.
    • Random brightness, contrast and saturation.
    • Random affine with rotation, scale and translation.
MAP P@1 P@5 P@10 Top-5 NMI Download
Triplet loss 73.85% 74.66% 73.90 73.00% 93.76% 0.82
Proxy-NCA loss 89.10% 90.26% 89.28% 87.76% 99.39% 0.98
Proxy-anchor loss
Soft-triple loss


3. Visualization

Proxy-NCA loss

Confusion matrix on validation set

T-SNE on validation set

Similarity matrix of some images in validation set

  • Each cell represent the L2 distance between 2 images.
  • The closer distance to 0 (blue), the more similar.
  • The larger distance (green), the more dissimilar.

Triplet loss

Confusion matrix on validation set

T-SNE on validation set

Similarity matrix of some images in validation set

  • Each cell represent the L2 distance between 2 images.
  • The closer distance to 0 (blue), the more similar.
  • The larger distance (green), the more dissimilar.



4. Train

4.1 Install dependencies

# Create conda environment
conda create --name dml python=3.7 pip
conda activate dml

# Install pytorch and torchvision
conda install -n dml pytorch torchvision cudatoolkit=10.2 -c pytorch

# Install faiss for indexing and calulcating accuracy
# https://github.com/facebookresearch/faiss
conda install -n dml faiss-gpu cudatoolkit=10.2 -c pytorch

# Install other dependencies
pip install opencv-python tensorboard torch-summary torch_optimizer scikit-learn matplotlib seaborn requests ipdb flake8 pyyaml

4.2 Prepare Tsinghua Dogs dataset

PYTHONPATH=./ python src/scripts/prepare_TsinghuaDogs.py --output_dir data/

Directory data should be like this:

data/
└── TsinghuaDogs
    ├── High-Annotations
    ├── high-resolution
    ├── TrainAndValList
    ├── train
    │   ├── 561-n000127-miniature_pinscher
    │   │   ├── n107028.jpg
    │   │   ├── n107031.jpg
    │   │   ├── ...
    │   │   └── n107218.jp
    │   ├── ...
    │   ├── 806-n000129-papillon
    │   │   ├── n107440.jpg
    │   │   ├── n107451.jpg
    │   │   ├── ...
    │   │   └── n108042.jpg
    └── val
        ├── 561-n000127-miniature_pinscher
        │   ├── n161176.jpg
        │   ├── n161177.jpg
        │   ├── ...
        │   └── n161702.jpe
        ├── ...
        └── 806-n000129-papillon
            ├── n169982.jpg
            ├── n170022.jpg
            ├── ...
            └── n170736.jpeg

4.3 Train model

  • Train with proxy-nca loss
CUDA_VISIBLE_DEVICES=0 PYTHONPATH=./ python src/main.py --train_dir data/TsinghuaDogs/train --test_dir data/TsinghuaDogs/val --loss proxy_nca --config src/configs/proxy_nca_loss.yaml --checkpoint_root_dir src/checkpoints/proxynca-resnet50
  • Train with triplet loss
CUDA_VISIBLE_DEVICES=0 PYTHONPATH=./ python src/main.py --train_dir data/TsinghuaDogs/train --test_dir data/TsinghuaDogs/val --loss tripletloss --config src/configs/triplet_loss.yaml --checkpoint_root_dir src/checkpoints/tripletloss-resnet50

Run PYTHONPATH=./ python src/main.py --help for more detail about arguments.

If you want to train on 2 gpus, replace CUDA_VISIBLE_DEVICES=0 with CUDA_VISIBLE_DEVICES=0,1 and so on.

If you encounter out of memory issues, try reducing classes_per_batch and samples_per_class in src/configs/triplet_loss.yaml or batch_size in src/configs/your-loss.yaml



5. Evaluate

To evaluate, directory data should be structured like this:

data/
└── TsinghuaDogs
    ├── train
    │   ├── 561-n000127-miniature_pinscher
    │   │   ├── n107028.jpg
    │   │   ├── n107031.jpg
    │   │   ├── ...
    │   │   └── n107218.jp
    │   ├── ...
    │   ├── 806-n000129-papillon
    │   │   ├── n107440.jpg
    │   │   ├── n107451.jpg
    │   │   ├── ...
    │   │   └── n108042.jpg
    └── val
        ├── 561-n000127-miniature_pinscher
        │   ├── n161176.jpg
        │   ├── n161177.jpg
        │   ├── ...
        │   └── n161702.jpe
        ├── ...
        └── 806-n000129-papillon
            ├── n169982.jpg
            ├── n170022.jpg
            ├── ...
            └── n170736.jpeg

Plot confusion matrix

PYTHONPATH=./ python src/scripts/visualize_confusion_matrix.py --test_images_dir data/TshinghuaDogs/val/ --reference_images_dir data/TshinghuaDogs/train -c src/checkpoints/proxynca-resnet50.pth

Plot T-SNE

PYTHONPATH=./ python src/scripts/visualize_tsne.py --images_dir data/TshinghuaDogs/val/ -c src/checkpoints/proxynca-resnet50.pth

Plot similarity matrix

PYTHONPATH=./ python src/scripts/visualize_similarity.py  --images_dir data/TshinghuaDogs/val/ -c src/checkpoints/proxynca-resnet50.pth


6. Developement

.
├── __init__.py
├── README.md
├── src
│   ├── main.py  # Entry point for training.
│   ├── checkpoints  # Directory to save model's weights while training
│   ├── configs  # Configurations for each loss function
│   │   ├── proxy_nca_loss.yaml
│   │   └── triplet_loss.yaml
│   ├── dataset.py
│   ├── evaluate.py  # Calculate mean average precision, accuracy and NMI score
│   ├── __init__.py
│   ├── logs
│   ├── losses
│   │   ├── __init__.py
│   │   ├── proxy_nca_loss.py
│   │   └── triplet_margin_loss.py
│   ├── models  # Feature extraction models
│   │   ├── __init__.py
│   │   └── resnet.py
│   ├── samplers
│   │   ├── __init__.py
│   │   └── pk_sampler.py  # Sample triplets in each batch for triplet loss
│   ├── scripts
│   │   ├── __init__.py
│   │   ├── prepare_TsinghuaDogs.py  # download and prepare dataset for training and validating
│   │   ├── visualize_confusion_matrix.py
│   │   ├── visualize_similarity.py
│   │   └── visualize_tsne.py
│   ├── trainer.py  # Helper functions for training
│   └── utils.py  # Some utility functions
└── static
    ├── proxynca-resnet50
    │   ├── confusion_matrix.jpg
    │   ├── similarity.jpg
    │   ├── tsne_images.jpg
    │   └── tsne_points.jpg
    └── tripletloss-resnet50
        ├── confusion_matrix.jpg
        ├── similarity.jpg
        ├── tsne_images.jpg
        └── tsne_points.jpg

7. Acknowledgement

@article{Zou2020ThuDogs,
    title={A new dataset of dog breed images and a benchmark for fine-grained classification},
    author={Zou, Ding-Nan and Zhang, Song-Hai and Mu, Tai-Jiang and Zhang, Min},
    journal={Computational Visual Media},
    year={2020},
    url={https://doi.org/10.1007/s41095-020-0184-6}
}
Comments
  • Evaluate model -  The system cannot find the path specified

    Evaluate model - The system cannot find the path specified

    @QuocThangNguyen Hello. I tried training and inference, they work fine (just training takes too long). Then I wanted to evaluate a trained model to see confusion matrix, T-SNE and similarity matrix, but I receive errors

    FileNotFoundError: [WinError 3] The system cannot find the path specified: 'data/TshinghuaDogs/train' FileNotFoundError: [WinError 3] The system cannot find the path specified: 'data/TshinghuaDogs/val'

    I followed the advice how to structure dataset for train and val folders, so I don't know why it can't find directories. Can you please help to figure out what is the issue and let me know if you need additional information from my side!

    image

    opened by HripsimeS 5
  • Tsinghua Dogs dataset preparation

    Tsinghua Dogs dataset preparation

    @QuocThangNguyen Hello. I am having a bit problem in running prepare_TsinghuaDogs.py script for dataset preparation and getting error IndexError: list index out of range. Just wanted to check with you if I am doing things correctly.

    So in data folder I opened TsinghuaDogs where I opened the following folders High-Annotations high-resolution TrainAndValList train val

    Is this a correct data directory structure before running a prepare_TsinghuaDogs.py ?

    image

    image

    opened by HripsimeS 2
  • checkpoints folder is empty

    checkpoints folder is empty

    @QuocThangNguyen First of all, thanks for the great work on fine-grained classification. I would like to test the repository train and inference, but unfortunatly I could not find proxynca-resnet50.pth pre-trained model in checkpoints folder. Can you please let me know where I can find and dowload it. Look forward to hearing frolm you!

    opened by HripsimeS 2
  • prepare_TsinghuaDogs not working

    prepare_TsinghuaDogs not working

    Hi,

    I'm interested in using your dataset. However, when prepare_TsinghuaDogs.py is unzipping the downloaded files, an exception occurs:

    Archive: data/TsinghuaDogs.zip End-of-central-directory signature not found. Either this file is not a zipfile, or it constitutes one disk of a multi-part archive. In the latter case the central directory and zipfile comment will be found on the last disk(s) of this archive. unzip: cannot find zipfile directory in one of data/TsinghuaDogs.zip or data/TsinghuaDogs.zip.zip, and cannot find data/TsinghuaDogs.zip.ZIP, period. 22-Jun-08 17:47:02 root INFO: Downloading https://cg.cs.tsinghua.edu.cn/ThuDogs/high-annotations.zip to data/TsinghuaDogs/high-annotations.zip

    Could you please fix it or tell me the proper way to use it?

    opened by Mo-Kanya 2
  • How can I convert .pth weight to tflite?

    How can I convert .pth weight to tflite?

    @QuocThangNguyen Hello!

    I would like to convert a pytorch (pth) trained model to TensorFlow Lite (tflite) format. Do you have some advice how to do it properly for your Fine-Grained classification models? I saw that in general the converting process has several steps, convert Pytorch to ONNX, ONNX to TensorFlow and then TensorFlow to TensorFlow Lite. But if you know a quick way to do it or maybe there is a special script that can do all these steps without errors, that would be great.

    Look forward to hearing from you!

    opened by HripsimeS 1
  • example code for pretrained model

    example code for pretrained model

    Dear QuocThangNguyen,

    Thank you for your great work about Tsinghua Dogs dataset, I really like it.

    In the main pape, you provide pretrained model, would you mind provide the usuage codes for the pretrained model? Did you do any preprocessing before fitting the image to the model?

    You mention about other benchmarks in project main page, but there is not pretrained model, so I do not know the pretrained model in github page is using which method.

    Thank you for your help.

    Best Wishes,

    Alex

    opened by betterze 1
  • output category

    output category

    Dear QuocThangNguyen,

    Thank you for your great work about Tsinghua Dogs dataset, I really like it.

    In the main pape, you provide pretrained model, this model only provide the image embedding for retrieval rather than output the dog category. Is there a way to get the dog category given an image?

    Thank you for your help.

    Best Wishes,

    Alex

    question 
    opened by betterze 1
Owner
QuocThangNguyen
Computer Vision Researcher
QuocThangNguyen
Code for paper ECCV 2020 paper: Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop.

Who Left the Dogs Out? Evaluation and demo code for our ECCV 2020 paper: Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization

Benjamin Biggs 29 Dec 28, 2022
ML model to classify between cats and dogs

Cats-and-dogs-classifier This is my first ML model which can classify between cats and dogs. Here the accuracy is around 75%, however , the accuracy c

Sharath V 4 Aug 20, 2021
Seach Losses of our paper 'Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search', accepted by ICLR 2021.

CSE-Autoloss Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models

Peidong Liu(刘沛东) 54 Dec 17, 2022
Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression YOLOv5 with alpha-IoU losses implemented in PyTorch. Example r

Jacobi(Jiabo He) 147 Dec 5, 2022
Hypersearch weight debugging and losses tutorial

tutorial Activate tensorboard option Running TensorBoard remotely When working on a remote server, you can use SSH tunneling to forward the port of th

null 1 Dec 11, 2021
Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic

NAVER/LINE Vision 30 Dec 6, 2022
Towards Interpretable Deep Metric Learning with Structural Matching

DIML Created by Wenliang Zhao*, Yongming Rao*, Ziyi Wang, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for paper Towards Interpr

Wenliang Zhao 75 Nov 11, 2022
[ICCV 2021] Official PyTorch implementation for Deep Relational Metric Learning.

Deep Relational Metric Learning This repository is the official PyTorch implementation of Deep Relational Metric Learning. Framework Datasets CUB-200-

Borui Zhang 39 Dec 10, 2022
GeDML is an easy-to-use generalized deep metric learning library

GeDML is an easy-to-use generalized deep metric learning library

Borui Zhang 32 Dec 5, 2022
A Pytorch implementation of "Manifold Matching via Deep Metric Learning for Generative Modeling" (ICCV 2021)

Manifold Matching via Deep Metric Learning for Generative Modeling A Pytorch implementation of "Manifold Matching via Deep Metric Learning for Generat

null 69 Dec 10, 2022
Unofficial implementation of Proxy Anchor Loss for Deep Metric Learning

Proxy Anchor Loss for Deep Metric Learning Unofficial pytorch, tensorflow and mxnet implementations of Proxy Anchor Loss for Deep Metric Learning. Not

Geonmo Gu 3 Jun 9, 2021
Near-Duplicate Video Retrieval with Deep Metric Learning

Near-Duplicate Video Retrieval with Deep Metric Learning This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retr

null 2 Jan 24, 2022
Paper: Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification

Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification T M Feroz Ali, Subhasis Chaudhuri, ICVGIP-20-21

T M Feroz Ali 3 Jun 17, 2022
Metric learning algorithms in Python

metric-learn: Metric Learning in Python metric-learn contains efficient Python implementations of several popular supervised and weakly-supervised met

null 1.3k Dec 28, 2022
Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning"

VANET Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning" Introduction This is the implementation of article VAN

EMDATA-AILAB 23 Dec 26, 2022
Official PyTorch Implementation of Embedding Transfer with Label Relaxation for Improved Metric Learning, CVPR 2021

Embedding Transfer with Label Relaxation for Improved Metric Learning Official PyTorch implementation of CVPR 2021 paper Embedding Transfer with Label

Sungyeon Kim 37 Dec 6, 2022
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

2D-TAN (Optimized) Introduction This is an optimized re-implementation repository for AAAI'2020 paper: Learning 2D Temporal Localization Networks for

Joya Chen 112 Dec 31, 2022
PyTorch implementation of some learning rate schedulers for deep learning researcher.

pytorch-lr-scheduler PyTorch implementation of some learning rate schedulers for deep learning researcher. Usage WarmupReduceLROnPlateauScheduler Visu

Soohwan Kim 59 Dec 8, 2022
Hl classification bc - A Network-Based High-Level Data Classification Algorithm Using Betweenness Centrality

A Network-Based High-Level Data Classification Algorithm Using Betweenness Centr

Esteban Vilca 3 Dec 1, 2022