The Pytorch code of "Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification", CVPR 2022 (Oral).

Overview

DeepBDC for few-shot learning

      

Introduction

In this repo, we provide the implementation of the following paper:
"Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification" [Project] [Paper].

In this paper, we propose deep Brownian Distance Covariance (DeepBDC) for few-shot classification. DeepBDC can effectively learn image representations by measuring, for the query and support images, the discrepancy between the joint distribution of their embedded features and product of the marginals. The core of DeepBDC is formulated as a modular and efficient layer, which can be flexibly inserted into deep networks, suitable not only for meta-learning framework based on episodic training, but also for the simple transfer learning (STL) framework of pretraining plus linear classifier.

If you find this repo helpful for your research, please consider citing our paper:

@inproceedings{DeepBDC-CVPR2022,
    title={Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification},
    author={Jiangtao Xie and Fei Long and Jiaming Lv and Qilong Wang and Peihua Li}, 
    booktitle={CVPR},
    year={2022}
 }

Few-shot classification Results

Experimental results on miniImageNet and CUB. We report average results with 2,000 randomly sampled episodes for both 1-shot and 5-shot evaluation. More details on the experiments can be seen in the paper.

miniImageNet

Method ResNet-12 Pre-trained models Meta-trained models
5-way-1-shot 5-way-5-shot GoogleDrive BaiduCloud GoogleDrive BaiduCloud
ProtoNet 62.11±0.44 80.77±0.30 Download Download Download Download
Good-Embed 64.98±0.44 82.10±0.30 Download Download N/A
Meta DeepBDC 67.34±0.43 84.46±0.28 Download Download Download Download
STL DeepBDC 67.83±0.43 85.45±0.29 Download Download N/A

Note that for Good-Embed and STL DeepBDC, a sequential self-distillation technique is used to obtain the pre-trained models; See the paper of Good-Embed for details.

CUB

Method ResNet-18 Pre-trained models Meta-trained models
5-way-1-shot 5-way-5-shot GoogleDrive BaiduCloud GoogleDrive BaiduCloud
ProtoNet 80.90±0.43 89.81±0.23 Download Download Download Download
Good-Embed 77.92±0.46 89.94±0.26 Download Download N/A
Meta DeepBDC 83.55±0.40 93.82±0.17 Download Download Download Download
STL DeepBDC 84.01±0.42 94.02±0.24 Download Download N/A

Note that for Good-Embed and STL DeepBDC, a sequential self-distillation technique is used to obtain the pre-trained models; See the paper of Good-Embed for details.

References

[BDC] G. J. Szekely and M. L. Rizzo. Brownian distance covariance. Annals of Applied Statistics, 3:1236–1265, 2009.
[ProtoNet] Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. In NIPS, 2017.
[Good-Embed] Y. Tian, Y. Wang, D. Krishnan, J. B. Tenenbaum, and P. Isola. Rethinking few-shot image classification: a good embedding is all you need? In ECCV, 2020.

Implementation details

Datasets

  • miniImageNet: We use the splits provided by Chen et al.
  • CUB: We use the splits provided by Chen et al.
  • tieredImageNet
  • Aircraft
  • Cars

Implementation environment

Note that the test accuracy may slightly vary with different Pytorch/CUDA versions, GPUs, etc.

  • Linux
  • Python 3.8.3
  • torch 1.7.1
  • GPU (RTX3090) + CUDA11.0 CuDNN
  • sklearn1.0.1, pillow8.0.0, numpy1.19.2

Installation

  • Clone this repo:
git clone https://github.com/Fei-Long121/DeepBDC.git
cd DeepBDC

For Meta DeepBDC on general object recognition

  1. cd scripts/mini_magenet/run_meta_deepbdc
  2. modify the dataset path in run_pretrain.sh, run_metatrain.sh and run_test.sh
  3. bash run.sh

For STL DeepBDC on general object recognition

  1. cd scripts/mini_imagenet/run_stl_deepbdc
  2. modify the dataset path in run_pretrain.sh, run_distillation.sh and run_test.sh
  3. bash run.sh

Acknowledgments

Our code builds upon the the following code publicly available:

Contact

If you have any questions or suggestions, please contact us:

Fei Long([email protected])
Jiaming Lv([email protected])

Issues
  • 与风格Gram matrix的关系

    与风格Gram matrix的关系

    以下是一些个人见解和疑惑,作者可以回答一下么: 我理解这工作核心,在于使用预训练模型的feature map(记为X),然后计算BDC来抽取图片的特征。而BDC就是X的列的欧氏距离经过一些归一化操作得到的。在神经风格中,使用Gram矩阵来描述图片的风格(记为G),G=(X^T) X。文章中公式(6)第一行也提到了A = G_ii + G_jj - 2*G_ij;公式第二行、第三行分别做了开根号,和移除行、列的均值的操作。 那么可以想象,如果两张图片的Gram矩阵很接近,BDC模块提取的特征也会非常接近,也就是说他们会被认为是同一类。而我们知道Gram矩阵是跟纹理、风格相关的,而跟内容关系不大。那是否可以理解为,最终抽取的特征其实是图片纹理或风格?

    opened by weifengchiu 7
  • Do you use a single checkpoint to evaluate arbitrary-shot episodes?

    Do you use a single checkpoint to evaluate arbitrary-shot episodes?

    Hello,

    Congratulations on your paper accepted as an oral presentation!

    I would like to ask a simple question on the implementation: To evaluate both 1-shot and 5-shot episodes on the test set, do you use one model checkpoint which had the best validation accuracy on 1-shot episodes?

    Thank you for releasing the code publicly available! Have a great day! 😃

    Best, Dahyun

    opened by dahyun-kang 2
  • Self-distillation

    Self-distillation

    Hi, thanks for publishing the code. It is really an interesting work!

    I have a question in terms of the self-distillation used in the pre-training stage. It looks like the teacher model is fixed in the beginning and the knowledge is distilled into a new student model over the iteration. So it is not really sequential (i.e. the previous student becomes the teacher in the next iteration)?

    opened by RongKaiWeskerMA 1
  • tiered_ImageNet的代码及超参数能提供么?

    tiered_ImageNet的代码及超参数能提供么?

    你好,代码中只提供了mini-imagenet 和 cu b的代码及超参数,您看能否提供tired-image的超参数脚本? tired-image的代码和其他两个数据集一样么?
    这里的代码主要是数据集相关的代码

    在请教下 cub 和tiered_ImageNet 在STLBDC中的预训练阶段batch size 都是64么?

    opened by xue19890510 3
Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

nvdiffrec Joint optimization of topology, materials and lighting from multi-view image observations as described in the paper Extracting Triangular 3D

NVIDIA Research Projects 1.1k Aug 9, 2022
Code for "Neural 3D Scene Reconstruction with the Manhattan-world Assumption" CVPR 2022 Oral

News 05/10/2022 To make the comparison on ScanNet easier, we provide all quantitative and qualitative results of baselines here, including COLMAP, COL

ZJU3DV 313 Aug 12, 2022
[CVPR 2022 Oral] Rethinking Minimal Sufficient Representation in Contrastive Learning

Rethinking Minimal Sufficient Representation in Contrastive Learning PyTorch implementation of Rethinking Minimal Sufficient Representation in Contras

null 31 Aug 5, 2022
(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry Official implementation of the paper Multi-View Depth Est

Bae, Gwangbin 74 Aug 6, 2022
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers

TubeDETR: Spatio-Temporal Video Grounding with Transformers Website • STVG Demo • Paper This repository provides the code for our paper. This includes

Antoine Yang 73 Aug 8, 2022
[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning

Crafting Better Contrastive Views for Siamese Representation Learning (CVPR 2022 Oral) 2022-03-29: The paper was selected as a CVPR 2022 Oral paper! 2

null 220 Jul 31, 2022
(CVPR 2022 Oral) Official implementation for "Surface Representation for Point Clouds"

RepSurf - Surface Representation for Point Clouds [CVPR 2022 Oral] By Haoxi Ran* , Jun Liu, Chengjie Wang ( * : corresponding contact) The pytorch off

Haoxi Ran 130 Jul 21, 2022
[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

EPro-PnP EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation In CVPR 2022 (Oral). [paper] Hanshen

 同济大学智能汽车研究所综合感知研究组 ( Comprehensive Perception Research Group under Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University) 678 Aug 5, 2022
[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception

Versatile Multi-Modal Pre-Training for Human-Centric Perception Fangzhou Hong1  Liang Pan1  Zhongang Cai1,2,3  Ziwei Liu1* 1S-Lab, Nanyang Technologic

Fangzhou Hong 70 Aug 3, 2022
Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL)

Scribble-Supervised LiDAR Semantic Segmentation Dataset and code release for the paper Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORA

null 59 Aug 2, 2022
Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds (CVPR 2022, Oral)

Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds (CVPR 2022, Oral) This is the official implementat

Yifan Zhang 164 Aug 6, 2022
[CVPR 2022] PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision (Oral)

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision Kehong Gong*, Bingbing Li*, Jianfeng Zhang*, Ta

null 204 Aug 11, 2022
[CVPR 2022 Oral] MixFormer: End-to-End Tracking with Iterative Mixed Attention

MixFormer The official implementation of the CVPR 2022 paper MixFormer: End-to-End Tracking with Iterative Mixed Attention [Models and Raw results] (G

Multimedia Computing Group, Nanjing University 193 Aug 2, 2022
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral

Temporally Efficient Vision Transformer for Video Instance Segmentation Temporally Efficient Vision Transformer for Video Instance Segmentation (CVPR

Hust Visual Learning Team 183 Aug 5, 2022
Official MegEngine implementation of CREStereo(CVPR 2022 Oral).

[CVPR 2022] Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation This repository contains MegEngine implementation of ou

MEGVII Research 242 Aug 12, 2022
Focal Sparse Convolutional Networks for 3D Object Detection (CVPR 2022, Oral)

Focal Sparse Convolutional Networks for 3D Object Detection (CVPR 2022, Oral) This is the official implementation of Focals Conv (CVPR 2022), a new sp

DV Lab 232 Aug 12, 2022
Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

HackED 2022 Team 3IQ - 2022 Imposter Detector By Aneeljyot Alagh, Curtis Kan, Jo

Joshua Ji 4 Jan 27, 2022
The 7th edition of NTIRE: New Trends in Image Restoration and Enhancement workshop will be held on June 2022 in conjunction with CVPR 2022.

NTIRE 2022 - Image Inpainting Challenge Important dates 2022.02.01: Release of train data (input and output images) and validation data (only input) 2

Andrés Romero 28 Jul 20, 2022
A PyTorch implementation of ICLR 2022 Oral paper PiCO: Contrastive Label Disambiguation for Partial Label Learning

PiCO: Contrastive Label Disambiguation for Partial Label Learning This is a PyTorch implementation of ICLR 2022 Oral paper PiCO; also see our Project

王皓波 83 May 11, 2022