# Balanced MSE

Code for the paper:

**Balanced MSE for Imbalanced Visual Regression**

Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu

CVPR 2022 (Oral)

## Live Demo

Check out our live demo in the Hugging Face 🤗 space!

## Tutorial

We provide a minimal working example of Balanced MSE using the BMC implementation on the small-scale Boston Housing dataset.

The notebook is developed on top of the Deep Imbalanced Regression (DIR) Tutorial; we thank the authors for their amazing tutorial!

## Quick Preview

A code snippet of the Balanced MSE loss is shown below. We use the BMC implementation for demonstration; BMC does not require any label prior beforehand.

### One-dimensional Balanced MSE

```python
import torch
import torch.nn.functional as F

def bmc_loss(pred, target, noise_var):
    """Compute the Balanced MSE Loss (BMC) between `pred` and the ground truth `target`.

    Args:
        pred: A float tensor of size [batch, 1].
        target: A float tensor of size [batch, 1].
        noise_var: A float number or tensor.
    Returns:
        loss: A float tensor. Balanced MSE Loss.
    """
    logits = - (pred - target.T).pow(2) / (2 * noise_var)  # logit size: [batch, batch]
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))  # contrastive-like loss
    loss = loss * (2 * noise_var).detach()  # optional: restore the loss scale; 'detach' when noise is learnable
    return loss
```
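As a quick sanity check (our own illustrative snippet, not from the repository): the loss reduces to a scalar, and because the formulation contrasts each prediction against the whole batch, it is identically zero when the batch size is 1.

```python
import torch
import torch.nn.functional as F

def bmc_loss(pred, target, noise_var):
    # Same BMC loss as above.
    logits = - (pred - target.T).pow(2) / (2 * noise_var)
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))
    return loss * (2 * noise_var).detach()

noise_var = torch.tensor(1.0)

# A normal batch yields a finite scalar loss.
pred, target = torch.randn(16, 1), torch.randn(16, 1)
print(bmc_loss(pred, target, noise_var).shape)  # torch.Size([])

# With batch size 1, the cross-entropy is taken over a single logit, so the loss is 0.
print(bmc_loss(torch.randn(1, 1), torch.randn(1, 1), noise_var))  # tensor(0.)
```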

`noise_var` is a one-dimensional hyper-parameter that can optionally be optimized during training:

```python
import torch
from torch.nn.modules.loss import _Loss

class BMCLoss(_Loss):
    def __init__(self, init_noise_sigma):
        super(BMCLoss, self).__init__()
        self.noise_sigma = torch.nn.Parameter(torch.tensor(init_noise_sigma))

    def forward(self, pred, target):
        noise_var = self.noise_sigma ** 2
        return bmc_loss(pred, target, noise_var)

criterion = BMCLoss(init_noise_sigma)
optimizer.add_param_group({'params': criterion.noise_sigma, 'lr': sigma_lr, 'name': 'noise_sigma'})
```
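For context, a minimal end-to-end training step might look like the sketch below. The toy model, data, and learning rates are our own illustrative choices, not from the repository:

```python
import torch
import torch.nn.functional as F
from torch.nn.modules.loss import _Loss

def bmc_loss(pred, target, noise_var):
    logits = - (pred - target.T).pow(2) / (2 * noise_var)
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))
    return loss * (2 * noise_var).detach()

class BMCLoss(_Loss):
    def __init__(self, init_noise_sigma):
        super(BMCLoss, self).__init__()
        self.noise_sigma = torch.nn.Parameter(torch.tensor(init_noise_sigma))

    def forward(self, pred, target):
        return bmc_loss(pred, target, self.noise_sigma ** 2)

# Toy regressor and synthetic data (illustrative only).
model = torch.nn.Linear(4, 1)
criterion = BMCLoss(init_noise_sigma=1.0)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
# Register the learnable noise scale with its own learning rate.
optimizer.add_param_group({'params': criterion.noise_sigma, 'lr': 1e-2, 'name': 'noise_sigma'})

x, y = torch.randn(16, 4), torch.randn(16, 1)
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()   # gradients flow into both the model and noise_sigma
optimizer.step()
```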

### Multi-dimensional Balanced MSE

The multi-dimensional implementation is compatible with the 1-D version.

```python
import torch
import torch.nn.functional as F
from torch.distributions import MultivariateNormal as MVN

def bmc_loss_md(pred, target, noise_var):
    """Compute the Multidimensional Balanced MSE Loss (BMC) between `pred` and the ground truth `target`.

    Args:
        pred: A float tensor of size [batch, d].
        target: A float tensor of size [batch, d].
        noise_var: A float number or tensor.
    Returns:
        loss: A float tensor. Balanced MSE Loss.
    """
    I = torch.eye(pred.shape[-1])
    logits = MVN(pred.unsqueeze(1), noise_var * I).log_prob(target.unsqueeze(0))  # logit size: [batch, batch]
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))  # contrastive-like loss
    loss = loss * (2 * noise_var).detach()  # optional: restore the loss scale; 'detach' when noise is learnable
    return loss
```

`noise_var` remains a one-dimensional hyper-parameter and can optionally be learned during training.
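To illustrate the compatibility claim (our own sketch, not from the repository): for d = 1 the multivariate log-probability differs from the 1-D logits only by a constant per row, which the softmax inside the cross-entropy cancels, so the two losses agree numerically.

```python
import torch
import torch.nn.functional as F
from torch.distributions import MultivariateNormal as MVN

def bmc_loss(pred, target, noise_var):
    # 1-D BMC loss from the snippet above.
    logits = - (pred - target.T).pow(2) / (2 * noise_var)
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))
    return loss * (2 * noise_var).detach()

def bmc_loss_md(pred, target, noise_var):
    # Multi-dimensional BMC loss from the snippet above.
    I = torch.eye(pred.shape[-1])
    logits = MVN(pred.unsqueeze(1), noise_var * I).log_prob(target.unsqueeze(0))
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))
    return loss * (2 * noise_var).detach()

noise_var = torch.tensor(0.5)
pred, target = torch.randn(8, 1), torch.randn(8, 1)
# The two implementations coincide for d = 1.
print(torch.allclose(bmc_loss(pred, target, noise_var),
                     bmc_loss_md(pred, target, noise_var), atol=1e-5))  # True
```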

## Run Experiments

Please go into the corresponding sub-folder to run each experiment.

## Citation

```bibtex
@inproceedings{ren2021bmse,
  title={Balanced MSE for Imbalanced Visual Regression},
  author={Ren, Jiawei and Zhang, Mingyuan and Yu, Cunjun and Liu, Ziwei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}
```

## Acknowledgment

This work is supported by NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), the National Research Foundation, Singapore under its AI Singapore Programme, and under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).

The code is developed on top of Delving into Deep Imbalanced Regression.

## Issues

#### BMCLossMD loss (multi-dimensional)

Thank you for sharing. I am referring to `BalancedMSE/synthetic_benchmark/loss.py` and modified the loss function for a distributed regression run, but the results after training are very poor. Did I modify the code by mistake?

opened by lijain · 6

```python
def bmc_loss_md(pred, target, noise_var):
    I = torch.eye(pred.shape[-1])
    logits = MVN(pred.unsqueeze(1), noise_var * I).log_prob(target.unsqueeze(0))
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))
    loss = loss * (2 * noise_var).detach()
    return loss
```

My `pred` and `target` have size [30, 3, 256, 256]. When running the line `loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))`, I get an error because `torch.arange(pred.shape[0])` is 1-D while `logits` is 4-D. How can I solve this error?

opened by SongYxing · 3

#### err

```python
def bmc_loss_md(pred, target, noise_var):
    I = torch.eye(pred.shape[-1]).cuda()
    logits = MVN(pred.unsqueeze(1), noise_var * I)
    logits = logits.log_prob(target.unsqueeze(0))
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))
    loss = loss * (2 * noise_var).detach()
    return loss
```

I am using this for the SSR-Net age regression task. The line `loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))` raises: `IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)`. Isn't a regression task one-dimensional?

opened by sssssshf · 3

#### Does BMCLoss only work for minibatch > 1?

The BMC loss equals 0.0 when I add it to my project with minibatch == 1. I noticed that the loss treats the regression predictions in a batch as a multiclass classification problem, but since each of my samples has a different size, my minibatch can only be set to 1. Is it infeasible to apply the BMC loss in this case?

opened by ccxsun · 2

#### How to apply Balanced MSE to a d-dimensional regression task?

@jiawei-ren, how should I apply Balanced MSE to a D-dimensional regression task? I checked the tutorial code, but the example is written for one-dimensional regression.

```python
def bmc_loss(pred, target, noise_var):
    logits = - (pred - target.T).pow(2) / (2 * noise_var)
    loss = F.cross_entropy(logits, torch.arange(pred.shape[0]))
    loss = loss * (2 * noise_var).detach()  # optional: restore the loss scale
    return loss
```

For example, `pred` has shape (N, D) and `target` is also (N, D), where N is the batch size and D is the regression dimension. Following the above example, `F.cross_entropy` raises an error.

opened by ChengBinJin · 2

#### Bad performance on train set

Hi. I applied Balanced MSE to a single-dimensional regression task and found that the performance on the training set is strange: the predictions for relatively low-value data points are very high. Overall, the performance on the training set is poor. Is this normal? Looking forward to your reply!

opened by Witiy · 1

#### The shape of logits

The shape of `logits` in the BMC loss is [batch, batch], which means both `pred` and `target` have size [batch, 1]. For dense prediction, do we have to flatten the 2-D image to a 1-D vector with an `unsqueeze(-1)` operation?

opened by DIVE128 · 1

```python
gmm = {k: gmm[k].reshape(1, -1).expand(pred.shape[0], -1) for k in gmm}
mse_term = F.mse_loss(pred, target, reduction='none') / 2 / noise_var + 0.5 * noise_var.log()
sum_var = gmm['variances'] + noise_var
balancing_term = - 0.5 * sum_var.log() - 0.5 * (pred - gmm['means']).pow(2) / sum_var + gmm['weights'].log()
balancing_term = torch.logsumexp(balancing_term, dim=-1, keepdim=True)
```

When I use the GAI loss, my `pred` has size [128], but the GMM data is [128, 8] after `expand`. How can I solve this? Looking forward to your reply.

opened by luxianlebron · 2