Learning Energy-Based Models by Diffusion Recovery Likelihood

Overview


Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma

Paper: https://arxiv.org/pdf/2012.08125

[Figure: samples generated by our model]

Requirements

Experiments can be run on a single GPU or a Google Cloud TPU v3-8. Requires Python >= 3.5. To install dependencies:

pip install -r requirements.txt

To compute FID/Inception scores, download the pre-computed dataset statistics from https://drive.google.com/file/d/1QOLyYHESflcdZu8CsBLZohZzC95HyukK/view?usp=sharing, unzip the file, and put the resulting folder in the root of this repo.

Train with 1 GPU

CIFAR-10:
python main.py --num_res_blocks=8 --n_batch_train=256

CelebA:
python main.py --problem=celeba --num_res_blocks=6 --beta_1=0.5 --n_batch_train=128

LSUN church_outdoor 64x64 / LSUN bedroom 64x64:
python main.py --problem=[lsun_church64/lsun_bedroom64] --n_batch_train=128

LSUN church_outdoor 128x128:
python main.py --problem=lsun_church128 --beta_1=0.5

LSUN bedroom 128x128:
python main.py --problem=lsun_bedroom128 --beta_1=0.5 --num_res_blocks=5

Compute full FID/IS scores after training on CIFAR-10:
python main.py --eval --num_res_blocks=8 --noise_scale=0.99 --fid_n_batch=2000

For faster training, reduce the value of num_res_blocks.
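For example, a quick smoke-test run might look like the following (illustrative values, not a recommended configuration; these flags all appear in config.py):

python main.py --num_res_blocks=2 --n_batch_train=64 --n_iters=1000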

Train with Google Cloud TPU

Add --tpu=True to the single-GPU commands above. You also need to set --tpu_name and --tpu_zone to the values shown in your Google Cloud console.
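For example (the TPU name and zone below are placeholders; substitute your own):

python main.py --tpu=True --tpu_name=my-tpu --tpu_zone=us-central1-a --num_res_blocks=8 --n_batch_train=256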

Pretrained models

https://drive.google.com/file/d/1eneA6T5jQIyVFLFSOrSfJvDeUJJMh9xk/view?usp=sharing

This checkpoint is for the T6 setting (num_diffusion_timesteps=6). The T1k setting will be uploaded soon!

Citation

If you find our work helpful to your research, please cite:

@article{gao2020learning,
  title={Learning Energy-Based Models by Diffusion Recovery Likelihood},
  author={Gao, Ruiqi and Song, Yang and Poole, Ben and Wu, Ying Nian and Kingma, Diederik P},
  journal={arXiv preprint arXiv:2012.08125},
  year={2020}
}
Comments
  • How long does it take to train on CIFAR10 using a single GPU/TPU?


Hi, this is a really excellent paper and codebase! I ran it and found that it takes about 1 hour (time=4258.89s) to train 500 iterations on a single GPU with num_res_blocks=5, n_batch_train=256.

In the paper, you say you train CIFAR-10 for 240k iterations (final version) or 50k (under-review version), so I wonder what the total training time is for CIFAR-10. Thanks so much!
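(Extrapolating linearly from the rate reported above, 4258.89 s per 500 iterations: 50k iterations would take about 100 x 4258.89 s ≈ 4.9 days, and 240k iterations about 480 x 4258.89 s ≈ 23.7 days on the same single GPU.)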

    opened by sndnyang 1
  • Can you explain why the energy is divided by b0?


    In the following code for computing the (unnormalized) log probability, the network output is divided by b0.

    https://github.com/ruiqigao/recovery_likelihood/blob/c77cc0511dedcb8d9ab928438d80acb62aeca96f/model.py#L154

    I wonder if there is a legitimate explanation for this division.

b0 is supposed to be the squared step size (step_size_square), which usually takes a very small value: https://github.com/ruiqigao/recovery_likelihood/blob/c77cc0511dedcb8d9ab928438d80acb62aeca96f/model.py#L184

I wonder whether dividing by b0 makes the gradients too large and harms training in some settings.
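For context, here is a minimal sketch of the conditional (recovery) log-density under discussion, following the paper's formulation log p(x | x_tilde) = f(x) - ||x - x_tilde||^2 / (2 sigma^2) + const; the function and argument names are illustrative, not the repo's exact API:

```python
import tensorflow as tf

def neg_log_p_conditional(net, x, x_tilde, sigma_sq, b0):
    # net(x) is the scalar energy f(x) per image.
    # In model.py the network output is additionally divided by b0
    # (mcmc_step_size_b_square), rescaling f relative to the quadratic term.
    f = net(x) / b0  # the division in question
    quad = tf.reduce_sum(tf.square(x - x_tilde), axis=[1, 2, 3]) / (2.0 * sigma_sq)
    return quad - f  # negative (unnormalized) conditional log-density
```

Since b0 is small (default 2e-4 in config.py), dividing by it scales both f and its input gradient up by 1/b0, which may be what the large-gradient concern above is pointing at.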

    opened by swyoon 0
  • How to train with multiple GPUs?


Hello! I tried to train the model with multiple GPUs. I found that you have released train_distributed.py, so I tried to use tf.distribute.MirroredStrategy() as the strategy to achieve distributed training, but I got the following error:

     RuntimeError: `merge_call` called while defining a new graph or a tf.function. This can often happen if the function `fn` passed to `strategy.experimental_run()` is decorated with `@tf.function` (or contains a nested `@tf.function`), and `fn` contains a synchronization point, such as aggregating gradients. This behavior is not yet supported. Instead, please wrap the entire call `strategy.experimental_run(fn)` in a `@tf.function`, and avoid nested `tf.function`s that may potentially cross a synchronization boundary.
    

    Looking forward to your help!!
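A minimal sketch of the workaround the error message itself suggests: wrap the entire distributed call in a single @tf.function rather than decorating the inner per-replica step. The model and loss below are placeholders, not the repository's actual training code:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # Placeholder model/optimizer; the real code would build the EBM here.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    optimizer = tf.keras.optimizers.Adam(1e-4)

def step_fn(x, y):
    # Per-replica step: deliberately NOT decorated with @tf.function.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - y))
    grads = tape.gradient(loss, model.trainable_variables)
    # apply_gradients aggregates across replicas (the synchronization point).
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

@tf.function  # the single outer tf.function wrapping strategy.run
def distributed_step(x, y):
    per_replica_loss = strategy.run(step_fn, args=(x, y))
    return strategy.reduce(tf.distribute.ReduceOp.MEAN, per_replica_loss, axis=None)
```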

    opened by HoJ-Onle 0
  • Some Errors


I hit an error when I tried to evaluate the model, and I don't know what I should do:

    ValueError: in user code:

        /opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py:1947 split **
            axis=axis, num_split=num_or_size_splits, value=value, name=name)
        /opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py:9723 split
            "Split", split_dim=axis, value=value, num_split=num_split, name=name)
        /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:744 _apply_op_helper
            attrs=attr_protos, op_def=op_def)
        /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py:595 _create_op_internal
            compute_device)
        /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:3327 _create_op_internal
            op_def=op_def)
        /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1817 __init__
            control_input_ops, op_def)
        /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1657 _create_c_op
            raise ValueError(str(e))

    ValueError: Dimension size must be evenly divisible by 3 but is 64
        Number of ways to split should evenly divide the split dimension for '{{node split}} = Split[T=DT_UINT8, num_split=3](split/split_dim, input_tensor)' with input shapes: [], [64,32,32,3] and with computed input tensors: input[0] = <0>.
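For what it's worth, the shapes in the message show a split with num_split=3 along axis 0 (the batch dimension of size 64), which is what fails. A minimal reproduction, with the channel-axis variant that would succeed (illustrative, not the repo's code):

```python
import tensorflow as tf

x = tf.zeros([64, 32, 32, 3], dtype=tf.uint8)

# Reproduces the error: 3 does not evenly divide the axis-0 size of 64.
# tf.split(x, num_or_size_splits=3, axis=0)  # ValueError

# Splitting the channel axis instead succeeds (3 divides 3):
r, g, b = tf.split(x, num_or_size_splits=3, axis=-1)
```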
    
    opened by HoJ-Onle 0
  • Code implementation for the NLL metric


    Hi,

Thanks a lot for your amazing work. I am currently reproducing it, but I find that the code for calculating the NLL metric (i.e., Table 4 in the paper) is missing. Will you release the code for evaluating the NLL metric, or could you provide pseudo-code for reference?

    Thanks.

    opened by chen-hao-chao 0
  • It does not seem to converge when setting K=30, T=6


Hello, I have cloned your code and run it with the default settings from config.py. However, it fails to converge: the contrastive losses are large and negative, and the gradients become NaN. The log is below.

    output.log

    Logging ........
    2021-07-16 20:11:02,055 : gpus=0
    2021-07-16 20:11:02,056 : {'logtostderr': False, 'alsologtostderr': False, 'log_dir': '', 'v': 0, 'verbosity': 0, 'logger_levels': {}, 'stderrthreshold': 'fatal', 'showprefixforinfo': True, 'run_with_pdb': False, 'pdb_post_mortem': False, 'pdb': None, 'run_with_profiling': False, 'profile_file': None, 'use_cprofile_for_profiling': True, 'only_check_args': False, 'runtime_oom_exit': True, 'op_conversion_fallback_to_while_loop': False, 'test_srcdir': '', 'test_tmpdir': '/tmp/absl_testing', 'test_random_seed': 301, 'test_randomize_ordering_seed': '', 'xml_output_file': '', 'jobid': 0, 'logdir': '', 'eager': False, 'ckpt_load': None, 'device': '0', 'tpu': False, 'tpu_name': None, 'tpu_zone': None, 'rnd_seed': 1, 'problem': 'cifar10', 'n_batch_train': 64, 'lr': 0.0001, 'beta_1': 0.9, 'n_iters': 1000000, 'grad_clip': False, 'warmup': 1000, 'n_batch_per_iter': 1, 'cosine_decay': False, 'opt': 'adam', 'eval': False, 'include_xpred_freq': 1, 'eval_fid': False, 'fid_n_samples': 64, 'fid_n_iters': 40000, 'fid_n_batch': 64, 'num_res_blocks': 8, 'num_diffusion_timesteps': 6, 'randflip': True, 'dropout': 0.0, 'normalize': None, 'act': 'lrelu', 'final_act': 'relu', 'use_attention': False, 'resamp_with_conv': False, 'spec_norm': True, 'res_conv_shortcut': True, 'res_use_scale': True, 'ma_decay': 0.999, 'noise_scale': 1.0, 'mcmc_num_steps': 30, 'mcmc_step_size_b_square': 0.0002, 'tfhub_cache_dir': None, 'tfhub_model_load_format': 'AUTO', '?': False, 'help': False, 'helpshort': False, 'helpfull': False, 'helpxml': False, 'output': './output/main/2021-07-16-20-10-55--num_res_blocks=8--n_batch_train=64'}
    2021-07-16 20:11:02,238 : output dir ./output/main/2021-07-16-20-10-55--num_res_blocks=8--n_batch_train=64
    2021-07-16 20:40:31,668 : ========== begin training =========
    2021-07-16 20:42:38,251 : dir=2021-07-16-20-10-55--num_res_blocks=8--n_batch_train=64 i= 0 loss=-2228.8938 learning grads mean= nan grads max=448971.1250 disp=2.339, 15.666, 28.026, 40.882, 53.527, 42.221, 42.221 loss_ts=19446.877, 16269.05, 13148.938, 11828.364, 11097.337, 5871.38, 5871.38 f_ts=17548.744, 19199.219, 23628.014, 21098.88, 26118.945, 14722.812, 14722.812 is_accepted_ts= 0.0000 lr=0.00000010 time=126.57s
    2021-07-16 20:43:13,702 : early exit due to nan
    2021-07-16 20:43:13,703 : done

Can you tell me how to reproduce the results in the paper? Thanks.

    opened by ljrprocc 1
Owner
Ruiqi Gao
Ph.D. student at VCLA, UCLA. Research interests: machine learning, computer vision, and artificial intelligence.