# Deep Unsupervised Image Hashing by Maximizing Bit Entropy

This is the PyTorch implementation of the accepted AAAI 2021 paper: *Deep Unsupervised Image Hashing by Maximizing Bit Entropy*.
## Proposed Bi-half layer

A simple, parameter-free, bi-half coding layer to maximize hash channel capacity.
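The idea can be illustrated with a minimal PyTorch sketch (our own illustration, not necessarily the exact repository code; `GAMMA` is a hypothetical trade-off hyperparameter). In the forward pass, each bit dimension is ranked over the batch and the top half of the batch is assigned +1 and the bottom half -1, so every bit is half +1 / half -1 and its entropy is maximized; the backward pass uses a proxy gradient that pulls the continuous features toward their binary codes:

```python
import torch

GAMMA = 6.0  # hypothetical trade-off weight for the proxy gradient

class BiHalfLayer(torch.autograd.Function):
    """Parameter-free bi-half coding: each bit is +1 for half of the
    batch and -1 for the other half, maximizing per-bit entropy."""

    @staticmethod
    def forward(ctx, U):
        # U: (batch, bits) continuous features from the encoder.
        N, D = U.shape
        # Rank each bit dimension over the batch (descending).
        _, index = U.sort(dim=0, descending=True)
        # Assign +1 to the top half of the batch and -1 to the bottom half.
        half = torch.cat([torch.ones(N // 2, D, device=U.device),
                          -torch.ones(N - N // 2, D, device=U.device)])
        B = torch.zeros_like(U).scatter_(0, index, half)
        ctx.save_for_backward(U, B)
        return B

    @staticmethod
    def backward(ctx, grad_output):
        # Proxy gradient: upstream gradient plus a pull of the
        # continuous features U toward their binary codes B.
        U, B = ctx.saved_tensors
        return grad_output + GAMMA * (U - B) / B.numel()

# Usage: binary_codes = BiHalfLayer.apply(continuous_features)
```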
## Datasets and architectures for different settings

Experiments are on 5 image datasets: Flickr25k, Nus-wide, Cifar-10, Mscoco, Mnist, and 2 video datasets: Ucf-101 and Hmdb-51. According to the different settings, we divide them into: i) training an AutoEncoder on Mnist; ii) image hashing on Flickr25k, Nus-wide, Cifar-10, and Mscoco using a pre-trained Vgg; iii) video hashing on Ucf-101 and Hmdb-51 using pre-trained 3D models.
## Glance

```
3 settings
├── AutoEncoder
│   ├── Sign.py
│   ├── SignReg.py
│   └── BiHalf.py
├── ImageHashing
│   ├── Cifar10_I.py
│   ├── Cifar10_II.py
│   ├── Flickr25k.py
│   └── Mscoco.py
└── VideoHashing
    └── main.py
```
## Datasets download

| # | Datasets | Download |
|---|----------|----------|
| 1 | Flickr25k | Link |
| 2 | Mscoco | Link |
| 3 | Nuswide | Link |
| 4 | Cifar10 | Link |
| 5 | Mnist | Link |
| 6 | Ucf101 | Link |
| 7 | Hmdb51 | Link |
For the video datasets, we converted the videos from avi to jpg files. The original avi videos can be downloaded here: Ucf101 and Hmdb51.
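The conversion step can be done with ffmpeg; below is a minimal sketch of our own (assuming ffmpeg is installed; the function name and paths are hypothetical):

```python
import subprocess
from pathlib import Path

def avi_to_jpg(avi_path: str, out_dir: str) -> None:
    """Extract every frame of an .avi video as numbered .jpg files."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["ffmpeg", "-i", avi_path, f"{out_dir}/image_%05d.jpg"],
        check=True,
    )

# Hypothetical usage for one Ucf101 video:
# avi_to_jpg("ucf101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi",
#            "ucf101_jpg/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01")
```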
## Implementation Details for Video Setup

For the video datasets Ucf101 and Hmdb51, we generate a training sample by first selecting a video frame by uniform sampling, and then generating a 16-frame clip around that frame. If the selected position has fewer than 16 frames before the video ends, we repeat the procedure until the clip fits. We spatially resize the cropped sample to 112 x 112 pixels, resulting in one training sample of size 3 channels x 16 frames x 112 pixels x 112 pixels. For retrieval, we adopt a sliding window to generate clips as input, i.e., each video is split into non-overlapping 16-frame clips; each video has on average 92 non-overlapping clips. Taking Ucf101 as an example, we obtain a query set of 3,783 videos containing 348,047 non-overlapping clips, and a retrieval set of 9,537 videos containing 891,961 clips. We then feed the non-overlapping clips to the network to extract binary descriptors for hashing. For more details, please see the paper.
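The two clip-generation schemes can be sketched as follows (our own illustration with hypothetical function names; it assumes each video is given as an ordered list of at least 16 frames and, for simplicity, starts the training clip at the sampled frame):

```python
import random

CLIP_LEN = 16  # frames per clip

def sample_train_clip(frames):
    """Uniformly sample a start position; if fewer than CLIP_LEN frames
    remain before the video ends, repeat the procedure until it fits."""
    while True:
        start = random.randint(0, len(frames) - 1)
        if start + CLIP_LEN <= len(frames):
            return frames[start:start + CLIP_LEN]

def retrieval_clips(frames):
    """Split a video into non-overlapping 16-frame clips for retrieval
    (a trailing remainder shorter than CLIP_LEN is dropped)."""
    return [frames[i:i + CLIP_LEN]
            for i in range(0, len(frames) - CLIP_LEN + 1, CLIP_LEN)]
```

Each returned clip would then be spatially resized to 112 x 112 before being stacked into a 3 x 16 x 112 x 112 tensor.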
## Pretrained model

You can download the Kinetics pre-trained 3D models ResNet-34 and ResNet-101 here.
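Loading such a checkpoint into a compatible 3D ResNet typically looks like the hedged sketch below; the checkpoint layout (a `'state_dict'` entry with `module.`-prefixed keys, as saved from `torch.nn.DataParallel`) is an assumption, and the model instance is passed in rather than defined here:

```python
import torch
from torch import nn

def load_kinetics_checkpoint(model: nn.Module, ckpt_path: str) -> nn.Module:
    """Load a Kinetics-pretrained 3D-ResNet checkpoint into `model`."""
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    # Assumed layout: weights stored under 'state_dict'; fall back to
    # treating the whole checkpoint as the state dict.
    state_dict = checkpoint.get("state_dict", checkpoint)
    # Strip the 'module.' prefix added by torch.nn.DataParallel, if present.
    state_dict = {k.replace("module.", "", 1): v for k, v in state_dict.items()}
    model.load_state_dict(state_dict)
    return model.eval()  # frozen feature extractor for hashing
```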
## 3D Visualization

Continuous feature visualizations of an AutoEncoder trained on Mnist. We compare 3 different models: the sign layer, sign + reg, and our bi-half layer.

(Figure panels, left to right: Sign Layer, Sign + Reg, Bi-half Layer.)
## Citation

If you find the code in this repository useful for your research, please consider citing it:

```bibtex
@article{liAAAI2021,
  title={Deep Unsupervised Image Hashing by Maximizing Bit Entropy},
  author={Li, Yunqiang and van Gemert, Jan},
  journal={AAAI},
  year={2021}
}
```
## Contact

If you have any problems with our code, feel free to contact us.