Codebase for BMVC 2021 paper "Text Based Person Search with Limited Data"

Overview

Text Based Person Search with Limited Data

PWC

This is the codebase for our BMVC 2021 paper.

Please bear with me refactoring this codebase after CVPR deadline πŸ˜…

Abstract

Text-based person search (TBPS) aims at retrieving a target person from an image gallery with a descriptive text query. Solving such a fine-grained cross-modal retrieval task is challenging, which is further hampered by the lack of large-scale datasets. In this paper, we present a framework with two novel components to handle the problems brought by limited data. Firstly, to fully utilize the existing small-scale benchmarking datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch. Secondly, we propose to transfer knowledge learned from existing coarse-grained large-scale datasets containing image-text pairs from drastically different problem domains to compensate for the lack of TBPS training data. A transfer learning method is designed so that useful information can be transferred despite the large domain gap. Armed with these components, our method achieves new state of the art on the CUHK-PEDES dataset with significant improvements over the prior art in terms of Rank-1 and mAP.

Comments
  • Research prepared to obtain a diploma degree in computer and Automation Engineering.

    Research prepared to obtain a diploma degree in computer and Automation Engineering.

    Hello!

    My research focuses on Person search using Visual-Textual Attributes. Having said that, I would like to use your model to assist me in my project, but I have some issues when I finish train and test the model. My problem is trying to write code to run the model to get the same response as the photo. so Can you help me please!

    photo_2022-08-07_18-44-28

    opened by ram7772 6
  • Cannot find test_query and train_query folders

    Cannot find test_query and train_query folders

    Hi @BrandonHanx

    In the ReadMe file, it is mentioned to setup the datasets dir as follows:

    └── cuhkpedes
        β”œβ”€β”€ annotations
        β”‚   β”œβ”€β”€ test.json
        β”‚   β”œβ”€β”€ train.json
        β”‚   └── val.json
        β”œβ”€β”€ clip_vocab_vit.npy
        └── imgs
            β”œβ”€β”€ cam_a
            β”œβ”€β”€ cam_b
            β”œβ”€β”€ CUHK01
            β”œβ”€β”€ CUHK03
            β”œβ”€β”€ Market
            β”œβ”€β”€ test_query
            └── train_query
    

    After downloading the cuhkpedes data set, we get only the imgs folder, containing cam_a, cam_b and CUHK01 folders. there is no test_query and train_query folders. Also, these folders are not in the repository. Could you provide more information regarding on these folders, more exactly, what kind of information they contain and how they must be set up?

    Also, there are few more folders that are not part of the cuhkpedes, such as CUHK03 and Market. Do we need these data sets to reproduce the results?

    Best regards, liviust

    opened by liviust 5
  • some problem in training and testing

    some problem in training and testing

    Hello

    I have some problem. first: I don't find test_query and train_query file when I get images from [Dr. Shuang Li] second: I have this problem for testing and training.

    image

    opened by ram7772 4
  • Problem about the clip_vocab_vit.npy

    Problem about the clip_vocab_vit.npy

    Hi :) I have a question about the pre-processing document clip_vocab_vit.npy. My understanding is that it contains the tensor of the CLIP-Text-Encoder output corresponding to each word (total 9408). My question is, the output dimension of CLIP-TEXT-ENCODER is 1024, but the tensor dimension of each word in clip_vocab_vit.npy is 512. Is there some other operation in it? Thanks

    opened by Frost-Yang-99 2
  • There is only caption_all.json in the dataset CUHK-PEDES, what are the train.json and test.json in the dataset part

    There is only caption_all.json in the dataset CUHK-PEDES, what are the train.json and test.json in the dataset part

    Describe the bug A clear and concise description of what the bug is.

    To Reproduce Steps to reproduce the behavior:

    1. Go to '...'
    2. Click on '....'
    3. Scroll down to '....'
    4. See error

    Expected behavior A clear and concise description of what you expected to happen.

    Screenshots If applicable, add screenshots to help explain your problem.

    Desktop (please complete the following information):

    • OS: [e.g. iOS]
    • Browser [e.g. chrome, safari]
    • Version [e.g. 22]

    Smartphone (please complete the following information):

    • Device: [e.g. iPhone6]
    • OS: [e.g. iOS8.1]
    • Browser [e.g. stock browser, safari]
    • Version [e.g. 22]

    Additional context Add any other context about the problem here.

    opened by SwimKY 1
Releases(v0.1.1)
Owner
Xiao Han
Ph.D. student @ UoSurrey CVSSP, B.Eng. @ ZJU ISEE
Xiao Han
Pytorch implementation of the paper Progressive Growing of Points with Tree-structured Generators (BMVC 2021)

PGpoints Pytorch implementation of the paper Progressive Growing of Points with Tree-structured Generators (BMVC 2021) Hyeontae Son, Young Min Kim Pre

Hyeontae Son 9 Jun 6, 2022
Boosting Adversarial Attacks with Enhanced Momentum (BMVC 2021)

EMI-FGSM This repository contains code to reproduce results from the paper: Boosting Adversarial Attacks with Enhanced Momentum (BMVC 2021) Xiaosen Wa

John Hopcroft Lab at HUST 10 Sep 26, 2022
[BMVC 2021] Official PyTorch Implementation of Self-supervised learning of Image Scale and Orientation Estimation

Self-Supervised Learning of Image Scale and Orientation Estimation (BMVC 2021) This is the official implementation of the paper "Self-Supervised Learn

Jongmin Lee 17 Nov 10, 2022
Cascading Feature Extraction for Fast Point Cloud Registration (BMVC 2021)

Cascading Feature Extraction for Fast Point Cloud Registration This repository contains the source code for the paper [Arxive link comming soon]. Meth

null 7 May 26, 2022
Official PyTorch implementation of "Improving Face Recognition with Large AgeGaps by Learning to Distinguish Children" (BMVC 2021)

Inter-Prototype (BMVC 2021): Official Project Webpage This repository provides the official PyTorch implementation of the following paper: Improving F

Jungsoo Lee 16 Jun 30, 2022
This repository provides the official implementation of 'Learning to ignore: rethinking attention in CNNs' accepted in BMVC 2021.

inverse_attention This repository provides the official implementation of 'Learning to ignore: rethinking attention in CNNs' accepted in BMVC 2021. Le

Firas Laakom 5 Jul 8, 2022
This is the codebase for the ICLR 2021 paper Trajectory Prediction using Equivariant Continuous Convolution

Trajectory Prediction using Equivariant Continuous Convolution (ECCO) This is the codebase for the ICLR 2021 paper Trajectory Prediction using Equivar

Spatiotemporal Machine Learning 45 Jul 22, 2022
Codebase for the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge.

KAIROS MineRL BASALT Codebase for the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL B

Vinicius G. Goecks 37 Oct 30, 2022
Codebase for the Summary Loop paper at ACL2020

Summary Loop This repository contains the code for ACL2020 paper: The Summary Loop: Learning to Write Abstractive Summaries Without Examples. Training

Canny Lab @ The University of California, Berkeley 44 Nov 4, 2022
A weakly-supervised scene graph generation codebase. The implementation of our CVPR2021 paper ``Linguistic Structures as Weak Supervision for Visual Scene Graph Generation''

README.md shall be finished soon. WSSGG 0 Overview 1 Installation 1.1 Faster-RCNN 1.2 Language Parser 1.3 GloVe Embeddings 2 Settings 2.1 VG-GT-Graph

Keren Ye 35 Nov 20, 2022
Codebase for the self-supervised goal reaching benchmark introduced in the LEXA paper

LEXA Benchmark Codebase for the self-supervised goal reaching benchmark introduced in the LEXA paper (Discovering and Achieving Goals via World Models

Oleg Rybkin 36 Dec 22, 2022
Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

CLIORA This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling. We introduce

Bo Wan                                             32 Dec 23, 2022
Spearmint Bayesian optimization codebase

Spearmint Spearmint is a software package to perform Bayesian optimization. The Software is designed to automatically run experiments (thus the code n

Formerly: Harvard Intelligent Probabilistic Systems Group -- Now at Princeton 1.5k Dec 29, 2022
A general 3D Object Detection codebase in PyTorch.

Det3D is the first 3D Object Detection toolbox which provides off the box implementations of many 3D object detection algorithms such as PointPillars, SECOND, PIXOR, etc, as well as state-of-the-art methods on major benchmarks like KITTI(ViP) and nuScenes(CBGS).

Benjin Zhu 1.4k Jan 5, 2023
Official codebase for Pretrained Transformers as Universal Computation Engines.

universal-computation Overview Official codebase for Pretrained Transformers as Universal Computation Engines. Contains demo notebook and scripts to r

Kevin Lu 210 Dec 28, 2022
AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

AOT-GAN for High-Resolution Image Inpainting Arxiv Paper | AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting Yanhong

Multimedia Research 214 Jan 3, 2023
This is the codebase for Diffusion Models Beat GANS on Image Synthesis.

This is the codebase for Diffusion Models Beat GANS on Image Synthesis.

OpenAI 3k Dec 26, 2022
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Decision Transformer Lili Chen*, Kevin Lu*, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas†, and Igor M

Kevin Lu 1.4k Jan 7, 2023
X-modaler is a versatile and high-performance codebase for cross-modal analytics.

X-modaler X-modaler is a versatile and high-performance codebase for cross-modal analytics. This codebase unifies comprehensive high-quality modules i

null 910 Dec 28, 2022