Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

Overview

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting (official PyTorch implementation)

This paper, submitted to TIP, is an extension of the previous arXiv paper.

This project aims to

  1. provide a baseline for pedestrian attribute recognition.
  2. provide two new datasets, RAPzs and PETAzs, that follow the zero-shot pedestrian identity setting.
  3. provide a general training pipeline for pedestrian attribute recognition and multi-label classification tasks.

This project provides

  1. DDP training, which is mainly used for multi-label classification.
  2. Training on all attributes and testing only on the "selected" attributes. The other attributes are excluded from evaluation because their proportion of positive samples is below a threshold, such as 0.01.
    1. For PETA and PETAzs, 35 of the 105 attributes are selected for performance evaluation.
    2. For RAPv1, 51 of the 92 attributes are selected for performance evaluation.
    3. For RAPv2 and RAPzs, 54 and 53 of the 152 attributes, respectively, are selected for performance evaluation.
    4. For PA100k, all attributes are selected for performance evaluation.
    • However, training on all attributes does not bring a consistent performance improvement across datasets.
  3. EMA model.
  4. Transformer-based models, such as swin-transformer (with a large performance improvement, e.g. 82.19 vs. 80.21 mA for swin_s vs. resnet50 on PA100k) and vit.
  5. Convenient dataset info file like dataset_all.pkl
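
The selection rule in item 2 can be illustrated with a small, framework-free sketch. It assumes that labels are stored as one binary vector per image and that an attribute is kept when its positive ratio is at least the threshold. The function names are hypothetical and are not taken from this repository, and the datasets may define their selected attributes differently; the sketch only illustrates the thresholding idea.

```python
from typing import List


def select_eval_attributes(labels: List[List[int]], threshold: float = 0.01) -> List[int]:
    """Return indices of attributes whose positive-sample ratio is >= threshold.

    labels holds one 0/1 vector per image (assumed layout, for illustration only).
    """
    if not labels:
        return []
    num_samples = len(labels)
    num_attrs = len(labels[0])
    selected = []
    for j in range(num_attrs):
        positives = sum(row[j] for row in labels)
        if positives / num_samples >= threshold:
            selected.append(j)
    return selected


def project_to_selected(scores: List[float], selected: List[int]) -> List[float]:
    """Keep only the scores of the selected attributes, e.g. at test time."""
    return [scores[j] for j in selected]
```

With such a helper, a model can still be trained on every attribute while the evaluation step projects its outputs onto the selected indices.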

Dataset Info

  • PETA: Pedestrian Attribute Recognition At Far Distance [Paper][Project]

  • PA100K[Paper][Github]

  • RAP : A Richly Annotated Dataset for Pedestrian Attribute Recognition

  • PETAzs & RAPzs : Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting Paper [Project]

Performance

Pedestrian Attribute Recognition

Datasets  Models          mA     Acc    Prec   Rec    F1
PA100k    resnet50        80.21  79.15  87.79  87.01  87.40
--        resnet50*       79.85  79.13  89.45  85.40  87.38
--        resnet50 + EMA  81.97  80.20  88.06  88.17  88.11
--        bninception     79.13  78.19  87.42  86.21  86.81
--        TresnetM        74.46  68.72  79.82  80.71  80.26
--        swin_s          82.19  80.35  87.85  88.51  88.18
--        vit_s           79.40  77.61  86.41  86.22  86.32
--        vit_b           81.01  79.38  87.60  87.49  87.55
PETA      resnet50        83.96  78.65  87.08  85.62  86.35
PETAzs    resnet50        71.43  58.69  74.41  69.82  72.04
RAPv1     resnet50        79.27  67.98  80.19  79.71  79.95
RAPv2     resnet50        78.52  66.09  77.20  80.23  78.68
RAPzs     resnet50        71.76  64.83  78.75  76.60  77.66
  • The resnet50* model is trained with the weighted function proposed by Tan et al. (AAAI 2020).
  • Performance on PETAzs and RAPzs is based on the first version of PETAzs and RAPzs, as described in the paper.
  • Experiments are conducted with an input size of (256, 192), so there may be minor differences from the results in the paper.
  • The reported performance is reached at the first drop of the learning rate; the model at that point is also taken as the best model.
  • Pretrained models are now available on Google Drive.
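
For readers unfamiliar with the mean accuracy (mA), Acc, Prec, Rec and F1 columns above, the following is a minimal sketch of how they can be computed. It assumes the definitions commonly used in the pedestrian attribute recognition literature: mA is label-based (the mean of true-positive and true-negative rates per attribute), while the other four are instance-based. The function names are hypothetical and the code is not the repository's own evaluation routine. It returns fractions in [0, 1], whereas the table reports percentages.

```python
from typing import Dict, List


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def par_metrics(gt: List[List[int]], pred: List[List[int]]) -> Dict[str, float]:
    """Compute mA, Acc, Prec, Rec and F1 from per-image 0/1 label vectors."""
    num_samples, num_attrs = len(gt), len(gt[0])

    # Label-based mean accuracy: average of TPR and TNR for each attribute.
    ma = 0.0
    for j in range(num_attrs):
        tp = sum(1 for g, p in zip(gt, pred) if g[j] == 1 and p[j] == 1)
        tn = sum(1 for g, p in zip(gt, pred) if g[j] == 0 and p[j] == 0)
        pos = sum(1 for g in gt if g[j] == 1)
        neg = num_samples - pos
        ma += (safe_div(tp, pos) + safe_div(tn, neg)) / 2
    ma /= num_attrs

    # Instance-based metrics: computed per image, then averaged over images.
    acc = prec = rec = 0.0
    for g, p in zip(gt, pred):
        inter = sum(1 for a, b in zip(g, p) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(g, p) if a == 1 or b == 1)
        acc += safe_div(inter, union)
        prec += safe_div(inter, sum(p))
        rec += safe_div(inter, sum(g))
    acc, prec, rec = acc / num_samples, prec / num_samples, rec / num_samples
    f1 = safe_div(2 * prec * rec, prec + rec)

    return {"mA": ma, "Acc": acc, "Prec": prec, "Rec": rec, "F1": f1}
```

Predictions are assumed to be already binarized, for example by thresholding sigmoid outputs at 0.5.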

Multi-label Classification

Datasets  Models     mAP    CP     CR     CF1    OP     OR     OF1
COCO      resnet101  82.75  84.17  72.07  77.65  85.16  75.47  80.02
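
The per-class (CP, CR, CF1) and overall (OP, OR, OF1) columns can be sketched as follows, assuming the definitions commonly used for multi-label classification: per-class precision and recall are averaged over classes, while the overall variants pool counts across all classes. mAP is omitted because it needs ranked scores rather than binary predictions. The function name is hypothetical and this is not the repository's evaluation code; values are fractions rather than percentages.

```python
from typing import Dict, List


def multilabel_metrics(gt: List[List[int]], pred: List[List[int]]) -> Dict[str, float]:
    """Compute CP, CR, CF1, OP, OR and OF1 from per-image 0/1 label vectors."""

    def safe_div(num: float, den: float) -> float:
        # Treat undefined ratios (empty denominators) as 0.0.
        return num / den if den else 0.0

    num_classes = len(gt[0])
    tp_per_class, pred_per_class, gt_per_class = [], [], []
    for c in range(num_classes):
        tp_per_class.append(sum(1 for g, p in zip(gt, pred) if g[c] == 1 and p[c] == 1))
        pred_per_class.append(sum(p[c] for p in pred))
        gt_per_class.append(sum(g[c] for g in gt))

    # Per-class metrics: average the class-wise precision and recall.
    cp = sum(safe_div(t, n) for t, n in zip(tp_per_class, pred_per_class)) / num_classes
    cr = sum(safe_div(t, n) for t, n in zip(tp_per_class, gt_per_class)) / num_classes
    cf1 = safe_div(2 * cp * cr, cp + cr)

    # Overall metrics: pool the counts over all classes before dividing.
    op = safe_div(sum(tp_per_class), sum(pred_per_class))
    o_r = safe_div(sum(tp_per_class), sum(gt_per_class))
    of1 = safe_div(2 * op * o_r, op + o_r)

    return {"CP": cp, "CR": cr, "CF1": cf1, "OP": op, "OR": o_r, "OF1": of1}
```

The pooled variants weight frequent classes more heavily, which is why OP/OR can differ noticeably from CP/CR on imbalanced label sets.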

Pretrained Models

Dependencies

  • python 3.7
  • pytorch 1.7.0
  • torchvision 0.8.2
  • cuda 10.1

Get Started

  1. Run git clone https://github.com/valencebond/Rethinking_of_PAR.git
  2. Create a directory to download the above datasets into.
    cd Rethinking_of_PAR
    mkdir data
    
  3. Prepare the datasets to have the following structure:
    ${project_dir}/data
        PETA
            images/
            PETA.mat
            dataset_all.pkl
            dataset_zs_run0.pkl
        PA100k
            data/
            dataset_all.pkl
        RAP
            RAP_dataset/
            RAP_annotation/
            dataset_all.pkl
        RAP2
            RAP_dataset/
            RAP_annotation/
            dataset_zs_run0.pkl
        COCO14
            train2014/
            val2014/
            ml_anno/
                category.json
                coco14_train_anno.pkl
                coco14_val_anno.pkl
    
  4. Train the baseline based on resnet50
    sh train.sh
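
Before training, it can help to confirm that the data directory matches the structure listed in step 3. The script below is a small sketch of such a check; the entries are copied from that listing, while the script itself and its function name are hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Dict, List

# Expected entries per dataset directory, copied from the layout in step 3.
EXPECTED_LAYOUT: Dict[str, List[str]] = {
    "PETA": ["images", "PETA.mat", "dataset_all.pkl", "dataset_zs_run0.pkl"],
    "PA100k": ["data", "dataset_all.pkl"],
    "RAP": ["RAP_dataset", "RAP_annotation", "dataset_all.pkl"],
    "RAP2": ["RAP_dataset", "RAP_annotation", "dataset_zs_run0.pkl"],
    "COCO14": [
        "train2014",
        "val2014",
        "ml_anno/category.json",
        "ml_anno/coco14_train_anno.pkl",
        "ml_anno/coco14_val_anno.pkl",
    ],
}


def find_missing(data_root: str) -> List[str]:
    """Return the expected paths (relative to data_root) that do not exist."""
    root = Path(data_root)
    missing = []
    for dataset, entries in EXPECTED_LAYOUT.items():
        for entry in entries:
            path = root / dataset / entry
            if not path.exists():
                missing.append(path.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Run from the project directory so that "data" resolves to ${project_dir}/data.
    for item in find_missing("data"):
        print("missing:", item)
```

Only the datasets you plan to train on need to be present, so missing entries for the others can be ignored.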
    

Acknowledgements

The code is based on the repository from Dangwei Li and Houjing Huang. Thanks to them for releasing their code.

Citation

If you use this method or this code in your research, please cite as:

@article{jia2021rethinking,
  title={Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting},
  author={Jia, Jian and Huang, Houjing and Chen, Xiaotang and Huang, Kaiqi},
  journal={arXiv preprint arXiv:2107.03576},
  year={2021}
}