Code for AAAI 2021 paper: Sequential End-to-end Network for Efficient Person Search

Overview

This repository hosts the source code of our paper: [AAAI 2021] Sequential End-to-end Network for Efficient Person Search. SeqNet achieves state-of-the-art performance on two widely used benchmarks and runs at 11.5 FPS on a single GPU. A brief introduction in Chinese is available on Zhihu.

Performance profile:

Dataset     mAP    Top-1   Model
CUHK-SYSU   94.8   95.7    model
PRW         47.6   87.6    model

The network structure is simple and suitable as a baseline:

[Figure: SeqNet network structure]

Installation

Run pip install -r requirements.txt in the root directory of the project.

Quick Start

Let's say $ROOT is the root directory.

  1. Download CUHK-SYSU and PRW datasets, and unzip them to $ROOT/data
$ROOT/data
├── CUHK-SYSU
└── PRW
  2. Following the link in the above table, download our pretrained model to anywhere you like, e.g., $ROOT/exp_cuhk
  3. Evaluate its performance by specifying the paths of the checkpoint and the corresponding configuration file.
python train.py --cfg $ROOT/exp_cuhk/config.yaml --eval --ckpt $ROOT/exp_cuhk/epoch_19.pth

Training

Pick one configuration file you like in $ROOT/configs, and run with it.

python train.py --cfg configs/cuhk_sysu.yaml

Note: At present, our script only supports single-GPU training; distributed training will also be supported in the future. By default, the batch size and the learning rate during training are set to 5 and 0.003 respectively, which requires about 28GB of GPU memory. If your GPU cannot provide the required memory, try a smaller batch size and learning rate (performance may degrade). Specifically, your setting should follow the Linear Scaling Rule: when the minibatch size is multiplied by k, multiply the learning rate by k. For example:

python train.py --cfg configs/cuhk_sysu.yaml INPUT.BATCH_SIZE_TRAIN 2 SOLVER.BASE_LR 0.0012
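The arithmetic behind the command above can be sketched in a few lines of Python. This is only an illustration of the Linear Scaling Rule using the default values quoted in the note (batch size 5, learning rate 0.003); the function name `scaled_learning_rate` is hypothetical and is not part of this repository.

```python
def scaled_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale the learning rate by the same factor k as the batch size."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    k = new_batch_size / base_batch_size  # factor applied to the minibatch size
    return base_lr * k


# Defaults quoted in the note: batch size 5, learning rate 0.003.
lr_for_batch_2 = scaled_learning_rate(0.003, 5, 2)  # about 0.0012, as in the command above
lr_for_batch_1 = scaled_learning_rate(0.003, 5, 1)  # about 0.0006
```

The resulting value is what you would pass as SOLVER.BASE_LR alongside the matching INPUT.BATCH_SIZE_TRAIN.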

Tip: If the training process stops unexpectedly, you can resume from the specified checkpoint.

python train.py --cfg configs/cuhk_sysu.yaml --resume --ckpt /path/to/your/checkpoint

Test

Suppose the output directory is $ROOT/exp_cuhk. Test the trained model:

python train.py --cfg $ROOT/exp_cuhk/config.yaml --eval --ckpt $ROOT/exp_cuhk/epoch_19.pth

Test with the Context Bipartite Graph Matching (CBGM) algorithm:

python train.py --cfg $ROOT/exp_cuhk/config.yaml --eval --ckpt $ROOT/exp_cuhk/epoch_19.pth EVAL_USE_CBGM True

Test the upper bound of the person search performance by using ground-truth (GT) boxes:

python train.py --cfg $ROOT/exp_cuhk/config.yaml --eval --ckpt $ROOT/exp_cuhk/epoch_19.pth EVAL_USE_GT True

Pull Request

Pull requests are welcome! Before submitting a PR, DO NOT forget to run ./dev/linter.sh, which provides syntax checking and code style optimization.

Citation

@inproceedings{li2021sequential,
  title={Sequential End-to-end Network for Efficient Person Search},
  author={Li, Zhengjia and Miao, Duoqian},
  booktitle={Proceedings of the AAAI conference on artificial intelligence},
  year={2021}
}
Comments
  • run python train.py can't get the search results

    When I run python train.py --cfg $ROOT/exp_cuhk/config.yaml --ckpt $ROOT/exp_cuhk/epoch_19.pth, I can't get the search results in demo_imgs. So I want to ask if it should be python demo.py --cfg cfg $ROOT/exp_cuhk/config.yaml --ckpt $ROOT/exp_cuhk/epoch_19.pth. Thanks!

    opened by circlety 6
  • Problems running demo.py

    Hi, I was running the demo code of seqnet: python demo.py --cfg ~/checkpoints/seqnet/config.yaml --ckpt ~/checkpoints/seqnet/epoch_17.pth

    but I got the following error:

    Traceback (most recent call last):
      File "demo.py", line 89, in <module>
        main(args)
      File "demo.py", line 63, in main
        query_feat = model(query_img, query_target)[0]
      File "/home/mmmm/miniconda3/envs/seqnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/mmmm/repo/SeqNet/models/seqnet.py", line 134, in forward
        return self.inference(images, targets, query_img_as_gallery)
      File "/home/mmmm/repo/SeqNet/models/seqnet.py", line 117, in inference
        box_features = self.roi_heads.box_roi_pool(features, boxes, images.image_sizes)
      File "/home/mmmm/miniconda3/envs/seqnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/mmmm/miniconda3/envs/seqnet/lib/python3.7/site-packages/torchvision/ops/poolers.py", line 141, in forward
        sampling_ratio=self.sampling_ratio
      File "/home/mmmm/miniconda3/envs/seqnet/lib/python3.7/site-packages/torchvision/ops/roi_align.py", line 70, in roi_align
        return _RoIAlignFunction.apply(input, rois, output_size, spatial_scale, sampling_ratio)
      File "/home/mmmm/miniconda3/envs/seqnet/lib/python3.7/site-packages/torchvision/ops/roi_align.py", line 25, in forward
        output_size[0], output_size[1], sampling_ratio)
    RuntimeError: Expected tensor for argument #1 'input' to have the same type as tensor for argument #2 'rois'; but type Variable[CUDAFloatType] does not equal Variable[CUDALongType] (while checking arguments for ROIAlign_forward_cuda) (checkSameType at /pytorch/aten/src/ATen/TensorUtils.cpp:140)

    So far I had no idea what caused this error, could you help me with this problem? Thx a lot.

    PyTorch version mismatch 
    opened by Svithjod 5
  • "Loss is nan, stopping train" appears regularly

    I followed the steps in the README, set up the directory structure, and trained the model. But strange problems keep occurring, as in the log excerpt below.

    ----OUTPUT----
    Epoch: [5] [1660/2241] eta: 0:08:56 lr: 0.003000 loss: 2.2882 (2.4257) loss_proposal_cls: 0.0818 (0.0915) loss_proposal_reg: 1.2728 (1.4000) loss_box_cls: 0.1167 (0.1311) loss_box_reg: 0.1667 (0.1707) loss_box_reid: 0.4618 (0.5611) loss_rpn_reg: 0.0283 (0.0344) loss_rpn_cls: 0.0317 (0.0369) time: 0.9248 data: 0.0005 max mem: 24005
    Loss is nan, stopping training
    {'loss_proposal_cls': tensor(0.0837, device='cuda:0', grad_fn=), 'loss_proposal_reg': tensor(1.3923, device='cuda:0', grad_fn=), 'loss_box_cls': tensor(0.1187, device='cuda:0', grad_fn=), 'loss_box_reg': tensor(0.1719, device='cuda:0', grad_fn=), 'loss_box_reid': tensor(nan, device='cuda:0', grad_fn=), 'loss_rpn_reg': tensor(0.0457, device='cuda:0', grad_fn=), 'loss_rpn_cls': tensor(0.0226, device='cuda:0', grad_fn=)}

    This happens after a fixed number of epochs, and the "Loss is nan, stopping training" error is very regular. For example, after 5 epochs it appears after the 1160th batch of the 6th epoch, whether training from epoch 0 or using the --resume option.

    Whether the model is trained on the RTX A6000, RTX A5000 or Tesla V100 32G, and whether or not the batch size and learning rate are scaled proportionally, this error occurs and stops the training.

    I used the --resume command to train for 20 epochs, and observed that every time the problem appeared on the loss_box_reid.

    This should be a bug in the code, but I'm not quite sure how it came about and how to fix it.

    PyTorch version mismatch 
    opened by YizJia 4
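When a report like the one above mentions a NaN loss, it helps to name the offending term automatically rather than read it off the printed dictionary. The following is a minimal sketch, assuming the per-term losses are available as a dictionary of scalars; the helper name `non_finite_terms` is hypothetical and the values are copied from the log in the report. It does not diagnose why the term became NaN.

```python
import math


def non_finite_terms(loss_dict):
    """Return the names of loss terms whose value is NaN or infinite.

    float() also accepts single-element tensors, so the same helper can be
    applied to a dictionary of scalar loss tensors.
    """
    return [name for name, value in loss_dict.items()
            if not math.isfinite(float(value))]


# Values taken from the log excerpt in the report above.
losses = {
    "loss_proposal_cls": 0.0837,
    "loss_proposal_reg": 1.3923,
    "loss_box_cls": 0.1187,
    "loss_box_reg": 0.1719,
    "loss_box_reid": float("nan"),
    "loss_rpn_reg": 0.0457,
    "loss_rpn_cls": 0.0226,
}
bad_terms = non_finite_terms(losses)  # only the re-id term is reported
```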
  • How do I know the searched person is the same as the person in the gallery?

    A person may appear in several pictures. Does the dataset have a unique person ID for everyone? Because the gallery and query sets contain duplicate pictures, and one picture may contain several persons, do you compute the IoU between the searched person box and the ground truth to decide whether it is a positive case? Thank you!

    opened by Yue-Rain 4
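The IoU test that the question above describes can be sketched as follows. This is a generic illustration rather than the repository's evaluation code: the (x1, y1, x2, y2) box format, the function names, and the 0.5 threshold are all assumptions made for the example.

```python
def box_iou(box_a, box_b) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_positive(det_box, gt_box, iou_thresh: float = 0.5) -> bool:
    """Treat a detection as matching the ground-truth box when IoU >= threshold.

    The default threshold of 0.5 is an assumption for illustration.
    """
    return box_iou(det_box, gt_box) >= iou_thresh
```

For instance, `box_iou((0, 0, 10, 10), (5, 0, 15, 10))` has an overlap of 50 and a union of 150, so the two boxes would not count as a match at a 0.5 threshold.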
  • Does the OIM loss not include the CQ (circular queue) softmax loss?

    Line 74 in oim.py:

    loss_oim=F.cross_entropy(projected,label,ignore_index=5554) I have a question: if you set ignore_index to 5554, you only calculate the classification loss for labelled pids. Where is the softmax cross-entropy loss for unlabelled pids?

    opened by HenryZhangJianhe 3
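For readers unfamiliar with `ignore_index`: in `F.cross_entropy`, targets equal to that value contribute nothing to the loss and are left out of the mean. The pure-Python sketch below mimics that behaviour on plain lists so the effect is easy to see. The function name is hypothetical, the toy logits are made up, and returning NaN when every sample is ignored is a choice made for this sketch; it is not the repository's OIM implementation.

```python
import math


def cross_entropy_ignore(logits, labels, ignore_index):
    """Mean softmax cross-entropy over samples whose label != ignore_index."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # ignored samples add nothing to the sum or the count
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[label]  # negative log-softmax of the target class
        count += 1
    return total / count if count else float("nan")


# Two samples with two classes each; the second carries the ignored label.
loss = cross_entropy_ignore([[0.0, 0.0], [3.0, -1.0]], [0, 5554], ignore_index=5554)
```

Here only the first sample is counted, so `loss` equals log(2) regardless of the second row's logits.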
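  • Engineering question

    Hello author, I have recently been working on a pedestrian detection + person re-identification project on an Nvidia NX. I use a yolov5 + reid pipeline, but the tracking speed does not meet my needs, so I would like to replace it with this algorithm. I therefore have three questions. 1: If I shrink the network to bring its speed up to real time, will the accuracy suffer significantly? 2: Can this algorithm be applied to a scenario where three pedestrians are identified at the same time? 3: Can I understand person search as a complete replacement for, and an improvement over, the detection + reid approach? 4: I hope you can offer some ideas and suggestions. Finally, I wish you all the best.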

    opened by Jiangsiping 2
  • Code for the "NAE+ SeqNet" version for comparison

    I have tried your code; it is really concise and easy to understand. In order to learn it more comprehensively, could you please also release the NAE+ SeqNet version for comparison?

    Many thanks!

    opened by Smalltarget108 2
  • For demo.py: "RuntimeError: CUDA out of memory."

    Hello, I used the trained model to run demo.py. It gets through one gallery image and then throws the following error.

    Processing demo_imgs/gallery-2.jpg
    Traceback (most recent call last):
      File "demo.py", line 89, in <module>
        main(args)
      File "demo.py", line 69, in main
        gallery_output = model(gallery_img)[0]
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/ubuntu/workspace/SeqNet/models/seqnet.py", line 132, in forward
        return self.inference(images, targets, query_img_as_gallery)
      File "/home/ubuntu/workspace/SeqNet/models/seqnet.py", line 107, in inference
        features = self.backbone(images.tensors)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/ubuntu/workspace/SeqNet/models/resnet.py", line 27, in forward
        feat = super(Backbone, self).forward(x)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/container.py", line 117, in forward
        input = module(input)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/container.py", line 117, in forward
        input = module(input)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torchvision/models/resnet.py", line 113, in forward
        out = self.bn3(out)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/batchnorm.py", line 131, in forward
        return F.batch_norm(
      File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 2014, in batch_norm
        return torch.batch_norm(
    RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 7.80 GiB total capacity; 6.81 GiB already allocated; 1.38 MiB free; 6.89 GiB reserved in total by PyTorch)
    

    Can you give some suggestions to avoid "OOM"?

    opened by developer-young 2
  • question about freezing layers in ResNet backbone

    Hi. Thank you for great research and an awesome implementation.

    It's not really an issue, but I have a minor question about your model implementation in models/resnet.py. Here, you freeze conv1/batchnorm layers within ResNet, as follows:

    resnet.conv1.weight.requires_grad_(False)    
    resnet.bn1.weight.requires_grad_(False)    
    resnet.bn1.bias.requires_grad_(False)
    

    Is there a particular reason behind this choice?

    opened by sanghoooon 2
  • Hi, have you tried training SeqNet with both CUHK-SYSU and PRW?

    Thank you for your strong codebase. I recently tried to train SeqNet with both CUHK-SYSU and PRW, but the results got even worse. I'm wondering whether the author has tried such a setting?

    opened by caposerenity 1
  • Hi, your models on Google Drive cannot be found

    https://drive.google.com/file/d/1wKhCHy7uTHx8zxNS62Y1236GNv5TzFzq/view?usp=sharing https://drive.google.com/file/d/1I9OI6-sfVyop_aLDIWaYwd7Z4hD34hwZ/view?usp=sharing These two files cannot be downloaded because they are in the Google Drive trash.

    opened by mikeswf 1
  • how to get the reported mAP?

    Hi, I trained the model with bs 1 and lr 0.0006, but only got mAP = 85.79%, so I wonder how to approach the performance you mentioned in your paper? Do I need to fix any other configs?

    opened by Wuzimeng 1
Owner
Zj Li
ECNU.bachelor.CS | TONGJI.master.CV