Global-Local Context Network for Person Search

Peng Zheng

Last update: Oct 17, 2022

Overview

Abstract:

Person search aims to jointly localize and identify a query person in natural, uncropped images, a task that has been actively studied in the computer vision community over the past few years. In this paper, we delve into the rich context information globally and locally surrounding the target person, which we refer to as scene and group context, respectively. Unlike previous works that treat the two types of context individually, we exploit them in a unified global-local context network (GLCNet) with the intuitive aim of feature enhancement. Specifically, re-ID embeddings and context features are enhanced simultaneously in a multi-stage fashion, ultimately leading to enhanced, discriminative features for person search. We conduct experiments on two person search benchmarks (i.e., CUHK-SYSU and PRW) and extend our approach to a more challenging setting (i.e., character search on MovieNet). Extensive experimental results demonstrate the consistent improvement of the proposed GLCNet over state-of-the-art methods on the three datasets.

Overall architecture of our GLCNet:

Performance

Methods	CUHK-SYSU mAP	CUHK-SYSU top-1	PRW mAP	PRW top-1
OIM	75.5	78.7	21.3	49.4
NAE+	92.1	92.9	44.0	81.1
TCTS	93.9	95.1	46.8	87.5
AlignPS+	94.0	94.5	46.1	82.1
SeqNet+CBGM	94.8	95.7	47.6	87.6
GLCNet	95.7	96.3	46.9	85.1
GLCNet+CBGM	96.0	96.3	47.6	88.0
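
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how mAP and top-1 can be computed from ranked retrieval results. It is not the evaluation code of this repository: the function names `average_precision` and `evaluate` are hypothetical, and the sketch assumes that every true match of a query appears somewhere in its ranked list. The official CUHK-SYSU and PRW protocols additionally account for missed detections, which is omitted here.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_hits: Sequence[bool]) -> float:
    """AP for one query: mean of precision@k over the ranks k that are hits.

    ranked_hits[i] is True if the i-th highest-scoring gallery detection
    shows the query identity. Assumption: all true matches are in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def evaluate(queries: List[Sequence[bool]]) -> Tuple[float, float]:
    """Return (mAP, top-1 accuracy) averaged over all queries."""
    aps = [average_precision(q) for q in queries]
    # top-1 is 1 for a query when its best-ranked detection is a true match
    top1 = [1.0 if len(q) > 0 and q[0] else 0.0 for q in queries]
    return sum(aps) / len(aps), sum(top1) / len(top1)


if __name__ == "__main__":
    # Two toy queries with hand-made ranked match lists (illustrative only)
    m_ap, top_1 = evaluate([[True, False, True], [False, True]])
    print(f"mAP={m_ap:.4f} top-1={top_1:.4f}")
```

The table reports both values as percentages, so the fractions returned here would be multiplied by 100 for comparison.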

Different gallery size on CUHK-SYSU:

Qualitative Results:

Train

sh ./run_${DATASET}.sh

Test

sh ./test_${DATASET}.sh

Inference

Run demo.py to perform inference on the given images. GLCNet runs at 10.3 fps on a single Tesla V100 GPU with batch_size 3.
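
To check throughput on your own hardware, a timing loop along the following lines may help. This is only a sketch under stated assumptions: `measure_fps` and `run_batch` are hypothetical names that are not part of this repository, and the sketch assumes that fps counts images per second (batch size times batches, divided by elapsed time), which the figure above does not state explicitly.

```python
import time
from typing import Callable


def measure_fps(run_batch: Callable[[], None], batch_size: int,
                n_batches: int = 20, warmup: int = 3) -> float:
    """Estimate throughput in images per second.

    run_batch is a hypothetical zero-argument callable that processes one
    batch of `batch_size` images. When timing a GPU model, it should block
    until the results are ready (for example by calling
    torch.cuda.synchronize() inside it); otherwise the timing is too low.
    """
    # Warm-up iterations are excluded from the measurement
    for _ in range(warmup):
        run_batch()

    start = time.perf_counter()
    for _ in range(n_batches):
        run_batch()
    elapsed = time.perf_counter() - start

    return batch_size * n_batches / elapsed


if __name__ == "__main__":
    # Stand-in workload: sleep 10 ms per batch instead of running a model
    fps = measure_fps(lambda: time.sleep(0.01), batch_size=3, n_batches=10)
    print(f"{fps:.1f} images/s")
```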

MovieNet-CS

To extend the person search framework to a more challenging task, i.e., character search (CS), we borrow the character detection and ID annotations from the MovieNet dataset to organize MovieNet-CS, and set different training-set levels and gallery sizes in the same way as CUHK-SYSU. MovieNet-CS is saved in exactly the same format and structure as CUHK-SYSU, which should be convenient for further research and experiments. If you want to use MovieNet-CS, please download the movie frames from the official MovieNet website and our reorganized annotations here (TBD).

Acknowledgement

Thanks to the solid codebase from SeqNet.

Citation

@ARTICLE{2021arXiv211202500Z,
    author   = {Peng Zheng and
                Jie Qin and
                Yichao Yan and
                Shengcai Liao and
                Bingbing Ni and
                Xiaogang Cheng and
                Ling Shao},
    title    = {Global-Local Context Network for Person Search},
    journal  = {arXiv e-prints},
    volume   = {abs/2112.02500},
    year     = {2021}
}

Comments

Pretrained weights

@ZhengPeng7 Hi, thanks for the wonderful work and the codebase. Could you please share the pretrained weight file on Google Drive or OneDrive? Thanks in advance.

opened by abhigoku10 3
Network Details

Thank you for your excellent work! I have two questions about the network details. 1. Scene Context: For each person in an image, how does the scene context differ? Is the scene context the same for every person in that image, i.e., the final ResNet output features of the image, turned into a 2048-dimensional vector by the CE module? I believe so, but I would like to confirm with you.

2. Group Context: Before the Group CE, where does the 128-dimensional feature vector come from? Do you merge the features of all positive samples into a single 128-dimensional feature? If there are two people in the image, there are two ROI regions, each with a 256-dimensional feature vector; do you turn the two 256-dimensional vectors into one 128-dimensional vector? With three people, would the three 256-dimensional vectors become one 128-dimensional vector? And how exactly is this implemented: do you concatenate them into a 256×N-dimensional feature and then apply a 1×1 convolution to obtain a 128-channel feature?

opened by FeboReigns 2
