PyTorch Implementation of Sparse DETR

Comments
  • Questions on DAM creation

    Hi! Thank you for releasing such wonderful work. How the DAM is generated was a bit unclear to me when reading the paper. Assume there are N encoder tokens in total (for a single feature level, N = H x W) and M object queries:

    1. Regarding "In the case of the dense attention, DAM can be easily obtained by summing up attention maps from every decoder layer", do you mean the cross-attention maps of shape N x M?
    2. Regarding "produces a single map of the same size as the feature map from the backbone", how is this achieved? Could you walk through the calculation and the shapes of the tensors involved?
    3. Why not use the DAM directly to select the top-k tokens, rather than training a separate scoring network?

    Thanks! I look forward to your reply.
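
    For context, here is a minimal sketch (an illustration of the paper's description, not the authors' code) of how a dense-attention DAM could be aggregated: the M x N cross-attention map from each decoder layer is summed over layers and over queries, giving one saliency value per encoder token, which reshapes to H x W and can be binarized to the top-rho fraction of tokens.

    import torch

    # Assumed shapes for illustration: L decoder layers, M queries, N = H*W tokens.
    L, M, H, W = 6, 300, 32, 32
    N = H * W

    # cross_attn[l] stands in for the layer-l decoder cross-attention map (M, N),
    # each row a distribution over the N encoder tokens.
    cross_attn = torch.softmax(torch.randn(L, M, N), dim=-1)

    # Sum over decoder layers and object queries: one saliency value per token.
    dam = cross_attn.sum(dim=(0, 1))   # (N,)
    dam_map = dam.view(H, W)           # same spatial size as the feature map

    # Keep the top-rho fraction of tokens as a binary supervision target.
    rho = 0.1
    dam_binary = torch.zeros(N)
    dam_binary[dam.topk(int(rho * N)).indices] = 1.0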

    opened by DianCh 6
  • matrix contains invalid numeric entries?

    When I try to test with the pre-trained model, the error "matrix contains invalid numeric entries" is reported because the backbone output is NaN. Is there any solution?
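
    A generic first debugging step (a suggestion, not an official fix) is to locate the first module that produces NaNs, e.g. with forward hooks:

    import torch

    # Hypothetical helper: raise at the first module whose forward output
    # contains NaNs, so the failure point becomes visible.
    def add_nan_hooks(model):
        def hook(module, inputs, output):
            outs = output if isinstance(output, (tuple, list)) else (output,)
            for o in outs:
                if torch.is_tensor(o) and torch.isnan(o).any():
                    raise RuntimeError(f"NaN in output of {module.__class__.__name__}")
        for m in model.modules():
            m.register_forward_hook(hook)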

    opened by ddwhzh 4
  • RuntimeError: 0INTERNAL ASSERT FAILED at

    RuntimeError: 0INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1639180594101/work/torch/csrc/jit/ir/alias_analysis.cpp":584, please report a bug to PyTorch. We don't have an op for aten::fill_ but it isn't a special case. Argument types: Tensor, bool,

    opened by cxq1 3
  • Scoring Network

    What does the structure of the scoring network consist of? The paper only says that the scoring network is the same as the detection head of the decoder; is it identical in every respect? Thank you for your help!
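
    For illustration only, here is a minimal sketch of what a per-token scoring head could look like, assuming a small MLP that maps each encoder token embedding to a scalar saliency logit (the layer sizes are assumptions, not the repo's actual definition):

    import torch
    from torch import nn

    class ScoringNetwork(nn.Module):
        """Hypothetical per-token scorer: (batch, N, d_model) -> (batch, N) logits."""
        def __init__(self, d_model=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(d_model, d_model),
                nn.ReLU(inplace=True),
                nn.Linear(d_model, 1),
            )

        def forward(self, tokens):
            return self.net(tokens).squeeze(-1)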

    opened by Zhuyuan-yuan 0
  • Request for checkpoint file of swin-t detr

    Hello, nice work! In your repo, I see that the AP of Swin-T DETR is reported. Could you kindly offer the checkpoint of the 500-epoch Swin-T DETR with an AP of 45.4? Looking forward to your reply! Cordially, Sean

    opened by GSeanCDAT 0
  • Version check (torchvision) >= 0.10 error fix

    TL;DR: Fixed the torchvision version check in util/misc.py to be compatible with torchvision >= 0.10.0.

    Since I was interested in your paper, Sparse DETR, I was trying to run (train/test) the model in a Google Colab environment.

    During the initial run, I found that the torchvision version check does not work as intended:

    • My torchvision version was 0.10.1, but the check treated it as 0.1.

    Solved by replacing

    float(torchvision.__version__[:3]) < 0.5
    

    with

    float(torchvision.__version__.split('.')[1]) < 5
    

    The < 0.7 check was fixed in the same way.
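
    As a side note, a still more robust variant (my suggestion, not part of this PR) compares parsed version tuples instead of floats, which also survives build suffixes such as 0.10.1+cu111:

    import torchvision

    # Suggested alternative (not in this PR): compare (major, minor) tuples,
    # which handles 0.5.0, 0.10.1, and suffixed builds like "0.10.1+cu111" alike.
    version = tuple(int(p) for p in torchvision.__version__.split('+')[0].split('.')[:2])
    if version < (0, 5):
        ...  # legacy code path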

    Therefore, I am opening a pull request to fix this issue.

    Thank you. Taewoo Jeong.

    opened by EherSenaw 0
  • num_classes during model creation

    If my understanding is correct, when you do:

    self.class_embed = nn.Linear(hidden_dim, num_classes)

    in DeformableDETR() in deformable_detr.py, "num_classes" should instead be "num_classes + 1".

    The same thing goes for:

    self.class_embed.bias.data = torch.ones(num_classes) * bias_value

    in the same DeformableDETR() function, where it should be "num_classes + 1" instead of "num_classes".

    Otherwise I'm just getting confused somewhere, but I'm pretty sure there should be an extra background-class logit.

    That is how it was done in the original DETR code, anyway.
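
    For context (my reading of the code, not an official reply): Deformable DETR, which Sparse DETR builds on, replaces DETR's softmax classifier, which needs an explicit background logit and hence num_classes + 1, with per-class sigmoid focal loss, where "background" simply means every logit is low, so no extra logit is required. The bias initialization then follows the usual focal-loss prior:

    import math
    import torch
    from torch import nn

    num_classes, hidden_dim = 91, 256  # COCO-style values, assumed for illustration

    # Sigmoid/focal-loss classifier: one logit per class, no background logit.
    class_embed = nn.Linear(hidden_dim, num_classes)

    # Focal-loss prior: initialize the bias so that sigmoid(bias) == prior_prob,
    # i.e. every class starts with probability ~0.01.
    prior_prob = 0.01
    bias_value = -math.log((1 - prior_prob) / prior_prob)
    class_embed.bias.data = torch.ones(num_classes) * bias_value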

    opened by SupremeLobster 0
Owner
Kakao Brain (Kakao Brain Corp.)