PyTorch Implementation of Sparse DETR

Comments
  • Questions on DAM creation

    Hi! Thank you for releasing such wonderful work. How the DAM is generated was a bit unclear to me when reading the paper. Assume there are N encoder tokens in total (for a single feature level, N = H x W) and M object queries:

    1. Regarding "In the case of the dense attention, DAM can be easily obtained by summing up attention maps from every decoder layer", do you mean the cross-attention maps of shape N x M?
    2. Regarding "produces a single map of the same size as the feature map from the backbone", how is this achieved? Could you walk through the calculation and the shapes of the tensors involved?
    3. Why not use the DAM directly to select the top-k tokens, rather than training a separate scoring network?

    Thanks! I look forward to your reply.
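
    For context, here is a minimal sketch (an illustration of the paper's description, not the authors' code) of how a dense-attention DAM could be aggregated: the M x N cross-attention map from each decoder layer is summed over layers and over queries, giving one saliency value per encoder token, which reshapes to H x W and can be binarized to the top-rho fraction of tokens.

    import torch

    # Assumed shapes for illustration: L decoder layers, M queries, N = H*W tokens.
    L, M, H, W = 6, 300, 32, 32
    N = H * W

    # cross_attn[l] stands in for the layer-l decoder cross-attention map (M, N),
    # each row a distribution over the N encoder tokens.
    cross_attn = torch.softmax(torch.randn(L, M, N), dim=-1)

    # Sum over decoder layers and object queries: one saliency value per token.
    dam = cross_attn.sum(dim=(0, 1))   # (N,)
    dam_map = dam.view(H, W)           # same spatial size as the feature map

    # Keep the top-rho fraction of tokens as a binary supervision target.
    rho = 0.1
    dam_binary = torch.zeros(N)
    dam_binary[dam.topk(int(rho * N)).indices] = 1.0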

    opened by DianCh 6
  • matrix contains invalid numeric entries?

    When I try to test with the pre-trained model, the error "matrix contains invalid numeric entries" is reported because the backbone output is NaN. Is there any solution?
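
    A generic first debugging step (a suggestion, not an official fix) is to locate the first module that produces NaNs, e.g. with forward hooks:

    import torch

    # Hypothetical helper: raise at the first module whose forward output
    # contains NaNs, so the failure point becomes visible.
    def add_nan_hooks(model):
        def hook(module, inputs, output):
            outs = output if isinstance(output, (tuple, list)) else (output,)
            for o in outs:
                if torch.is_tensor(o) and torch.isnan(o).any():
                    raise RuntimeError(f"NaN in output of {module.__class__.__name__}")
        for m in model.modules():
            m.register_forward_hook(hook)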

    opened by ddwhzh 4
  • RuntimeError: 0INTERNAL ASSERT FAILED at

    RuntimeError: 0INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1639180594101/work/torch/csrc/jit/ir/alias_analysis.cpp":584, please report a bug to PyTorch. We don't have an op for aten::fill_ but it isn't a special case. Argument types: Tensor, bool,

    opened by cxq1 3
  • Scoring Network

    What does the structure of the scoring network consist of? The paper only says that the scoring network is the same as the detection head of the decoder; is it identical in every respect? Thank you for your help!
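
    For illustration only, here is a minimal sketch of what a per-token scoring head could look like, assuming a small MLP that maps each encoder token embedding to a scalar saliency logit (the layer sizes are assumptions, not the repo's actual definition):

    import torch
    from torch import nn

    class ScoringNetwork(nn.Module):
        """Hypothetical per-token scorer: (batch, N, d_model) -> (batch, N) logits."""
        def __init__(self, d_model=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(d_model, d_model),
                nn.ReLU(inplace=True),
                nn.Linear(d_model, 1),
            )

        def forward(self, tokens):
            return self.net(tokens).squeeze(-1)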

    opened by Zhuyuan-yuan 0
  • Request for checkpoint file of swin-t detr

    Hello, nice work! In your repo, I see that the AP of Swin-T DETR is reported. Could you kindly offer the checkpoint of the 500-epoch Swin-T DETR with an AP of 45.4? Looking forward to your reply! Cordially, Sean

    opened by GSeanCDAT 0
  • Version check (torchvision) >= 0.10 error fix

    TL;DR: Fixed the torchvision version check in util/misc.py to be compatible with torchvision >= 0.10.0.

    Since I was interested in your paper, Sparse DETR, I was trying to run (train/test) the model in a Google Colab environment.

    During the initial run, I found that the torchvision version check does not work as intended:

    • My torchvision version was 0.10.1, but the check treated it as 0.1.

    Solved by replacing

    float(torchvision.__version__[:3]) < 0.5
    

    with

    float(torchvision.__version__.split('.')[1]) < 5
    

    The < 0.7 check was fixed in the same way.
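
    As a side note, a still more robust variant (my suggestion, not part of this PR) compares parsed version tuples instead of floats, which also survives build suffixes such as 0.10.1+cu111:

    import torchvision

    # Suggested alternative (not in this PR): compare (major, minor) tuples,
    # which handles 0.5.0, 0.10.1, and suffixed builds like "0.10.1+cu111" alike.
    version = tuple(int(p) for p in torchvision.__version__.split('+')[0].split('.')[:2])
    if version < (0, 5):
        ...  # legacy code path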

    Therefore, I am opening a pull request to fix this issue.

    Thank you. Taewoo Jeong.

    opened by EherSenaw 0
  • num_classes during model creation

    If my understanding is correct, when you do:

    self.class_embed = nn.Linear(hidden_dim, num_classes)

    in DeformableDETR() in deformable_detr.py, "num_classes" should instead be "num_classes + 1".

    The same thing goes for:

    self.class_embed.bias.data = torch.ones(num_classes) * bias_value

    in the same DeformableDETR() function, where it should be "num_classes + 1" instead of "num_classes".

    Otherwise I'm just getting confused somewhere, but I'm pretty sure there should be an extra background-class logit.

    That is how it was done in the original DETR code, anyway.
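
    For context (my reading of the code, not an official reply): Deformable DETR, which Sparse DETR builds on, replaces DETR's softmax classifier, which needs an explicit background logit and hence num_classes + 1, with per-class sigmoid focal loss, where "background" simply means every logit is low, so no extra logit is required. The bias initialization then follows the usual focal-loss prior:

    import math
    import torch
    from torch import nn

    num_classes, hidden_dim = 91, 256  # COCO-style values, assumed for illustration

    # Sigmoid/focal-loss classifier: one logit per class, no background logit.
    class_embed = nn.Linear(hidden_dim, num_classes)

    # Focal-loss prior: initialize the bias so that sigmoid(bias) == prior_prob,
    # i.e. every class starts with probability ~0.01.
    prior_prob = 0.01
    bias_value = -math.log((1 - prior_prob) / prior_prob)
    class_embed.bias.data = torch.ones(num_classes) * bias_value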

    opened by SupremeLobster 0
Owner
Kakao Brain (Kakao Brain Corp.)