Cross-view Transformers for real-time Map-view Semantic Segmentation (CVPR 2022 Oral)

Overview

(Figure: Cross View Transformers)


This repository contains the source code and data for our paper:

Cross-view Transformers for real-time Map-view Semantic Segmentation
Brady Zhou, Philipp Krähenbühl
CVPR 2022

Demos


Map-view Segmentation: The model uses multi-view images to produce a map-view segmentation at 45 FPS

Map Making: With vehicle pose, we can construct a map by fusing model predictions over time

Cross-view Attention: For a given map-view location, we show which image patches are being attended to

Installation

# Clone repo
git clone https://github.com/bradyz/cross_view_transformers.git

cd cross_view_transformers

# Setup conda environment
conda create -y --name cvt python=3.8

conda activate cvt
conda install -y pytorch torchvision cudatoolkit=11.3 -c pytorch

# Install dependencies
pip install -r requirements.txt
pip install -e .

Data


Documentation:


Download the original datasets and our generated map-view labels.

Dataset                                       Labels
nuScenes keyframes + map expansion (60 GB)    cvt_labels_nuscenes.tar.gz (361 MB)
Argoverse 1.1 3D tracking                     coming soon™

The structure of the extracted data should look like the following

/datasets/
├─ nuscenes/
│  ├─ v1.0-trainval/
│  ├─ v1.0-mini/
│  ├─ samples/
│  ├─ sweeps/
│  └─ maps/
│     ├─ basemap/
│     └─ expansion/
└─ cvt_labels_nuscenes/
   ├─ scene-0001/
   ├─ scene-0001.json
   ├─ ...
   ├─ scene-1000/
   └─ scene-1000.json

When everything is set up correctly, check out the dataset with

python3 scripts/view_data.py \
  data=nuscenes \
  data.dataset_dir=/media/datasets/nuscenes \
  data.labels_dir=/media/datasets/cvt_labels_nuscenes \
  data.version=v1.0-mini \
  visualization=nuscenes_viz \
  +split=val

Training


An average job of 50k training iterations takes ~8 hours.
Our models were trained with 4-GPU jobs, but they can also be trained on a single GPU.

To train a model,

python3 scripts/train.py \
  +experiment=cvt_nuscenes_vehicle \
  data.dataset_dir=/media/datasets/nuscenes \
  data.labels_dir=/media/datasets/cvt_labels_nuscenes

For more information, see

  • config/config.yaml - base config
  • config/model/cvt.yaml - model architecture
  • config/experiment/cvt_nuscenes_vehicle.yaml - additional overrides
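
For quick inspection, the merged configuration can also be composed in Python with Hydra's compose API. The snippet below is only a sketch; the relative config_path is an assumption about where config/ sits with respect to the calling file, and the override name mirrors the experiment file listed above.

# Sketch only: compose the base config with the experiment override and print
# the model section. Adjust config_path as needed for your working directory.
from hydra import initialize, compose
from omegaconf import OmegaConf

with initialize(config_path='config'):
    cfg = compose(config_name='config', overrides=['+experiment=cvt_nuscenes_vehicle'])

print(OmegaConf.to_yaml(cfg.model))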

Additional Information

Awesome Related Repos

License

This project is released under the MIT license

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2022cross,
    title={Cross-view Transformers for real-time Map-view Semantic Segmentation},
    author={Zhou, Brady and Kr{\"a}henb{\"u}hl, Philipp},
    booktitle={CVPR},
    year={2022}
}
Comments
  • Reproducing results of paper

    Reproducing results of paper

    Hello, many thanks for sharing the code for this awesome work!

    I am trying to reproduce your results, but the config file cvt_nuscenes_vehicle.yaml differs from what is described in the paper and from the training/evaluation setup of Lift Splat Shoot.

    In particular:

    1. The use of the Center Loss instead of the Focal Loss
    2. You use a learning rate of 4e-3 instead of 1e-2
    3. You use the visibility token from the nuScenes annotations to filter out objects with a visibility level strictly less than 2
    4. You use label_indices: [[4, 5, 6, 7, 8, 10, 11]] (7 classes), whereas the DYNAMIC class list contains 8 classes

    Do you know how these factors influence your results?

    Can you share the exact config you used to get the results in Table 1 of your paper?
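
    For reference on point 4, a label_indices entry is presumably a grouping of label channels into one output channel; a minimal sketch of that interpretation (not the repository's exact implementation) is:

      # Hedged sketch: collapse the channels listed in label_indices into a single
      # binary mask via a max over channels; illustration only.
      import torch

      bev = torch.randint(0, 2, (4, 12, 200, 200)).float()   # 12-channel map-view labels
      label_indices = [[4, 5, 6, 7, 8, 10, 11]]

      merged = torch.stack([bev[:, idx].max(dim=1).values for idx in label_indices], dim=1)
      print(merged.shape)                                     # torch.Size([4, 1, 200, 200])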

    opened by F-Barto 4
  • Error running training

    Error running training

    Hi, when I try to run the following command to train, an error is thrown.

    python scripts/train.py   data=nuscenes +experiment=cvt_nuscenes_vehicle   data.dataset_dir=data/nuscenes   data.labels_dir=data/cvt_labels_nuscenes   visualization=nuscenes_viz
    

    Error:

    /home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:486: PossibleUserWarning: Your `val_dataloader`'s sampler has shuffling enabled, it is strongly recommended that you turn shuffling off for val/test/predict dataloaders.
      rank_zero_warn(
    Epoch 0:   0%|                                                                                                                                                                                         | 0/8538 [00:00<?, ?it/s]Error executing job with overrides: ['data=nuscenes', '+experiment=cvt_nuscenes_vehicle', 'data.dataset_dir=/home/runshengxu/project/data/nuscenes', 'data.labels_dir=/home/runshengxu/project/data/cvt_labels_nuscenes', 'visualization=nuscenes_viz']
    Traceback (most recent call last):
      File "scripts/train.py", line 71, in main
        trainer.fit(model_module, datamodule=data_module, ckpt_path=ckpt_path)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
        self._call_and_handle_interrupt(
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt
        return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
        return function(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
        results = self._run(model, ckpt_path=self.ckpt_path)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run
        results = self._run_stage()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1324, in _run_stage
        return self._run_train()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1354, in _run_train
        self.fit_loop.run()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 269, in advance
        self._outputs = self.epoch_loop.run(self._data_fetcher)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 208, in advance
        batch_output = self.batch_loop.run(batch, batch_idx)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
        outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 203, in advance
        result = self._run_optimization(
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 256, in _run_optimization
        self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 369, in _optimizer_step
        self.trainer._call_lightning_module_hook(
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1596, in _call_lightning_module_hook
        output = fn(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py", line 1625, in optimizer_step
        optimizer.step(closure=optimizer_closure)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 168, in step
        step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 278, in optimizer_step
        optimizer_output = super().optimizer_step(optimizer, opt_idx, closure, model, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 193, in optimizer_step
        return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 155, in optimizer_step
        return optimizer.step(closure=closure, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
        return wrapped(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/optim/optimizer.py", line 88, in wrapper
        return func(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/optim/adamw.py", line 100, in step
        loss = closure()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 140, in _wrap_closure
        closure_result = closure()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 148, in __call__
        self._result = self.closure(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 134, in closure
        step_output = self._step_fn()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 427, in _training_step
        training_step_output = self.trainer._call_strategy_hook("training_step", *step_kwargs.values())
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1766, in _call_strategy_hook
        output = fn(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 344, in training_step
        return self.model(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/overrides/base.py", line 82, in forward
        output = self.module.training_step(*inputs, **kwargs)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/model/model_module.py", line 41, in training_step
        return self.shared_step(batch, 'train', True,
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/model/model_module.py", line 25, in shared_step
        loss, loss_details = self.loss_func(pred, batch)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 113, in forward
        outputs = {k: v(pred, batch) for k, v in self.items()}
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 113, in <dictcomp>
        outputs = {k: v(pred, batch) for k, v in self.items()}
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 50, in forward
        loss = super().forward(pred, label)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 24, in forward
        return sigmoid_focal_loss(pred, label, self.alpha, self.gamma, self.reduction)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 34, in sigmoid_focal_loss
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/functional.py", line 3130, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
    ValueError: Target size (torch.Size([4, 12, 200, 200])) must be the same as input size (torch.Size([4, 1, 200, 200]))
    

    Did I input the wrong command? I didn't change config.yaml and I only have 1 GPU.
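
    For context, the ValueError comes from F.binary_cross_entropy_with_logits, which requires the prediction and target tensors to have identical shapes; a minimal sketch (with a purely illustrative channel selection) follows.

      import torch
      import torch.nn.functional as F

      pred = torch.randn(4, 1, 200, 200)                       # single-channel prediction
      label = torch.randint(0, 2, (4, 12, 200, 200)).float()   # 12-channel map-view label

      try:
          F.binary_cross_entropy_with_logits(pred, label, reduction='none')
      except ValueError as err:
          print(err)  # Target size (...) must be the same as input size (...)

      # Reducing the label to the channel(s) the model actually predicts (as a
      # label_indices-style selection would) makes the shapes agree; channel 4 is
      # only illustrative here.
      loss = F.binary_cross_entropy_with_logits(pred, label[:, [4]], reduction='none').mean()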

    opened by DerrickXuNu 3
  • Pre-trained model

    Pre-trained model

    Hi Bradyz, many thanks for sharing this amazing work; the idea is elegant. Currently, I am trying to use the code for a 3D object detection task, but it takes a long time to train the model. Would you mind providing a pretrained model to speed up the training process?

    opened by Benzlxs 3
  • Warning: Grad strides do not match bucket view strides & Error: internal database error

    Warning: Grad strides do not match bucket view strides & Error: internal database error

    When I used the source code to train on the whole nuScenes dataset, the model shut down at epoch 6.

    rank_zero_warn( Epoch 6: 44%|█████████████████████████████████████████████████████▎ | 1866/4270 [00:00<?, ?it/s]

    Messages form TERMINAL: [W reducer.cpp:347] Warning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance. grad.sizes() = [2, 64, 1, 1], strides() = [64, 1, 64, 64] bucket_view.sizes() = [2, 64, 1, 1], strides() = [64, 1, 1, 1] (function operator()) [W reducer.cpp:347] Warning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance. grad.sizes() = [2, 64, 1, 1], strides() = [64, 1, 64, 64] bucket_view.sizes() = [2, 64, 1, 1], strides() = [64, 1, 1, 1] (function operator())

    wandb debug logs message: 2022-07-19 21:15:58,526 ERROR MainThread:2484870 [internal_api.py:execute():143] 500 response executing GraphQL. 2022-07-19 21:15:58,527 ERROR MainThread:2484870 [internal_api.py:execute():144] {"error":"internal database error"}

    So how should I modify it so that I can train normally?

    opened by NickHezhuolin 2
  • Camera intrinsic and extrinsic parameters

    Camera intrinsic and extrinsic parameters

    Hello, does model training depend heavily on the camera intrinsic and extrinsic parameters? Will the results be much worse if mine are not accurate?

      I_inv = batch['intrinsics'].inverse()           # b n 3 3
      E_inv = batch['extrinsics'].inverse()           # b n 4 4
    

    Thanks a lot!

    opened by duohaoxue 2
  • Scripts to generate the labels

    Scripts to generate the labels

    Hi Brady,

    Thanks for your great work and clean coding style! I am wondering whether you still have the scripts used to generate cvt_labels_nuscenes? If so, would you be willing to share them? Looking forward to your reply.

    opened by DerrickXuNu 2
  • A tensor size mismatch bug when training

    A tensor size mismatch bug when training

    Hi, thanks for your outstanding work! When I was training this model with the default configuration, a size mismatch error came up. The error is mainly related to the class "SigmoidFocalLoss" and happens right at the start of epoch 0. Is this a bug, or did I do something wrong?

    opened by zjuliangxun 1
  • Model parameter count

    Model parameter count

    Hi Brady, thanks for the great work!

    I have a question regarding the model parameters. In your paper, you report 5M in Table 1 on nuScenes. However, I ran your code with the default yaml config, and the log says 1.1M parameters. Did I miss anything?

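    A generic way to count parameters, independent of this repo, is shown below; differences of this kind often come down to whether a (possibly frozen or excluded) backbone is included in the reported count.

      import torch.nn as nn

      def count_parameters(model: nn.Module):
          """Return (total, trainable) parameter counts."""
          total = sum(p.numel() for p in model.parameters())
          trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
          return total, trainable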

    opened by vztu 1
  • image embedding calculation

    image embedding calculation

    Hey Brady,

    In your encoder.py line 255, you used:

    img_embed = d_embed - c_embed    
    

    To my understanding, here you want to subtract the camera translation information from the image coordinate embedding. However, I think the translation information is already included in the image coordinate embedding:

      pixel_flat = rearrange(pixel, '... h w -> ... (h w)')                   # 1 1 3 (h w)
      cam = I_inv @ pixel_flat                                                # b n 3 (h w)
      cam = F.pad(cam, (0, 0, 0, 1, 0, 0, 0, 0), value=1)                     # b n 4 (h w)
      d = E_inv @ cam                                                         # b n 4 (h w)
      d_flat = rearrange(d, 'b n d (h w) -> (b n) d h w', h=h, w=w)           # (b n) 4 h w
      d_embed = self.img_embed(d_flat)   
    

    where E_inv already contains the translation. So wouldn't the subtraction of c_embed be redundant?
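
    A small numeric sketch of the geometry behind this question (independent of the repository's learned embeddings): E_inv @ cam produces a vector that contains the camera translation, and subtracting the camera center recovers a rotation-only ray direction.

      import math
      import torch

      # Hypothetical camera pose in the ego frame: rotation R and translation t.
      angle = 0.3
      R = torch.tensor([[math.cos(angle), -math.sin(angle), 0.0],
                        [math.sin(angle),  math.cos(angle), 0.0],
                        [0.0, 0.0, 1.0]])
      t = torch.tensor([1.5, 0.2, 1.6])

      E_inv = torch.eye(4)
      E_inv[:3, :3] = R
      E_inv[:3, 3] = t

      cam = torch.tensor([0.1, -0.2, 1.0, 1.0])      # homogeneous ray from I_inv @ pixel, padded with 1

      d = E_inv @ cam                                 # includes the translation t
      direction = d[:3] - t                           # subtracting the camera center leaves a pure direction
      print(torch.allclose(direction, R @ cam[:3]))   # True: rotation-only ray direction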

    opened by DerrickXuNu 1
  • camera intrinsics calculation

    camera intrinsics calculation

    Hey Brady, thanks for the great work and the open-sourced code!

    I found the calculation of the rescaled intrinsics a bit weird here:

    https://github.com/bradyz/cross_view_transformers/blob/8d1d688711c3d7a85004f86da3f0874c38619489/cross_view_transformer/data/transforms.py#L135

    Shouldn't the scaling factor be w_resize and h_resize instead of w and h?
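
    For reference, the standard way intrinsics change under an image resize (independent of this repository) is sketched below: the focal lengths and principal point scale by the resize ratios.

      import numpy as np

      def rescale_intrinsics(K, w, h, w_resize, h_resize):
          """Return a copy of the 3x3 intrinsic matrix adjusted for a resize from (w, h) to (w_resize, h_resize)."""
          K = np.asarray(K, dtype=float).copy()
          K[0, 0] *= w_resize / w   # fx
          K[0, 2] *= w_resize / w   # cx
          K[1, 1] *= h_resize / h   # fy
          K[1, 2] *= h_resize / h   # cy
          return K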

    opened by melights 1
  • Question about cross attention module

    Question about cross attention module

    Thanks for sharing the great work! Have you considered deformable attention? I believe in the paper you compare queries at each map location to keys at each pixel across all six perspective views, right?

    opened by jianingwangind 1
  • Val IoU not good after training

    Val IoU not good after training

    Hey there, thank you for your great contribution! Unfortunately I am running into some issues when I train the model using the nuScenes vehicle experiment config.

    The IoU metrics are as follows:

      Train/Val    IoU@… (1)    IoU@… (2)
      Train        0.4305       0.3868
      Val          0.09296      0.03948

    So, you can see that there are some issues during validation.

    Interestingly, this can also be confirmed visually: during training the model's outputs get better and better. During validation, however, the predictions improve, then there is a period where the model only predicts a black screen, and then they start getting better again.
    Below are some screenshots from the W&B log.

    Val metrics plot

    Val output at step 247

    Val output at step 329 (notice: all black)

    Val output at step 663

    Would really appreciate any feedback you might have regarding this issue. Thanks.

    opened by ArminBaz 0
  • Question about camera extrinsic

    Question about camera extrinsic

    When extracting the camera extrinsics, why do you calculate "egocam_from_world @ world_from_egolidarflat"? Is it to synchronize with the lidar timestamp and to eliminate the z-axis?

    https://github.com/bradyz/cross_view_transformers/blob/4de6e641397ef1ffde996d7549f7f988e49156f7/cross_view_transformer/data/nuscenes_dataset.py#L174
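
    For what it's worth, the composition itself just cancels the world frame: egocam_from_world @ world_from_egolidarflat = egocam_from_egolidarflat, i.e. presumably the flattened lidar-ego pose expressed in the ego frame at the camera timestamp. A small sketch with hypothetical poses:

      import numpy as np

      def make_pose(yaw, tx, ty, tz):
          """Hypothetical 4x4 rigid transform, used only for illustration."""
          c, s = np.cos(yaw), np.sin(yaw)
          T = np.eye(4)
          T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
          T[:3, 3] = [tx, ty, tz]
          return T

      world_from_egolidarflat = make_pose(0.80, 100.0, 50.0, 0.0)   # lidar-ego pose, z flattened
      world_from_egocam = make_pose(0.81, 100.2, 50.1, 0.0)         # ego pose at the camera timestamp
      egocam_from_world = np.linalg.inv(world_from_egocam)

      # Chaining cancels the world frame, leaving the lidar-ego pose in the camera-ego frame.
      egocam_from_egolidarflat = egocam_from_world @ world_from_egolidarflat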

    opened by yangxh11 0
  • About test dataset

    About test dataset

    Hi, I'm training this model on the nuScenes dataset.

    When I train the cross-view transformer network using this code, the model's best IoU for the road class reaches 71%,

    but in the "Cross-view Transformers for real-time Map-view Semantic Segmentation" paper the best road IoU is 74.3%.

    Is the reason my experimental results differ that the dataset used for scoring in the paper is not the validation dataset? Or is it simply that the hyperparameters needed to get the highest score are different?

    opened by yelin2 1
  • Question about camera extrinsics in batch

    Question about camera extrinsics in batch

    Dear Professor,

    Thank you very much for taking the time to read this.
    I have a question: in your "Cross-view Transformers for real-time Map-view Semantic Segmentation" paper, when encoding the image information, the rotation and translation of the camera extrinsics are relative to the ego-vehicle coordinate system, but the code also uses the transform from the ego-vehicle coordinates to the world coordinate system. So I am not sure what exactly the extrinsics that go into the encoder's batch represent.
    I hope you can answer this question in your free time, thank you very much!
    
    opened by Fengshihao1 0
  • Error help

    Error help

    I train the model on Windows, but there are problems with NCCL:

    Traceback (most recent call last):
      File "scripts/train.py", line 74, in main
        trainer.fit(model_module, datamodule=data_module, ckpt_path=ckpt_path)
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 770, in fit
        self._call_and_handle_interrupt(
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 721, in _call_and_handle_interrupt
        return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\strategies\launchers\subprocess_script.py", line 93, in launch
        return function(*args, **kwargs)
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 811, in _fit_impl
        results = self._run(model, ckpt_path=self.ckpt_path)
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1171, in _run
        self.strategy.setup_environment()
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 152, in setup_environment
        self.setup_distributed()
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 205, in setup_distributed
        init_dist_connection(self.cluster_environment, self._process_group_backend)
      File "D:\anaconda3\lib\site-packages\pytorch_lightning\utilities\distributed.py", line 355, in init_dist_connection
        torch.distributed.init_process_group(torch_distributed_backend, rank=global_rank, world_size=world_size, **kwargs)
      File "D:\anaconda3\lib\site-packages\torch\distributed\distributed_c10d.py", line 537, in init_process_group
        default_pg = _new_process_group_helper(
      File "D:\anaconda3\lib\site-packages\torch\distributed\distributed_c10d.py", line 639, in _new_process_group_helper
        raise RuntimeError("Distributed package doesn't have NCCL "
    RuntimeError: Distributed package doesn't have NCCL built in

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

    I know that running on Windows does not support NCCL. So how should I modify the code so that I can train normally?
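
    For reference, a hedged sketch of the usual workarounds (not specific to this repo): on a single device no distributed backend is needed at all, and for multi-GPU on Windows, PyTorch Lightning's DDPStrategy accepts process_group_backend='gloo'. Wiring this into the Hydra config is left out here.

      import pytorch_lightning as pl
      from pytorch_lightning.strategies import DDPStrategy

      # Single device: no NCCL (or any process group) is required.
      trainer = pl.Trainer(accelerator='auto', devices=1)

      # Multi-GPU on Windows: gloo is the distributed backend that ships with PyTorch there.
      # trainer = pl.Trainer(accelerator='gpu', devices=2,
      #                      strategy=DDPStrategy(process_group_backend='gloo'))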

    opened by lzm2275965881 1