Cross-view Transformers for real-time Map-view Semantic Segmentation (CVPR 2022 Oral)

Overview

Cross View Transformers


This repository contains the source code and data for our paper:

Cross-view Transformers for real-time Map-view Semantic Segmentation
Brady Zhou, Philipp Krähenbühl
CVPR 2022

Demos


Map-view Segmentation: The model uses multi-view images to produce a map-view segmentation at 45 FPS

Map Making: With vehicle pose, we can construct a map by fusing model predictions over time

Cross-view Attention: For a given map-view location, we show which image patches are being attended to

Installation

# Clone repo
git clone https://github.com/bradyz/cross_view_transformers.git

cd cross_view_transformers

# Setup conda environment
conda create -y --name cvt python=3.8

conda activate cvt
conda install -y pytorch torchvision cudatoolkit=11.3 -c pytorch

# Install dependencies
pip install -r requirements.txt
pip install -e .
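
To verify the environment, a quick check along these lines should run without errors (an illustrative sketch, not part of the repository; exact versions will vary):

# Sanity check for the conda environment set up above.
import torch
import torchvision

print(torch.__version__, torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())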

Data


Documentation:


Download the original datasets and our generated map-view labels.

Dataset                                    | Labels
nuScenes keyframes + map expansion (60 GB) | cvt_labels_nuscenes.tar.gz (361 MB)
Argoverse 1.1 3D tracking                  | coming soon™

The structure of the extracted data should look like the following

/datasets/
├─ nuscenes/
│  ├─ v1.0-trainval/
│  ├─ v1.0-mini/
│  ├─ samples/
│  ├─ sweeps/
│  └─ maps/
│     ├─ basemap/
│     └─ expansion/
└─ cvt_labels_nuscenes/
   ├─ scene-0001/
   ├─ scene-0001.json
   ├─ ...
   ├─ scene-1000/
   └─ scene-1000.json
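
A quick way to confirm the layout is a small path check like the following (an illustrative sketch only; adjust the root to wherever the data actually lives):

# Verify that the extracted data matches the layout above.
from pathlib import Path

root = Path("/datasets")
expected = [
    "nuscenes/v1.0-trainval",
    "nuscenes/samples",
    "nuscenes/sweeps",
    "nuscenes/maps/expansion",
    "cvt_labels_nuscenes",
]
for rel in expected:
    print("ok     " if (root / rel).exists() else "MISSING", root / rel)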

When everything is set up correctly, check out the dataset with

python3 scripts/view_data.py \
  data=nuscenes \
  data.dataset_dir=/media/datasets/nuscenes \
  data.labels_dir=/media/datasets/cvt_labels_nuscenes \
  data.version=v1.0-mini \
  visualization=nuscenes_viz \
  +split=val

Training

             

An average job of 50k training iterations takes ~8 hours.
Our models were trained on 4-GPU jobs, but they can also be trained on a single GPU.

To train a model,

python3 scripts/train.py \
  +experiment=cvt_nuscenes_vehicle \
  data.dataset_dir=/media/datasets/nuscenes \
  data.labels_dir=/media/datasets/cvt_labels_nuscenes

For more information, see

  • config/config.yaml - base config
  • config/model/cvt.yaml - model architecture
  • config/experiment/cvt_nuscenes_vehicle.yaml - additional overrides
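
The merged configuration can also be inspected programmatically with Hydra's compose API; a minimal sketch, assuming it is run from the repository root so the relative config_path resolves to the config/ directory listed above:

# Print the fully merged config without launching a training run.
from hydra import initialize, compose
from omegaconf import OmegaConf

with initialize(config_path="config"):
    cfg = compose(
        config_name="config",
        overrides=[
            "+experiment=cvt_nuscenes_vehicle",
            "data.dataset_dir=/media/datasets/nuscenes",
            "data.labels_dir=/media/datasets/cvt_labels_nuscenes",
        ],
    )

print(OmegaConf.to_yaml(cfg))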

Additional Information

Awesome Related Repos

License

This project is released under the MIT license

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2022cross,
    title={Cross-view Transformers for real-time Map-view Semantic Segmentation},
    author={Zhou, Brady and Kr{\"a}henb{\"u}hl, Philipp},
    booktitle={CVPR},
    year={2022}
}
Comments
  • Reproducing results of paper

    Hello, many thanks for sharing the code of this awesome work!

    I am trying to reproduce your results, but the config file cvt_nuscenes_vehicle.yaml differs from what is described in the paper and from the training/evaluation setup of Lift Splat Shoot.

    In particular:

    1. You use the center loss instead of the focal loss
    2. You use a learning rate of 4E-3 instead of 1E-2
    3. You use the visibility token from the nuScenes annotations to filter out objects with a visibility level strictly less than 2
    4. You use label_indices: [[4, 5, 6, 7, 8, 10, 11]] (7 classes), whereas the DYNAMIC class list contains 8 classes

    Do you know how these factors influence your results?

    Can you share the exact config you used to get the results in Table 1 of your paper?

    opened by F-Barto 4
  • Error running training

    Hi, when I try to run the following command to train, an error is thrown:

    python scripts/train.py   data=nuscenes +experiment=cvt_nuscenes_vehicle   data.dataset_dir=data/nuscenes   data.labels_dir=data/cvt_labels_nuscenes   visualization=nuscenes_viz
    

    Error:

    /home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:486: PossibleUserWarning: Your `val_dataloader`'s sampler has shuffling enabled, it is strongly recommended that you turn shuffling off for val/test/predict dataloaders.
      rank_zero_warn(
    Epoch 0:   0%|                                                                                                                                                                                         | 0/8538 [00:00<?, ?it/s]Error executing job with overrides: ['data=nuscenes', '+experiment=cvt_nuscenes_vehicle', 'data.dataset_dir=/home/runshengxu/project/data/nuscenes', 'data.labels_dir=/home/runshengxu/project/data/cvt_labels_nuscenes', 'visualization=nuscenes_viz']
    Traceback (most recent call last):
      File "scripts/train.py", line 71, in main
        trainer.fit(model_module, datamodule=data_module, ckpt_path=ckpt_path)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
        self._call_and_handle_interrupt(
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt
        return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
        return function(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
        results = self._run(model, ckpt_path=self.ckpt_path)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run
        results = self._run_stage()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1324, in _run_stage
        return self._run_train()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1354, in _run_train
        self.fit_loop.run()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 269, in advance
        self._outputs = self.epoch_loop.run(self._data_fetcher)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 208, in advance
        batch_output = self.batch_loop.run(batch, batch_idx)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
        outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 204, in run
        self.advance(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 203, in advance
        result = self._run_optimization(
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 256, in _run_optimization
        self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 369, in _optimizer_step
        self.trainer._call_lightning_module_hook(
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1596, in _call_lightning_module_hook
        output = fn(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py", line 1625, in optimizer_step
        optimizer.step(closure=optimizer_closure)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 168, in step
        step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 278, in optimizer_step
        optimizer_output = super().optimizer_step(optimizer, opt_idx, closure, model, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 193, in optimizer_step
        return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 155, in optimizer_step
        return optimizer.step(closure=closure, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
        return wrapped(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/optim/optimizer.py", line 88, in wrapper
        return func(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/optim/adamw.py", line 100, in step
        loss = closure()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 140, in _wrap_closure
        closure_result = closure()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 148, in __call__
        self._result = self.closure(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 134, in closure
        step_output = self._step_fn()
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 427, in _training_step
        training_step_output = self.trainer._call_strategy_hook("training_step", *step_kwargs.values())
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1766, in _call_strategy_hook
        output = fn(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 344, in training_step
        return self.model(*args, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 963, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/pytorch_lightning/overrides/base.py", line 82, in forward
        output = self.module.training_step(*inputs, **kwargs)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/model/model_module.py", line 41, in training_step
        return self.shared_step(batch, 'train', True,
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/model/model_module.py", line 25, in shared_step
        loss, loss_details = self.loss_func(pred, batch)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 113, in forward
        outputs = {k: v(pred, batch) for k, v in self.items()}
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 113, in <dictcomp>
        outputs = {k: v(pred, batch) for k, v in self.items()}
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 50, in forward
        loss = super().forward(pred, label)
      File "/home/runshengxu/project/cross_view_transformers/cross_view_transformer/losses.py", line 24, in forward
        return sigmoid_focal_loss(pred, label, self.alpha, self.gamma, self.reduction)
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 34, in sigmoid_focal_loss
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
      File "/home/runshengxu/anaconda3/envs/cvt/lib/python3.8/site-packages/torch/nn/functional.py", line 3130, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
    ValueError: Target size (torch.Size([4, 12, 200, 200])) must be the same as input size (torch.Size([4, 1, 200, 200]))
    

    Did I input the wrong command? I didn't change the config.yaml and I only have 1 GPU.
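
    For context, the ValueError above only says that the label tensor has 12 channels while the model's prediction has 1; a tiny standalone sketch (shapes copied from the traceback, not the repository's code) reproduces the same failure:

    import torch
    import torch.nn.functional as F

    pred = torch.randn(4, 1, 200, 200)    # prediction: a single output channel
    label = torch.rand(4, 12, 200, 200)   # labels: all 12 channels
    # Mismatched channel counts raise "Target size ... must be the same as input size ...".
    F.binary_cross_entropy_with_logits(pred, label)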

    opened by DerrickXuNu 3
  • Pre-trained model

    Hi Bradyz, many thanks for sharing this amazing work; the idea is elegant. Currently, I am trying to use the code for a 3D object detection task, but it takes a long time to train the model. Would you mind providing a pre-trained model to speed up the training process?

    opened by Benzlxs 3
  • Warning: Grad strides do not match bucket view strides & Error: internal database error

    When I trained on the whole nuScenes dataset with the source code, the model shut down at epoch 6.

    rank_zero_warn( Epoch 6: 44%|█████████████████████████████████████████████████████▎ | 1866/4270 [00:00<?, ?it/s]

    Messages form TERMINAL: [W reducer.cpp:347] Warning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance. grad.sizes() = [2, 64, 1, 1], strides() = [64, 1, 64, 64] bucket_view.sizes() = [2, 64, 1, 1], strides() = [64, 1, 1, 1] (function operator()) [W reducer.cpp:347] Warning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance. grad.sizes() = [2, 64, 1, 1], strides() = [64, 1, 64, 64] bucket_view.sizes() = [2, 64, 1, 1], strides() = [64, 1, 1, 1] (function operator())

    wandb debug logs message: 2022-07-19 21:15:58,526 ERROR MainThread:2484870 [internal_api.py:execute():143] 500 response executing GraphQL. 2022-07-19 21:15:58,527 ERROR MainThread:2484870 [internal_api.py:execute():144] {"error":"internal database error"}

    So how should I modify it so that I can train normally?

    opened by NickHezhuolin 2
  • image embedding calculation

    Hey Brady,

    In your encoder.py line 255, you used:

    img_embed = d_embed - c_embed    
    

    To my understanding, here you want to subtract the camera translation information from the image coordinate embedding. However, I think the translation information is already included in the image coordinate embedding:

      pixel_flat = rearrange(pixel, '... h w -> ... (h w)')                   # 1 1 3 (h w)
      cam = I_inv @ pixel_flat                                                # b n 3 (h w)
      cam = F.pad(cam, (0, 0, 0, 1, 0, 0, 0, 0), value=1)                     # b n 4 (h w)
      d = E_inv @ cam                                                         # b n 4 (h w)
      d_flat = rearrange(d, 'b n d (h w) -> (b n) d h w', h=h, w=w)           # (b n) 4 h w
      d_embed = self.img_embed(d_flat)   
    

    where E_inv already contains the translation. So wouldn't the subtraction of c_embed be redundant?
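
    A toy sketch of what the quoted unprojection does with the translation term (made-up values, not the repository's code):

    import torch

    E_inv = torch.eye(4)
    E_inv[:3, 3] = torch.tensor([1.0, 2.0, 0.5])   # hypothetical camera position (translation part)
    cam = torch.tensor([0.3, -0.2, 1.0, 1.0])      # a homogeneous camera-frame ray
    d = E_inv @ cam                                 # rotate the xyz part, then add the translation
    print(d[:3])                                    # the translation shows up as an additive term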

    opened by DerrickXuNu 2
  • camera internal and external parameters

    Hello, model training must depend heavily on the camera intrinsic and extrinsic parameters, right? Will the results be much worse if the parameters I use are not accurate?

      I_inv = batch['intrinsics'].inverse()           # b n 3 3
      E_inv = batch['extrinsics'].inverse()           # b n 4 4
    

    Thanks a lot!

    opened by duohaoxue 2
  • Scripts to generate the labels

    Hi Brady,

    Thanks for your great work and clean coding style! I am wondering whether you still have the scripts used to generate cvt_labels_nuscenes. If so, would you be willing to share them? Looking forward to your reply.

    opened by DerrickXuNu 2
  • A tensor size mismatch bug when training

    Hi, thanks for your outstanding work! When I was training this model with the default configuration, a size mismatch error came up. The error is mainly related to the class "SigmoidFocalLoss" and happens right at the start of epoch 0. Is there a bug, or did I do something wrong? (Screenshot from 2022-06-09 12-42-56 attached.)

    opened by zjuliangxun 1
  • Model parameter count

    Hi Brady, thanks for the great work!

    I have a question regarding the model parameters. In your paper, you report 5M in Table 1 on nuScenes. However, when I run your code with the default yaml config, the log says 1.1M parameters. Did I miss anything?

    opened by vztu 1
  • camera intrinsics calculation

    Hey Brady, thanks for the great work and the open-sourced code!

    I find the calculation of the rescaled intrinsics a bit weird here:

    https://github.com/bradyz/cross_view_transformers/blob/8d1d688711c3d7a85004f86da3f0874c38619489/cross_view_transformer/data/transforms.py#L135

    Shouldn't the scaling factor be w_resize and h_resize instead of w and h?
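
    For reference, under a plain resize from (w, h) to (w_resize, h_resize), pinhole intrinsics are usually rescaled by the per-axis factors; a minimal numpy sketch (illustrative, not the repository's implementation):

    import numpy as np

    def rescale_intrinsics(K, w, h, w_resize, h_resize):
        """Rescale a 3x3 pinhole intrinsics matrix after resizing the image."""
        K = np.array(K, dtype=float)
        K[0, :] *= w_resize / w   # fx and cx scale with the width ratio
        K[1, :] *= h_resize / h   # fy and cy scale with the height ratio
        return K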

    opened by melights 1
  • Question about cross attention module

    Thanks for sharing the great work! Have you considered deformable attention? I believe in the paper you were trying to compare queries at each map location to keys at each pixel across all six perspective views, right?

    opened by jianingwangind 1
  • How to segment road_segment and dynamic objects together?

    Hi, I have run your code, and I see there are only two example configurations: label_indices of [[0, 1]] and [[4, 5, 6, 7, 8, 10, 11]]. It seems the labels [[2, 3]] aren't used. If I want to get road_segment results and the different dynamic-object results at the same time, how should I train?

    opened by Egozjuer 0
  • Image name in sample.json is not present in the downloaded /media/datasets/nuscenes/samples/CAM_FRONT_LEFT folder

    Hi, I have downloaded the files as per the dataset setup instructions.

    while trying to run

    python3 scripts/train.py \
      +experiment=cvt_nuscenes_vehicle
      data.dataset_dir=/media/datasets/nuscenes \
      data.labels_dir=/media/datasets/cvt_labels_nuscenes 
    

    I'm getting the error: FileNotFoundError: [Errno 2] No such file or directory: '/media/datasets/nuscenes/samples/CAM_FRONT_LEFT/n015-2018-11-14-18-57-54+0800__CAM_FRONT_LEFT__1542193516504844.jpg'

    In the sample_data.json file, the file name mentioned is media/datasets/nuscenes/v1.0-trainval/sample_data.json:33122094: "filename": "samples/CAM_FRONT_LEFT/n015-2018-11-14-18-57-54+0800__CAM_FRONT_LEFT__1542193516504844.jpg"

    I tried looking for the correct files as listed in sample_data.json; can anyone help with how to download the images it references?

    opened by sharmasushil 0
  • Resume Training

    Excuse me, my training was interrupted. I modified the resume-training path to point to the .ckpt file from the interrupted run, but the training still cannot continue. How should I solve this?

    opened by clownHHu 0
  • A question about Setting 1

    Thanks for your kindness in sharing your code with us!

    I have a question about Setting 1. According to Table 1 in your paper, PON is evaluated under Setting 1, which refers to 100m x 50m at 25 cm resolution. The resolution of 25 cm per pixel is mentioned in Section 4.2 of PON's paper, but I think the wording of 100m x 50m is not correct. According to PON's code, the pixel resolution of its BEV image is 196x200, so I think Setting 1 should refer to 49m x 50m at 25 cm resolution (196 × 0.25 m = 49 m, 200 × 0.25 m = 50 m). Did I get something wrong?

    opened by Junyu-Z 2