This repo holds the code for the ICCV21 paper: Visual Alignment Constraint for Continuous Sign Language Recognition.

Yuecong Min

Last update: Dec 19, 2022

Overview

VAC_CSLR

This repo holds the code for the paper: Visual Alignment Constraint for Continuous Sign Language Recognition (ICCV 2021). [paper]

Prerequisites

This project is implemented in PyTorch (>1.8), so please install PyTorch first.
ctcdecode==0.4 [parlance/ctcdecode], for beam search decoding.
[Optional] sclite [kaldi-asr/kaldi]: install the Kaldi tools to get sclite for evaluation. After installation, create a soft link to sclite:
ln -s PATH_TO_KALDI/tools/sctk-2.4.10/bin/sclite ./software/sclite
We also provide a Python evaluation tool for convenience, but sclite provides more detailed statistics.
[Optional] SeanNaren/warp-ctc: at the beginning of this research we adopted warp-ctc for supervision, and we recently found that the PyTorch CTC implementation reaches similar results.
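Before running anything, it can help to confirm which of the packages above are importable. The snippet below is a minimal sketch and is not part of the repository. The helper name `check_prerequisites` is hypothetical, and the import name `warpctc_pytorch` for the optional warp-ctc binding is an assumption.

```python
import importlib.util


def check_prerequisites(modules=("torch", "ctcdecode", "warpctc_pytorch")):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    # find_spec returns None when a top-level module is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A missing `warpctc_pytorch` is harmless if you rely on the PyTorch CTC loss, since that dependency is optional.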

Data Preparation

Download the RWTH-PHOENIX-Weather 2014 Dataset [download link]. Our experiments are based on phoenix-2014.v3.tar.gz.
After the download finishes, extract the dataset to ./dataset/phoenix. We suggest making a soft link to the downloaded dataset:
ln -s PATH_TO_DATASET/phoenix2014-release ./dataset/phoenix2014
The original image sequences are 210x260; we resize them to 256x256 for augmentation. Run the following commands to generate the gloss dict and resize the image sequences:
```
cd ./preprocess
python data_preprocess.py --process-image --multiprocessing
```
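To give a rough idea of what a "gloss dict" is, the sketch below builds one from a list of annotation strings. It is an illustration only: the function name `build_gloss_dict`, the `[index, count]` value layout, and the choice to start indices at 1 (leaving 0 for the CTC blank) are assumptions, and the actual `data_preprocess.py` may differ.

```python
from collections import Counter


def build_gloss_dict(annotations):
    """Map each gloss to [index, count]; index 0 is left free for the CTC blank."""
    counts = Counter(g for sentence in annotations for g in sentence.split())
    # Sort glosses so the index assignment is deterministic across runs.
    return {g: [i + 1, counts[g]] for i, g in enumerate(sorted(counts))}


example = ["MORGEN REGEN", "MORGEN SONNE"]
print(build_gloss_dict(example))
# {'MORGEN': [1, 2], 'REGEN': [2, 1], 'SONNE': [3, 1]}
```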

Inference

We provide pretrained models for inference. You can download them here:

| Backbone | WER on Dev | WER on Test | Pretrained model |
| --- | --- | --- | --- |
| ResNet18 | 21.2% | 22.3% | [Baidu] (passwd: qi83) [Dropbox] |

To evaluate the pretrained model, run the command below:
python main.py --load-weights resnet18_slr_pretrained.pt --phase test
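For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between hypothesis and reference, divided by the reference length. The sketch below is a plain textbook implementation, not the repository's evaluation tool, and the function name `word_error_rate` is hypothetical. sclite and the bundled Python tool may apply dataset-specific cleanup, so their numbers can differ from this.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or free match
            ))
        prev = cur
    return prev[-1] / len(ref)


print(word_error_rate("A B C D", "A X C"))  # 0.5 (one substitution, one deletion)
```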

Training

Configuration values are resolved in this order of priority: command line > config file > argparse defaults. To train the SLR model on phoenix14, run the command below:

python main.py --work-dir PATH_TO_SAVE_RESULTS --config PATH_TO_CONFIG_FILE --device AVAILABLE_GPUS
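The priority order described above (command line, then config file, then argparse defaults) can be obtained with a two-pass parse. The following is a minimal sketch under stated assumptions, not the code in `main.py`: the option names are illustrative, and a JSON file stands in for the config file so the example needs only the standard library.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", default=None)
    parser.add_argument("--batch-size", type=int, default=2)
    parser.add_argument("--device", default="0")

    # First pass: only used to learn which config file was requested.
    known, _ = parser.parse_known_args(argv)
    if known.config is not None:
        with open(known.config) as f:
            file_values = json.load(f)
        # Values from the config file replace the argparse defaults.
        parser.set_defaults(**file_values)
    # Second pass: flags given explicitly on the command line override both.
    return parser.parse_args(argv)


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "demo_config.json")
    with open(cfg_path, "w") as f:
        json.dump({"batch_size": 1, "device": "0,1"}, f)
    args = parse_args(["--config", cfg_path, "--device", "3"])

print(args.batch_size, args.device)  # 1 3
```

Here `batch_size` comes from the config file, while `device` comes from the command line even though the file also sets it.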

Feature Extraction

We also provide a feature extraction function that extracts frame-wise features for other research purposes:

python main.py --load-weights PATH_TO_PRETRAINED_MODEL --phase features

To Do List

Pure python implemented evaluation tools.
WAR and WER calculation scripts.

Citation

If you find this repo useful in your research works, please consider citing:

@InProceedings{Min_2021_ICCV,
    author    = {Min, Yuecong and Hao, Aiming and Chai, Xiujuan and Chen, Xilin},
    title     = {Visual Alignment Constraint for Continuous Sign Language Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {11542-11551}
}

Relevant paper

Self-Mutual Distillation Learning for Continuous Sign Language Recognition [paper]

@InProceedings{Hao_2021_ICCV,
    author    = {Hao, Aiming and Min, Yuecong and Chen, Xilin},
    title     = {Self-Mutual Distillation Learning for Continuous Sign Language Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {11303-11312}
}

Acknowledge

We appreciate the help from Runpeng Cui, Hao Zhou@Rhythmblue and Xinzhe Han@GeraldHan :)

Comments

Final accuracy

I want to confirm that the paper reports 22.1 Dev WER and 23.0 Test WER, while the released pretrained model reaches 21.2 Dev WER and 22.3 Test WER. Is that correct? Thanks in advance for your response!

opened by hulianyuyy 8
Are there plans to supplement the code on the CSL dataset?

Thank you very much for your contribution to the community. In the paper, I saw that experiments were carried out on both the PHOENIX14 dataset and the CSL dataset. I would like to ask if there are plans to supplement the data processing part and the training part of the code on the CSL dataset?

opened by HW140701 6

Error when I try to do the inference

Hello, I'm replicating this model, but when I run the inference command an unknown error appears and I don't know why. My setup is:

RTX 3060ti
16GB RAM
Ryzen 7 5800X

The complete error is:

Traceback (most recent call last):
  File "main.py", line 209, in <module>
    processor.start()
  File "main.py", line 61, in start
    dev_wer = seq_eval(self.arg, self.data_loader["dev"], self.model, self.device,
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/VAC_CSLR/seq_scripts.py", line 56, in seq_eval
    ret_dict = model(vid, vid_lgt, label=label, label_lgt=label_lgt)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/VAC_CSLR/slr_network.py", line 63, in forward
    framewise = self.masked_bn(inputs, len_x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/VAC_CSLR/slr_network.py", line 53, in masked_bn
    x = self.conv2d(x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torchvision/models/resnet.py", line 249, in forward
    return self._forward_impl(x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torchvision/models/resnet.py", line 233, in _forward_impl
    x = self.bn1(x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 135, in forward
    return F.batch_norm(
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/functional.py", line 2149, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA error: unknown error

I have also changed the config file:

-batch_size: 2
+batch_size: 1
-test_batch_size: 8
-num_worker: 10
-device: 0,1,2
+test_batch_size: 1
+num_worker: 1
+device: 0

Also, my torch version is 1.8.1+cu111.

Thank you for the help!

UPDATE

I also found this error:

Traceback (most recent call last):
  File "main.py", line 209, in <module>
    processor.start()
  File "main.py", line 61, in start
    dev_wer = seq_eval(self.arg, self.data_loader["dev"], self.model, self.device,
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/VAC_CSLR/seq_scripts.py", line 56, in seq_eval
    ret_dict = model(vid, vid_lgt, label=label, label_lgt=label_lgt)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/VAC_CSLR/slr_network.py", line 63, in forward
    framewise = self.masked_bn(inputs, len_x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/VAC_CSLR/slr_network.py", line 53, in masked_bn
    x = self.conv2d(x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torchvision/models/resnet.py", line 249, in forward
    return self._forward_impl(x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torchvision/models/resnet.py", line 232, in _forward_impl
    x = self.conv1(x)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/mnt/d/Universidad/Python_Envs/TFG/VAC/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 395, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: CUDA error: unknown error

with the following config:

-batch_size: 2
+batch_size: 1
 random_seed: 0
-test_batch_size: 8
-num_worker: 10
-device: 0,1,2
+test_batch_size: 2
+num_worker: 2
+device: 0

opened by JoseMoFi 6

Issue about alignment between label and frames.

Thanks for your great job. I'm wondering how to draw a picture like Fig.5 in your paper. The key point lies in how to align labels with frames. Could you provide some advice? Thanks in advance!

opened by hulianyuyy 4
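One common way to get a frame-level alignment from a CTC-trained model is best-path (greedy) decoding: take the arg-max class at every frame, merge consecutive repeats, drop blanks, and remember which frames each label covered. The sketch below illustrates that idea on a plain list of per-frame labels. It is not taken from the repository and may not be how Fig. 5 in the paper was produced. The function name `best_path_alignment` and the blank index 0 are assumptions.

```python
def best_path_alignment(frame_labels, blank=0):
    """Collapse per-frame arg-max labels into (label, first_frame, last_frame) spans."""
    spans = []
    prev = blank
    for t, lab in enumerate(frame_labels):
        if lab != blank:
            if lab == prev:
                # Same non-blank label as the previous frame: extend the current span.
                spans[-1] = (lab, spans[-1][1], t)
            else:
                # A new label starts here (after a blank or a different label).
                spans.append((lab, t, t))
        prev = lab
    return spans


frames = [0, 3, 3, 0, 0, 5, 5, 5, 0, 3]
print(best_path_alignment(frames))
# [(3, 1, 2), (5, 5, 7), (3, 9, 9)]
```

The resulting spans can be drawn against ground-truth boundaries, keeping in mind that CTC outputs tend to be spiky, so the spans are often much shorter than the actual signs.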
Time to train

Hello, great work with this paper and repo! I would like to ask how long it took you to train the model (for the dataset Phoenix12) and what kind of GPU you used. I am trying to replicate it with another dataset (specifically Phoenix14-T), and in my first test it took around 14 hours to train 10 epochs. I used a Titan Xp with 12 GB and a batch size of 1.

Thank you again for your work and congratulation for this repo.

opened by JoseMoFi 2
how to solve this error in the training model. I look forward to your answer

Traceback (most recent call last):
  File "main.py", line 211, in <module>
    processor.start()
  File "main.py", line 44, in start
    seq_train(self.data_loader['train'], self.model, self.optimizer,
  File "/home/linux/data2/sun/VAC_CSLR/seq_scripts.py", line 18, in seq_train
    for batch_idx, data in enumerate(tqdm(loader)):
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
    for obj in iterable:
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/linux/anaconda3/envs/ssn/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/linux/data2/sun/VAC_CSLR/dataset/dataloader_video.py", line 47, in __getitem__
    input_data, label = self.normalize(input_data, label)
  File "/home/linux/data2/sun/VAC_CSLR/dataset/dataloader_video.py", line 78, in normalize
    video, label = self.data_aug(video, label, file_id)
  File "/home/linux/data2/sun/VAC_CSLR/utils/video_augmentation.py", line 24, in __call__
    image = t(image)
  File "/home/linux/data2/sun/VAC_CSLR/utils/video_augmentation.py", line 119, in __call__
    if isinstance(clip[0], np.ndarray):
IndexError: list index out of range

opened by sunsn1997 2
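Baseline reproduction results are inconsistent

Hello, I have some questions about the experiment code. In Table 3 of your paper, the baseline result on DEV is 25.4. I tried to reproduce it by removing ConvCTC and Dist from the loss in the code, but I already got WER=24.8% at epoch 40, which differs considerably from the result in Table 3. Is this because I overlooked some other parts that should have been removed?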

log.txt config.txt

opened by miaomiao9miao 2
Weird glosses in the annotation of phoenix dataset

Hi @ycmin95, I recently checked the annotation of the phoenix dataset and the gloss dictionary generated during data preparation. There are many weird glosses, such as "ON", "OFF", "LEFTHAND" ... I wonder whether we should keep these weird glosses in the labels. Any advice?

opened by sunke123 2
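Why can the vocab used to initialize ctcdecode be generated with chr(20000-21296)?

Your work is excellent! According to the ctcdecode documentation, vocab should be initialized with the dictionary to be decoded, so why does the code work with chr(20000+(0~1296))? Is the number 20000 specific? Also, Fig. 5 in your paper shows the alignment between the labels generated by the model, the ground truth, and the video, but with ctcdecode I can only generate labels, which cannot be used for alignment annotation. Does this part require additional code? Looking forward to your reply!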

opened by blankspark 2
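A possible reading of the `chr(20000 + i)` trick is that the decoder only needs one distinct single-character label per class, so any run of unique code points works as placeholder tokens that are mapped back to class indices afterwards. The sketch below shows that round trip in plain Python. It does not call ctcdecode, the helper names are hypothetical, and the claim that 20000 is an arbitrary offset (it falls inside the CJK Unified Ideographs block) is an assumption rather than something the authors confirm here.

```python
def make_vocab(num_classes, offset=20000):
    """Build one unique single-character placeholder token per class index."""
    return [chr(offset + i) for i in range(num_classes)]


def tokens_to_indices(token_string, offset=20000):
    """Map a string of placeholder tokens back to the class indices they stand for."""
    return [ord(ch) - offset for ch in token_string]


vocab = make_vocab(1296)
print(len(set(vocab)))                               # 1296, every token is distinct
print(tokens_to_indices(vocab[7] + vocab[42]))       # [7, 42]
```

The real gloss strings are then recovered by looking the indices up in the gloss dictionary, not from the placeholder characters themselves.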
Pseudo Label

I'm wondering how to assign labels for frames with CTC loss. It seems CTC Loss can be viewed as sequential SoftMax losses. But the key point is how to obtain the pseudo labels for frames via back propagation. Thanks in advance!

opened by hulianyuyy 2
Finetuning and continue training

Hello, thank you for the awesome work. I am trying to use the model on another dataset, so I figure I should structure my data according to the format of phoenix2014. Is there anything else I should worry about, or is running the preprocessing with the same structure going to be alright?

Also, since I am training on Google Colab, I won't be able to train for 80 epochs consecutively and plan to split training into several runs. Is there a built-in function to load the previous model and continue training (or fine-tuning, if I want to fine-tune the pretrained model), or how should I begin to tackle this problem? I am not sure whether the --load-weights flag is enough. Thank you so much.

opened by khoapip 1
Video augmentation methods for Pre-trained model

What are the video augmentation options used in the pre-trained model ([Dropbox]) ? In the code I can see that these are the ones uncommented, is that the case for the pretrained model? dataset/dataloader_video.py

opened by Aayush2007 1
Question about CPU or GPU error

I ran your code and found the following error, where are the parameters put into the GPU?

Traceback (most recent call last):
  File "main.py", line 218, in <module>
    processor.start()
  File "main.py", line 46, in start
    seq_train(self.data_loader['train'], self.model, self.optimizer,self.device, epoch, self.recoder)
  File "/home/quchunguang/sunday/CSLR/seq_scripts.py", line 24, in seq_train
    loss = model.criterion_calculation(ret_dict, label, label_lgt)
  File "/home/quchunguang/sunday/CSLR/slr_network.py", line 96, in criterion_calculation
    label_lgt.cpu().int()).mean()
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1295, in forward
    self.zero_infinity)
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/functional.py", line 1767, in ctc_loss
    zero_infinity)
RuntimeError: Tensor for argument #2 'targets' is on CPU, but expected it to be on GPU (while checking arguments for ctc_loss_gpu)

opened by chunguangqu 5

Owner

Yuecong Min

CS Ph.D. candidate, Computer Vision

