Sign Language Translation with Transformers (COLING'2020, ECCV'20 SLRTP Workshop)

Overview

transformer-slt

This repository gathers data and code supporting the experiments in the paper Better Sign Language Translation with STMC-Transformer.

Installation

This code is based on OpenNMT v1.0.0 and requires all of its dependencies (including torch==1.6.0). The only additional requirement is NLTK, used for the NMT evaluation metrics.

The recommended way to install is shown below:

# create a new virtual environment
virtualenv --python=python3 venv
source venv/bin/activate

# clone the repo
git clone https://github.com/kayoyin/transformer-slt.git
cd transformer-slt

# install python dependencies
pip install -r requirements.txt

# install OpenNMT-py
python setup.py install
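
To sanity-check the installation, the following should run without errors (torch should report the pinned 1.6.0):

# verify the environment
python -c "import torch; print(torch.__version__)"
python -c "import onmt; print('OpenNMT-py OK')"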

Sample Usage

Data processing

onmt_preprocess -train_src data/phoenix2014T.train.gloss -train_tgt data/phoenix2014T.train.de -valid_src data/phoenix2014T.dev.gloss -valid_tgt data/phoenix2014T.dev.de -save_data data/dgs -lower 
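
If preprocessing succeeds, OpenNMT-py v1.0 writes sharded datasets and a vocabulary file under the -save_data prefix, roughly as below (exact shard names may vary). The training step reads them via -data data/dgs, so a missing data/dgs.vocab.pt means this step was skipped or failed:

# expected artifacts after preprocessing (approximate)
ls data/dgs*
# data/dgs.train.0.pt  data/dgs.valid.0.pt  data/dgs.vocab.pt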

Training

python  train.py -data data/dgs -save_model model -keep_checkpoint 1 \
          -layers 2 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8  \
          -encoder_type transformer -decoder_type transformer -position_encoding \
          -max_generator_batches 2 -dropout 0.1 \
          -early_stopping 3 -early_stopping_criteria accuracy ppl \
          -batch_size 2048 -accum_count 3 -batch_type tokens -normalization tokens \
          -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 3000 -learning_rate 0.5 \
          -max_grad_norm 0 -param_init 0  -param_init_glorot \
          -label_smoothing 0.1 -valid_steps 100 -save_checkpoint_steps 100 \
          -world_size 1 -gpu_ranks 0
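
With -save_model model and -save_checkpoint_steps 100, checkpoints are written with the step number in the name:

# checkpoints produced during training (step numbers depend on your run)
ls model_step_*.pt
# model_step_100.pt  model_step_200.pt  ...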

Inference

python translate.py -model model [model2 model3 ...] -src data/phoenix2014T.test.gloss -output pred.txt -gpu 0 -replace_unk -beam_size 4
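
Here [model2 model3 ...] is an optional placeholder for ensembling several checkpoints, not a literal argument. A minimal single-checkpoint example (the step number is illustrative; use one produced by your own training run):

python translate.py -model model_step_100.pt -src data/phoenix2014T.test.gloss -output pred.txt -gpu 0 -replace_unk -beam_size 4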

Scoring

# BLEU-1,2,3,4
python tools/bleu.py 1 pred.txt data/phoenix2014T.test.de
python tools/bleu.py 2 pred.txt data/phoenix2014T.test.de
python tools/bleu.py 3 pred.txt data/phoenix2014T.test.de
python tools/bleu.py 4 pred.txt data/phoenix2014T.test.de

# ROUGE
python tools/rouge.py pred.txt data/phoenix2014T.test.de

# METEOR
python tools/meteor.py pred.txt data/phoenix2014T.test.de
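
As a cross-check, predictions can also be scored directly with NLTK (already a dependency). A minimal sketch, assuming one whitespace-tokenized sentence per line in both files; scores may differ slightly from tools/bleu.py due to tokenization and smoothing choices:

from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# one hypothesis per line; each reference wrapped in a list (one reference per sentence)
with open("pred.txt") as f:
    hyps = [line.strip().split() for line in f]
with open("data/phoenix2014T.test.de") as f:
    refs = [[line.strip().split()] for line in f]

smooth = SmoothingFunction().method4  # avoid zero scores on short sentences
print("BLEU-4:", corpus_bleu(refs, hyps, smoothing_function=smooth))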

To dos:

  • Add configurations & steps to recreate paper results

Reference

Please cite the paper below if you found the resources in this repository useful:

@inproceedings{yin-read-2020-better,
    title = "Better Sign Language Translation with {STMC}-Transformer",
    author = "Yin, Kayo  and
      Read, Jesse",
    booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
    month = dec,
    year = "2020",
    address = "Barcelona, Spain (Online)",
    publisher = "International Committee on Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.coling-main.525",
    doi = "10.18653/v1/2020.coling-main.525",
    pages = "5975--5989",
    abstract = "Sign Language Translation (SLT) first uses a Sign Language Recognition (SLR) system to extract sign language glosses from videos. Then, a translation system generates spoken language translations from the sign language glosses. This paper focuses on the translation system and introduces the STMC-Transformer which improves on the current state-of-the-art by over 5 and 7 BLEU respectively on gloss-to-text and video-to-text translation of the PHOENIX-Weather 2014T dataset. On the ASLG-PC12 corpus, we report an increase of over 16 BLEU. We also demonstrate the problem in current methods that rely on gloss supervision. The video-to-text translation of our STMC-Transformer outperforms translation of GT glosses. This contradicts previous claims that GT gloss translation acts as an upper bound for SLT performance and reveals that glosses are an inefficient representation of sign language. For future SLT research, we therefore suggest an end-to-end training of the recognition and translation models, or using a different sign language annotation scheme.",
}
Comments
  • Error in inference (google colab)

    First, congratulations on this great job.

    I'm trying to replicate your experiments using Google Colab. I managed to do the training, but when I do the inference it triggers an error: RuntimeError: result type Float can't be cast to the desired output type Long

    Could you help me resolve this error?

    Thank you for your attention! Congratulations again on the work!

    opened by YanSoares 3
  • Doubt regarding SLT models

    From Table 9 (in Section 6.2) of the paper, I am confused about the difference between the S2G->G2T model and the Transformer model (S2G2T?). Could you please clarify the exact difference between these two?

    opened by hshreeshail 1
  • Type of SLT Model

    Just wanted to confirm that STMC-Transformer and STMC-RNN are both S2G->G2T models and NOT S2G2T models. My understanding of these terms is as follows:

    • S2G->G2T: the S2G model is trained first, and then the G2T model is trained on the predictions of the S2G model.
    • S2G2T: the S2G and G2T models are trained jointly.

    Please correct me if I am wrong.

    opened by hshreeshail 1
  • SyntaxError: invalid syntax

    Dear sir, I want to run the experiment on Colab following your README procedures, but I have encountered some weird problems. I first ran the following commands:

    ! pip install virtualenv
    ! virtualenv --python=python3 venv
    ! source venv/bin/activate
    ! pip install torch==1.6.0
    ! git clone https://github.com/kayoyin/transformer-slt.git
    ! pip install -r requirements.txt
    ! python setup.py install

    Those all run fine, but after I copy your onmt_preprocess command, which is

    onmt_preprocess -train_src data/phoenix2014T.train.gloss -train_tgt data/phoenix2014T.train.de -valid_src data/phoenix2014T.dev.gloss -valid_tgt data/phoenix2014T.dev.de -save_data data/dgs -lower

    It returns the error:

    File "", line 1 onmt_preprocess -train_src data/phoenix2014T.train.gloss -train_tgt data/phoenix2014T.train.de -valid_src data/phoenix2014T.dev.gloss -valid_tgt data/phoenix2014T.dev.de -save_data data/dgs -lower ^ SyntaxError: invalid syntax

    I really don't know why this happens. Could you help me?
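
    A likely explanation, offered as an assumption: in a Colab cell, onmt_preprocess is a shell command, so a bare line is parsed as Python and fails with this SyntaxError. Prefixing it with ! runs it in the shell, like the other commands above:

    ! onmt_preprocess -train_src data/phoenix2014T.train.gloss -train_tgt data/phoenix2014T.train.de -valid_src data/phoenix2014T.dev.gloss -valid_tgt data/phoenix2014T.dev.de -save_data data/dgs -lower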

    opened by kokolerk 1
  • STMC network for SLT

    Hi, I have read your work carefully and it seems very well done, congratulations. I am very interested in replicating your results in an end-to-end manner, and subsequently I would like to adapt the work to an Italian Sign Language dataset. Your work is based on two steps: first, translation from sign language to glosses through the STMC network; then the glosses are translated into text through the Transformer. Would it be possible to share the code for the first part, in which the STMC network is defined and trained? I'm really interested in the mechanism that takes videos in and translates them into glosses in a continuous way.

    Thanks, Enrico

    opened by enrico310786 1
  • Example of config_file

    I am trying to run the server, however a JSON file is required in def start(self, config_file). I was wondering whether you could provide an example of this file so I can see the structure and type of data that is needed. Thank you very much.

    opened by mcabezag 1
  • The size of the train vocabulary

    The paper reports a train gloss vocabulary size of 1066, but I get 1232 when building the vocabulary from "phoenix2014T.train.gloss".

    Here's the code to find the train vocabulary size.

    
    # count unique whitespace-separated tokens in the training glosses
    vocab = {}

    with open("phoenix2014T.train.gloss", "r") as f:
        lines = f.readlines()

    for sentence in lines:
        for word in sentence.strip().split(" "):
            if word in vocab:
                continue
            vocab[word] = len(vocab)

    print(len(vocab))
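
    A hedged variant for comparison: the onmt_preprocess command in this README passes -lower, so a case-folded count may land closer to (though not necessarily at) the paper's figure:

    # case-folded count, mirroring the -lower flag used during preprocessing
    with open("phoenix2014T.train.gloss") as f:
        lowered = {w.lower() for line in f for w in line.strip().split()}
    print(len(lowered))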
    
    
    opened by NiwakaDev 1
  • Error while training and testing

    Hi, thanks for this open source repo. I am trying to execute the commands given in the README but am receiving the following error. Any help would be appreciated.

    (py36torch) E:\transformer-slt>python  train.py -data data/dgs -save_model model -keep_checkpoint 1 -layers 2 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8  -encoder_type transformer -decoder_type transformer -position_encoding -max_generator_batches 2 -dropout 0.1 -early_stopping 3 -early_stopping_criteria accuracy ppl -batch_size 2048 -accum_count 3 -batch_type tokens -normalization tokens -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 3000 -learning_rate 0.5 -max_grad_norm 0 -param_init 0  -param_init_glorot -label_smoothing 0.1 -valid_steps 100 -save_checkpoint_steps 100 -world_size 1 -gpu_ranks 0
    Traceback (most recent call last):
      File "train.py", line 6, in <module>
        main()
      File "E:\transformer-slt\onmt\bin\train.py", line 204, in main
        train(opt)
      File "E:\transformer-slt\onmt\bin\train.py", line 35, in train
        vocab = torch.load(opt.data + '.vocab.pt')
      File "E:\Envs\py36torch\lib\site-packages\torch\serialization.py", line 584, in load
        with _open_file_like(f, 'rb') as opened_file:
      File "E:\Envs\py36torch\lib\site-packages\torch\serialization.py", line 234, in _open_file_like
        return _open_file(name_or_buffer, mode)
      File "E:\Envs\py36torch\lib\site-packages\torch\serialization.py", line 215, in __init__
        super(_open_file, self).__init__(open(name, mode))
    FileNotFoundError: [Errno 2] No such file or directory: 'data/dgs.vocab.pt'
    

    When i execute the testing command i am getting the following error:

    (py36torch) E:\transformer-slt>python translate.py -model model [model2 model3 ...] -src data/phoenix2014T.test.gloss -output pred.txt -gpu 0 -replace_unk -beam_size 4
    Traceback (most recent call last):
      File "translate.py", line 6, in <module>
        main()
      File "E:\transformer-slt\onmt\bin\translate.py", line 48, in main
        translate(opt)
      File "E:\transformer-slt\onmt\bin\translate.py", line 18, in translate
        translator = build_translator(opt, report_score=True)
      File "E:\transformer-slt\onmt\translate\translator.py", line 28, in build_translator
        fields, model, model_opt = load_test_model(opt)
      File "E:\transformer-slt\onmt\decoders\ensemble.py", line 130, in load_test_model
        onmt.model_builder.load_test_model(opt, model_path=model_path)
      File "E:\transformer-slt\onmt\model_builder.py", line 96, in load_test_model
        map_location=lambda storage, loc: storage)
      File "E:\Envs\py36torch\lib\site-packages\torch\serialization.py", line 584, in load
        with _open_file_like(f, 'rb') as opened_file:
      File "E:\Envs\py36torch\lib\site-packages\torch\serialization.py", line 234, in _open_file_like
        return _open_file(name_or_buffer, mode)
      File "E:\Envs\py36torch\lib\site-packages\torch\serialization.py", line 215, in __init__
        super(_open_file, self).__init__(open(name, mode))
    FileNotFoundError: [Errno 2] No such file or directory: 'model'
    

    Please suggest the solution

    opened by thisisashukla 1
  • Results mismatch from paper

    I am assuming you used vanilla embeddings in your first ablation experiment on the number of encoder-decoder layers. Moreover, the paper states that it uses #layers=2 for further experiments. In that case, shouldn't the results in Table 3 (for #layers=2) match the results in Table 5 (for vanilla embeddings)? For example, the BLEU-4 score on the test set is 21.65 in Table 3 but 22.22 in Table 5. Shouldn't these be the same?

    opened by hshreeshail 0
  • Conversion to tflite

    I'm trying to convert the PyTorch model to a TFLite model, but I'm facing an issue providing the dummy model input to the torch.onnx.export() method. Do you know what the dummy model input should be?

    Here is what my code looks like:

    import torch
    import torch.nn as nn
    import torch.onnx
    import torchvision
    from onmt.model_builder import build_base_model
    from onmt.utils.parse import ArgumentParser

    checkpoint = torch.load('model_step_1600.pt')

    model_opt = ArgumentParser.ckpt_model_opts(checkpoint['opt'])
    ArgumentParser.update_model_opts(model_opt)
    ArgumentParser.validate_model_opts(model_opt)
    vocab = checkpoint['vocab']
    fields = vocab

    model = build_base_model(model_opt, fields, None, checkpoint)
    model.eval()

    dummy_input = torch.randn(1, 3, 224, 224, requires_grad=True)
    # dummy_input = torch.from_numpy(X_test[0].reshape(1, -1)).float().to(device)

    torch.onnx.export(model, dummy_input, 'model_simple.onnx')
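
    A hedged sketch of a text-shaped dummy input, assuming OpenNMT-py's NMTModel.forward(src, tgt, lengths) signature: the model consumes LongTensors of token indices, not an image-shaped float tensor, so export may get further with inputs like these (the sizes and the vocabulary bound 100 are placeholders, and export can still fail on unsupported ops):

    # hypothetical text-shaped inputs: (seq_len, batch, 1) token ids plus lengths
    src_len, batch = 10, 1
    src = torch.randint(0, 100, (src_len, batch, 1), dtype=torch.long)
    tgt = torch.randint(0, 100, (src_len, batch, 1), dtype=torch.long)
    lengths = torch.tensor([src_len])

    torch.onnx.export(model, (src, tgt, lengths), 'model_simple.onnx')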

    opened by Aayush2007 0
Owner

Kayo Yin
Grad student at CMU LTI @neulab researching multilingual NLP (spoken + signed languages)