Transformers-Tutorials

Hi there!

This repository contains demos I made with the Transformers library by 🤗 HuggingFace. Currently, all of them are implemented in PyTorch.

NOTE: if you are not familiar with HuggingFace and/or Transformers, I highly recommend checking out our free course, which introduces you to several Transformer architectures (such as BERT, GPT-2, T5, BART, etc.) and gives an overview of the HuggingFace libraries, including Transformers, Tokenizers, Datasets, Accelerate and the hub.

Currently, it contains the following demos:

  • BERT (paper):
    • fine-tuning BertForTokenClassification on a named entity recognition (NER) dataset. Open In Colab
    • fine-tuning BertForSequenceClassification for multi-label text classification. Open In Colab
  • CANINE (paper):
    • fine-tuning CanineForSequenceClassification on IMDb Open In Colab
  • DETR (paper):
    • performing inference with DetrForObjectDetection Open In Colab
    • fine-tuning DetrForObjectDetection on a custom object detection dataset Open In Colab
    • evaluating DetrForObjectDetection on the COCO detection 2017 validation set Open In Colab
    • performing inference with DetrForSegmentation Open In Colab
    • fine-tuning DetrForSegmentation on COCO panoptic 2017 Open In Colab
  • GPT-J-6B (repository):
    • performing inference with GPTJForCausalLM to illustrate few-shot learning and code generation Open In Colab
  • ImageGPT (blog post):
    • (un)conditional image generation with ImageGPTForCausalLM Open In Colab
    • linear probing with ImageGPT Open In Colab
  • LayoutLM (paper):
    • fine-tuning LayoutLMForTokenClassification on the FUNSD dataset Open In Colab
    • fine-tuning LayoutLMForSequenceClassification on the RVL-CDIP dataset Open In Colab
    • adding image embeddings to LayoutLM during fine-tuning on the FUNSD dataset Open In Colab
  • LayoutLMv2 (paper):
    • fine-tuning LayoutLMv2ForSequenceClassification on RVL-CDIP Open In Colab
    • fine-tuning LayoutLMv2ForTokenClassification on FUNSD Open In Colab
    • fine-tuning LayoutLMv2ForTokenClassification on FUNSD using the 🤗 Trainer Open In Colab
    • performing inference with LayoutLMv2ForTokenClassification on FUNSD Open In Colab
    • true inference with LayoutLMv2ForTokenClassification (when no labels are available) + Gradio demo Open In Colab
    • fine-tuning LayoutLMv2ForTokenClassification on CORD Open In Colab
    • fine-tuning LayoutLMv2ForQuestionAnswering on DocVQA Open In Colab
  • LUKE (paper):
    • fine-tuning LukeForEntityPairClassification on a custom relation extraction dataset using PyTorch Lightning Open In Colab
  • SegFormer (paper):
    • performing inference with SegformerForSemanticSegmentation Open In Colab
    • fine-tuning SegformerForSemanticSegmentation on custom data using native PyTorch Open In Colab
  • Perceiver IO (paper):
    • showcasing masked language modeling and image classification with the Perceiver Open In Colab
    • fine-tuning the Perceiver for image classification Open In Colab
    • fine-tuning the Perceiver for text classification Open In Colab
    • predicting optical flow between a pair of images with PerceiverForOpticalFlow Open In Colab
    • auto-encoding a video (images, audio, labels) with PerceiverForMultimodalAutoencoding Open In Colab
  • T5 (paper):
    • fine-tuning T5ForConditionalGeneration on a Dutch summarization dataset on TPU using HuggingFace Accelerate Open In Colab
    • fine-tuning T5ForConditionalGeneration (CodeT5) for Ruby code summarization using PyTorch Lightning Open In Colab
  • TAPAS (paper):
  • TrOCR (paper):
    • performing inference with TrOCR to illustrate optical character recognition with Transformers, as well as making a Gradio demo Open In Colab
    • fine-tuning TrOCR on the IAM dataset using the Seq2SeqTrainer Open In Colab
    • fine-tuning TrOCR on the IAM dataset using native PyTorch Open In Colab
    • evaluating TrOCR on the IAM test set Open In Colab
  • Vision Transformer (paper):
    • performing inference with ViTForImageClassification Open In Colab
    • fine-tuning ViTForImageClassification on CIFAR-10 using PyTorch Lightning Open In Colab
    • fine-tuning ViTForImageClassification on CIFAR-10 using the 🤗 Trainer Open In Colab

... more to come! 🤗

If you have any questions regarding these demos, feel free to open an issue on this repository.

Btw, I was also the main contributor in adding the following algorithms to the library:

  • TAbular PArSing (TAPAS) by Google AI
  • Vision Transformer (ViT) by Google AI
  • Data-efficient Image Transformers (DeiT) by Facebook AI
  • LUKE by Studio Ousia
  • DEtection TRansformers (DETR) by Facebook AI
  • CANINE by Google AI
  • BEiT by Microsoft Research
  • LayoutLMv2 (and LayoutXLM) by Microsoft Research
  • TrOCR by Microsoft Research
  • SegFormer by NVIDIA
  • ImageGPT by OpenAI
  • Perceiver by DeepMind

All of them were an incredible learning experience. I'd recommend contributing an AI algorithm to the library to anyone!

Data preprocessing

Regarding preparing your data for a PyTorch model, there are a few options:

  • a native PyTorch dataset + dataloader. This is the standard way to prepare data for a PyTorch model: subclass torch.utils.data.Dataset, and then create a corresponding DataLoader (which allows you to loop over the items of a dataset in batches). When subclassing the Dataset class, one needs to implement 3 methods: __init__, __len__ (which returns the number of examples of the dataset) and __getitem__ (which returns an example of the dataset, given an integer index). Here's an example of creating a basic text classification dataset (assuming one has a CSV that contains 2 columns, namely "text" and "label"):
import torch
from torch.utils.data import Dataset

class CustomTrainDataset(Dataset):
    def __init__(self, df, tokenizer):
        self.df = df
        self.tokenizer = tokenizer

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        # get item
        item = self.df.iloc[idx]
        text = item['text']
        label = item['label']
        # encode text
        encoding = self.tokenizer(text, padding="max_length", max_length=128, truncation=True, return_tensors="pt")
        # remove batch dimension which the tokenizer automatically adds
        encoding = {k: v.squeeze() for k, v in encoding.items()}
        # add label
        encoding["label"] = torch.tensor(label)

        return encoding

Instantiating the dataset then happens as follows:

from transformers import BertTokenizer
import pandas as pd

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
df = pd.read_csv("path_to_your_csv")

train_dataset = CustomTrainDataset(df=df, tokenizer=tokenizer)

Accessing the first example of the dataset can then be done as follows:

encoding = train_dataset[0]

In practice, one creates a corresponding DataLoader, which allows you to get batches from the dataset:

from torch.utils.data import DataLoader

train_dataloader = DataLoader(train_dataset, batch_size=4, shuffle=True)

I often check whether the data is created correctly by fetching the first batch from the data loader, and then printing out the shapes of the tensors, decoding the input_ids back to text, etc.

batch = next(iter(train_dataloader))
for k,v in batch.items():
    print(k, v.shape)
# decode the input_ids of the first example of the batch
print(tokenizer.decode(batch['input_ids'][0].tolist()))
  • HuggingFace Datasets. Datasets is a library by HuggingFace that allows you to easily load and process data in a very fast and memory-efficient way. It is backed by Apache Arrow, and has cool features such as memory-mapping, which allows you to only load data into RAM when it is required. It also has deep interoperability with the HuggingFace hub, allowing you to easily load well-known datasets as well as share your own with the community.

Loading a custom dataset as a Dataset object can be done as follows (you can install datasets using pip install datasets):

from datasets import load_dataset

dataset = load_dataset('csv', data_files={'train': ['my_train_file_1.csv', 'my_train_file_2.csv'], 'test': 'my_test_file.csv'})

Here I'm loading local CSV files, but other formats are supported as well (including JSON, Parquet and txt), as is loading data from an in-memory Pandas dataframe or dictionary, for instance. You can check out the docs for all details.
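As a quick illustration of the in-memory option, here's a minimal sketch (the column contents below are made up):

import pandas as pd
from datasets import Dataset

# create a Dataset from an in-memory Pandas dataframe
df = pd.DataFrame({"text": ["a first example", "a second example"], "label": [0, 1]})
dataset = Dataset.from_pandas(df)

# or directly from a Python dictionary
dataset = Dataset.from_dict({"text": ["a first example", "a second example"], "label": [0, 1]})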

Training frameworks

Regarding fine-tuning Transformer models (or more generally, PyTorch models), there are a few options:

  • using native PyTorch. This is the most basic way to train a model, and requires the user to write the training loop manually. The advantage is that this is very easy to debug. The disadvantage is that one needs to implement everything oneself, such as setting the model in the appropriate mode (model.train()/model.eval()), handling device placement (model.to(device)), etc. A typical training loop in PyTorch looks as follows (inspired by this great PyTorch intro tutorial):
import torch

model = ...

# I almost always use a learning rate of 5e-5 when fine-tuning Transformer based models
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)

# put model on GPU, if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for batch in train_dataloader:
        # put batch on device
        batch = {k: v.to(device) for k, v in batch.items()}

        # forward pass
        outputs = model(**batch)
        loss = outputs.loss

        train_loss += loss.item()

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    print(f"Loss after epoch {epoch}:", train_loss/len(train_dataloader))

    model.eval()
    val_loss = 0.0
    with torch.no_grad():
        for batch in eval_dataloader:
            # put batch on device
            batch = {k: v.to(device) for k, v in batch.items()}

            # forward pass
            outputs = model(**batch)
            loss = outputs.loss

            val_loss += loss.item()

    print(f"Validation loss after epoch {epoch}:", val_loss/len(eval_dataloader))
  • PyTorch Lightning (PL). PyTorch Lightning is a framework that automates the training loop written above by abstracting it away in a Trainer object. Users don't need to write the training loop themselves anymore; instead, they can just do trainer = Trainer() and then trainer.fit(model) (see the sketch below). The advantage is that you can start training models very quickly (hence the name lightning), as all training-related code is handled by the Trainer object. The disadvantage is that it may be more difficult to debug your model, as the training and evaluation are now abstracted away.
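To make this concrete, a minimal sketch of wrapping a Transformers model in a LightningModule could look as follows (the model, number of labels and batch format are just assumptions for illustration):

import torch
import pytorch_lightning as pl
from transformers import BertForSequenceClassification

class TextClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # assuming a sequence classification task with 2 labels
        self.model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    def training_step(self, batch, batch_idx):
        # assuming each batch is a dict with input_ids, attention_mask and labels
        outputs = self.model(**batch)
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=5e-5)

model = TextClassifier()
trainer = pl.Trainer(max_epochs=3)
trainer.fit(model, train_dataloader)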
  • HuggingFace Trainer. The HuggingFace Trainer API can be seen as a framework similar to PyTorch Lightning in the sense that it also abstracts the training away using a Trainer object. However, contrary to PyTorch Lightning, it is not meant to be a general framework. Rather, it is made especially for fine-tuning Transformer-based models available in the HuggingFace Transformers library (see the sketch below). The Trainer also has an extension called Seq2SeqTrainer for encoder-decoder models, such as BART, T5 and the EncoderDecoderModel classes. Note that all PyTorch example scripts of the Transformers library make use of the Trainer.
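For comparison, a minimal sketch of fine-tuning with the Trainer might look like this (the model and datasets are assumptions; the datasets are expected to return dicts containing a "labels" key):

from transformers import BertForSequenceClassification, Trainer, TrainingArguments

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

training_args = TrainingArguments(
    output_dir="checkpoints",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    learning_rate=5e-5,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()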
  • HuggingFace Accelerate: Accelerate is a library made for people who still want to write their own training loop (as shown above), but would like it to work automatically regardless of the hardware (i.e. multiple GPUs, TPU pods, mixed precision, etc.); a sketch follows below.
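As an illustration, the native training loop above changes roughly as follows with Accelerate (a minimal sketch; the Accelerator takes care of device placement, so the manual .to(device) calls disappear):

import torch
from accelerate import Accelerator

accelerator = Accelerator()

model = ...
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)

# the Accelerator handles device placement and distributed setup
model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)

model.train()
for epoch in range(epochs):
    for batch in train_dataloader:
        outputs = model(**batch)
        loss = outputs.loss
        # use accelerator.backward instead of loss.backward
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()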
Comments
  • Custom Label Prediction LMv2

    Model I'm using: LayoutLMv2 for token classification. Issue: I trained and fine-tuned the LayoutLMv2 model for custom token classification, i.e. instead of labelling the whole text present in the image, only particular labels are generated, using bounding box information produced by an external OCR. I'm getting the correct, desired output for the test dataset.

    How would I prepare an unseen image (an image not present in the train or test dataset) to submit to the model and get only the particular labels as output? Right now the whole image is getting labelled.

    opened by sheikhasim 12
  • Meta-DETR contribution

    Hello! I would like to incorporate meta-learning into your DETR implementation to perform few-shot object detection. Any suggestions on where to start?

    Link to the paper: https://arxiv.org/abs/2103.11731
    Link to the official implementation: https://github.com/ZhangGongjie/Meta-DETR

    opened by NouamaneTazi 10
  • LayoutXLM for Token Classification on FUNSD

    Hello Niels, first of all thanks a lot for all of your awesome tutorials. I'm trying to apply the LayoutLMv2 token classification tutorial to LayoutXLM, and I'm facing a few issues. I'm trying to get a processor for LayoutXLM, so I'm converting this line

    from transformers import LayoutLMv2Processor
    processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased", revision="no_ocr")
    

    to the following, but neither worked.

    processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutxlm-base", revision="no_ocr")

    feature_extractor = LayoutLMv2FeatureExtractor(apply_ocr=False)
    tokenizer = LayoutLMv2TokenizerFast.from_pretrained("microsoft/layoutxlm-base")
    processor = LayoutLMv2Processor(feature_extractor, tokenizer)
    

    So, can you please help me figure out what to change to make it work? Many thanks in advance!

    opened by TahaDouaji 10
  • Layoutlmv2 - Document Classification predicting same class always

    Hi Niels Rogge,

    Thanks for all the awesome tutorials.

    I have been working on fine-tuning the LayoutLMv2 model for document classification on my own data. I am facing an issue where the model predicts the same class for all examples, even the training examples. The model's training accuracy was above 90% after a certain number of epochs, but it always predicts the same class. I have also changed the learning rate, but with no improvement. I have trained on 400 examples for 3 classes, equally balanced.

    Another strange thing is that if I train with fewer examples (5 to 15 examples per class), it predicts different classes, but if I train on more examples it predicts the same class.

    Can you please help me regarding this ? Do I need to change any configuration before fine-tuning the model ?

    Thanks in advance

    Jerome

    opened by Jerome-Michael 9
  • Token classification using LayoutXLM

    Model : LayoutXLM

    I am able to use LayoutLMv2ForTokenClassification for my problem. Since the data I am using is in German, I would like to use the LayoutXLM model. Unfortunately, I cannot find LayoutXLMForTokenClassification in Transformers. How can I do token classification using LayoutXLM?

    To reproduce, steps to reproduce the behavior:

    from transformers import LayoutXLMForTokenClassification

    ImportError: cannot import name 'LayoutXLMForTokenClassification' from 'transformers' (/opt/conda/lib/python3.7/site-packages/transformers/__init__.py)

    Platform: Linux. Python version: 3.7

    opened by manangandhi7 7
  • Why does TAPAS perform worse than reported?

    Hi, nice tutorials!

    Thank you for adding TAPAS to huggingface/transformers. It is really helpful.

    However, according to your Evaluating_TAPAS_on_the_Tabfact_test_set.ipynb, the performance of tapas-base-finetuned-tabfact on the test set is 77.1, while it is reported as 78.5 in the paper. What accounts for the performance drop?

    Thank you!

    opened by FeiWang96 7
  • Error with ViLT Processor and Feature extractor

    @NielsRogge I am trying to use ViLT for MLM pretraining (inspired by your inference tutorial), but the issue I am facing is that ViltFeatureExtractor / ViltProcessor.feature_extractor gives me an error when I try to send a batch of 32 images rather than just the 1 image shown in your ViLT MLM inference tutorial.

    feature_extr = ViltFeatureExtractor.from_pretrained("dandelin/vilt-b32-mlm")
    encoding_pixels = feature_extr(image, image_mean=[0.48145466, 0.4578275, 0.40821073], image_std=[0.26862954, 0.26130258, 0.27577711], return_tensors='pt').pixel_values

    encoding_pixels = batched_featurextr(image, processor=processor).to(device)
    encoding_pixels = processor.feature_extractor(image, image_mean=[0.48145466, 0.4578275, 0.40821073], image_std=[0.26862954, 0.26130258, 0.27577711], return_tensors='pt').pixel_values

    opened by sanyalsunny111 6
  • Can the data split further divided into simple_test, complex_test, small_test

    Sure. Here is the raw train, validation, and test data.

    Download the data and run the following script; it is expected to reach 79.1% accuracy.

    import os
    from typing import List
    import torch
    import pandas as pd
    from transformers import TapasTokenizer, TapasForSequenceClassification
    from datasets import load_dataset, load_metric, Features, Sequence, ClassLabel, Value, Array2D
    
    def prepare_official_data_loader():
        tokenizer = TapasTokenizer.from_pretrained('google/tapas-base-finetuned-tabfact')
        features = Features({
            'attention_mask': Sequence(Value(dtype='int64')),
            'input_ids': Sequence(feature=Value(dtype='int64')),
            'label': ClassLabel(names=['refuted', 'entailed']),
            'statement': Value(dtype='string'),
            'table_caption': Value(dtype='string'),
            'table_id': Value(dtype='string'),
            'token_type_ids': Array2D(dtype="int64", shape=(512, 7))
        })
        test_set = load_dataset('json', data_files={'test': 'test.jsonl'}, split='test')
    
        def _format_pd_table(table_text: List) -> pd.DataFrame:
            df = pd.DataFrame(columns=table_text[0], data=table_text[1:])
            df = df.astype(str)
            return df
    
        test = test_set.map(
            lambda e: tokenizer(table=_format_pd_table(e['table_text']), queries=e['statement'],
                                truncation=True,
                                padding='max_length'),
            features=features,
            remove_columns=['table_text'],
        )
        # map to PyTorch tensors and only keep columns we need
        test.set_format(type='torch', columns=['input_ids', 'attention_mask', 'token_type_ids', 'label'])
        # create PyTorch dataloader
        test_dataloader = torch.utils.data.DataLoader(test, batch_size=4)
    
        return test_dataloader
    
    def evaluate():
        accuracy = load_metric("accuracy")
        test_dataloader = prepare_official_data_loader()
        batch = next(iter(test_dataloader))
        assert batch["input_ids"].shape == (4, 512)
        assert batch["attention_mask"].shape == (4, 512)
        assert batch["token_type_ids"].shape == (4, 512, 7)
    
        # Evaluate
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
        model = TapasForSequenceClassification.from_pretrained('google/tapas-base-finetuned-tabfact')
        model.to(device)
    
        number_processed = 0
        total = len(test_dataloader) * batch["input_ids"].shape[0]  # number of batches * batch_size
        for batch in test_dataloader:
            # get the inputs
            input_ids = batch["input_ids"].to(device)
            attention_mask = batch["attention_mask"].to(device)
            token_type_ids = batch["token_type_ids"].to(device)
            labels = batch["label"].to(device)
    
            # forward pass
            outputs = model(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids,
                            labels=labels)
            model_predictions = outputs.logits.argmax(-1)
    
            # add metric
            accuracy.add_batch(predictions=model_predictions, references=labels)
    
            number_processed += batch["input_ids"].shape[0]
            print(f"Processed {number_processed} / {total} examples")
    
        final_score = accuracy.compute()
        print(final_score)
    
    if __name__ == '__main__':
        evaluate()
    

    Originally posted by @JasperGuo in https://github.com/NielsRogge/Transformers-Tutorials/issues/2#issuecomment-814530233

    opened by qshi95 6
  • Where does image embeddings come from in Layoutlm funsd task?

    Hello @NielsRogge. Thanks a lot for the wonderful Fine_tuning_LayoutlmfortokenClassification_on_Funsd tutorial. I have been able to successfully run it on my system. However, I have a basic practical doubt. In the LayoutLM paper, the authors point out that LayoutLM also uses small image clips of each bounding box for training purposes (page 2 of the paper):

    meanwhile the image embedding can capture some appearance features such as font directions, types, and colors.
    

    But these small crops of each bounding box are not created anywhere in the code you provided. Is that task done during the forward pass in the code below, which you uploaded?

    outputs = model(input_ids=input_ids, bbox=bbox, attention_mask=attention_mask, token_type_ids=token_type_ids,
                          labels=labels)
    

    Basically, I cannot find any part of the code that computes image embeddings from bounding boxes, while the paper claims it does. Can you help me find the part of the code that does that? I think the open-source code they uploaded might not include it. Thanks a lot. Looking forward to your response.

    opened by akshat-khare 6
  • Leveraging Segment position embeddings during Inference time in LayoutLMv3 Token Classification

    @NielsRogge Can you help me understand how we can leverage Segment position embeddings during inference time as we are still predicting based on each token? Do we need to do an external text segmentation (region detection of segments) if the OCR engine gives outputs on a word level?

    It would be great if you could provide some pointers on how to leverage Segment position embeddings to improve accuracy during inference time.

    Thank You

    opened by arunpurohit3799 5
  • ArrowInvalid: Can only convert 1-dimensional array values

    I ran the notebook Fine-tuning LayoutLMv2ForTokenClassification on FUNSD.ipynb and got the following error:

    ArrowInvalid Traceback (most recent call last)

    The error is raised in this cell:

    train_dataset = datasets['train'].map(preprocess_data, batched=True, remove_columns=datasets['train'].column_names,
                                          features=features)
    test_dataset = datasets['test'].map(preprocess_data, batched=True, remove_columns=datasets['test'].column_names,
                                        features=features)

    /usr/local/lib/python3.7/dist-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

    ArrowInvalid: Can only convert 1-dimensional array values

    Could anyone help me solve this problem?

    opened by mrtranducdung 5
  • AttributeError: 'TrOCRProcessor' object has no attribute 'feature_extractor'

    Following your notebook for fine-tuning TrOCR using the Seq2SeqTrainer, I get the following error: AttributeError: 'TrOCRProcessor' object has no attribute 'feature_extractor'

    I am using Google Colab for training.

    Please help me to solve this error!

    opened by Zahak-Anjum 0
  • "olemeyer/docvqa-en-de-fr-es-it" is not available in huggingface anymore

    "olemeyer/docvqa-en-de-fr-es-it" dataset which is used in the notebook Creating_a_toy_DocVQA_dataset_for_Donut.ipynb is not available in huggingface anymore

    opened by YanaSSS 0
  • Layoutlmv2 Prediction/Inference from fine tuned .pth model

    Hi,

    I have fine-tuned a LayoutLMv2 model using a custom dataset annotated in FUNSD format. I can save my fine-tuned model locally (.pth) using torch.save. Now I need to get predictions using the fine-tuned (.pth) state dict. I am a beginner at this; does anyone know how to get predictions? The .pth file holds the state dict, i.e. the weights and biases.

    I tried to get predictions using:

    state_dict = torch.load("/content/model_17.pth")

    # create the model and tokenizer using the state dictionary
    model = LayoutLMv2ForTokenClassification.from_pretrained("/config.json", state_dict=state_dict, ignore_mismatched_sizes=True)

    but I'm getting weird prediction output. Does anybody know a solution for this? Thanks in advance!

    opened by nikithakriz 0
  • Got dict_keys error in Fine tuning YOLOS

    There is an error in Fine_tuning_YOLOS_for_object_detection_on_custom_dataset_(balloon).ipynb when creating the batch. The error says: Size must contain 'height' and 'width' keys or 'shortest_edge' and 'longest_edge' keys. Got dict_keys(['shortest_edge']). This seems to be due to an update in one of the libraries.

    opened by n12iB 4
  • Maskformer post_process_panoptic_segmentation result does not contain "segments" key

    In https://github.com/NielsRogge/Transformers-Tutorials/blob/master/MaskFormer/maskformer_minimal_example(with_MaskFormerFeatureExtractor).ipynb

    I think the key is actually segments_info.

    opened by nickponline 0