1398 Python Video-transformer Libraries

Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"

Locally-Shifted-Attention-With-Early-Global-Integration Pretrained models You can download all the models from here. Training Imagenet python -m torch

14 Apr 15, 2022

Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021).

STAR-pytorch Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021). CVF (pdf) STAR-DC

43 Dec 21, 2022

This project aims to explore the deployment of Swin-Transformer based on TensorRT, including the test results of FP16 and INT8.

Swin Transformer This project aims to explore the deployment of SwinTransformer based on TensorRT, including the test results of FP16 and INT8. Introd

87 Dec 21, 2022

OSLO: Open Source framework for Large-scale transformer Optimization

O S L O Open Source framework for Large-scale transformer Optimization What's New: December 21, 2021 Released OSLO 1.0. What is OSLO about? OSLO is a

280 Nov 24, 2022

The code for paper "Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation" which is accepted by AAAI 2022

Contrastive Spatio Temporal Pretext Learning for Self-supervised Video Representation (AAAI 2022) The code for paper "Contrastive Spatio-Temporal Pret

8 Jun 30, 2022

TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning

TransZero++ This repository contains the testing code for the paper "TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning" submitted

6 Aug 16, 2022

Video Stream: an Advanced Telegram Bot that's allow you to play Video & Music on Telegram Group Video Chat

Video Stream is an Advanced Telegram Bot that's allow you to play Video & Music on Telegram Group Video Chat 🧪 Get SESSION_NAME from below: Pyrogram

6 Feb 8, 2022

MISSFormer: An Effective Medical Image Segmentation Transformer

MISSFormer Code for paper "MISSFormer: An Effective Medical Image Segmentation Transformer". Please read our preprint at the following link: paper_add

22 Dec 24, 2022

Скрипт который выводит видео в консоль. Ничего лишнего)

video-to-ascii Скрипт который выводит видео в консоль. Ничего лишнего) Требования Минимальное разрешение экрана: 1280x720 Видео в качестве 360p 10-45f

155 Nov 28, 2022

TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize

23 Dec 24, 2022

A Telegram bot to transcribe audio, video and image into text.

Transcriber Bot A Telegram bot to transcribe audio, video and image into text. Deploy to Heroku Local Deploying Install the FFmpeg. Make sure you have

10 Dec 19, 2022

An Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chat Of Both Groups & Channels. Supports Live Streams, YouTube Videos & Telegram Media !!

Telegram Video Stream Bot (Py-TgCalls) An Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chat Of Both Groups & Channels. Supports Live St

14 Oct 21, 2022

Convert human motion from video to .bvh

video_to_bvh Convert human motion from video to .bvh with Google Colab Usage 1. Open video_to_bvh.ipynb in Google Colab Go to https://colab.research.g

306 Dec 10, 2022

Efficient 3D human pose estimation in video using 2D keypoint trajectories

3D human pose estimation in video with temporal convolutions and semi-supervised training This is the implementation of the approach described in the

3.1k Dec 29, 2022

Implementing Vision Transformer (ViT) in PyTorch

Lightning-Hydra-Template A clean and scalable template to kickstart your deep learning project 🚀 ⚡ 🔥 Click on Use this template to initialize new re

2 Dec 24, 2021

A python program to download one or multiple videos from YouTube.

YouTube-Video-Downloader A python program to download one or multiple videos from YouTube. Quick Start guide First Clone The Project git clone https:/

1 Sep 11, 2022

Spatio-Temporal Entropy Model (STEM) for end-to-end leaned video compression.

Spatio-Temporal Entropy Model A Pytorch Reproduction of Spatio-Temporal Entropy Model (STEM) for end-to-end leaned video compression. More details can

16 Nov 28, 2022

Automatically remove the mosaics in images and videos, or add mosaics to them.

1.4k Dec 30, 2022

Mixed Transformer UNet for Medical Image Segmentation

MT-UNet Update 2021/11/19 Thank you for your interest in our work. We have uploaded the code of our MTUNet to help peers conduct further research on i

92 Dec 25, 2022

[ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation

Target Adaptive Context Aggregation for Video Scene Graph Generation This is a PyTorch implementation for Target Adaptive Context Aggregation for Vide

Multimedia Computing Group, Nanjing University

44 Dec 14, 2022

[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding Official Pytorch implementation of Negative Sample Matter

69 Dec 26, 2022

Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose We provide PyTorch implementations for our arxiv paper "Audio-dr

497 Jan 9, 2023

AudioDVP:Photorealistic Audio-driven Video Portraits

AudioDVP This is the official implementation of Photorealistic Audio-driven Video Portraits. Major Requirements Ubuntu = 18.04 PyTorch = 1.2 GCC =

232 Jan 3, 2023

This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting Project Page | YouTube | Paper This is the official PyTorch implementation of the C

330 Dec 11, 2022

pyYotubemanager is full web automated bot capable of General tasks like:- Uploading a Video , Downloading , adding Title , Description , Listing types , adding Thumbnail

PyYoutubemanager Explore the docs » View Demo · Report Bug · Request Feature About The Project PyYotubemanager is full web automated bot capable of Ge

4 Jun 29, 2022

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

[ICCV2021] TransReID: Transformer-based Object Re-Identification [pdf] The official repository for TransReID: Transformer-based Object Re-Identificati

569 Dec 30, 2022

Wonkey - an open source programming language for the creation of cross-platform video games

Wonkey Programming Language Wonkey is an open source programming language for the creation of cross-platform video games, highly inspired by the “Blit

110 Nov 9, 2022

This repository builds a basic vision transformer from scratch so that one beginner can understand the theory of vision transformer.

vision-transformer-from-scratch This repository includes several kinds of vision transformers from scratch so that one beginner can understand the the

1 Dec 24, 2021

A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization

University1652-Baseline [Paper] [Slide] [Explore Drone-view Data] [Explore Satellite-view Data] [Explore Street-view Data] [Video Sample] [中文介绍] This

335 Jan 6, 2023

Official PyTorch implementation of SegFormer

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers Figure 1: Performance of SegFormer-B0 to SegFormer-B5. Project page

1.4k Dec 29, 2022

Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE)

[中文|English] Rotary Transformer Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE). The RoPE is a relative

211 Dec 8, 2021

Model parallel transformers in JAX and Haiku

Table of contents Mesh Transformer JAX Updates Pretrained Models GPT-J-6B Links Acknowledgments License Model Details Zero-Shot Evaluations Architectu

4.9k Jan 4, 2023

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

FNet: Mixing Tokens with Fourier Transforms Pytorch implementation of Fnet : Mixing Tokens with Fourier Transforms. Citation: @misc{leethorp2021fnet,

217 Dec 5, 2022

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

CPT This repository contains code and checkpoints for CPT. CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Gener

342 Jan 5, 2023

Edge-Augmented Graph Transformer

Edge-augmented Graph Transformer Introduction This is the official implementation of the Edge-augmented Graph Transformer (EGT) as described in https:

21 Dec 14, 2022

FastFormers - highly efficient transformer models for NLU

FastFormers FastFormers provides a set of recipes and methods to achieve highly efficient inference of Transformer models for Natural Language Underst

678 Jan 5, 2023

Transformer training code for sequential tasks

Sequential Transformer This is a code for training Transformers on sequential tasks such as language modeling. Unlike the original Transformer archite

578 Dec 13, 2022

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

77.1k Dec 31, 2022

Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.

The KLEJ Benchmark Baselines The KLEJ benchmark (Kompleksowa Lista Ewaluacji Językowych) is a set of nine evaluation tasks for the Polish language und

17 Oct 18, 2022

Large-scale pretraining for dialogue

A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT) This repository contains the source code and trained model for a large-

1.8k Jan 7, 2023

The code for two papers: Feedback Transformer and Expire-Span.

transformer-sequential This repo contains the code for two papers: Feedback Transformer Expire-Span The training code is structured for long sequentia

125 Dec 25, 2022

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Introduction Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduc

197 Dec 11, 2022

Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention

Sinkhorn Transformer This is a reproduction of the work outlined in Sparse Sinkhorn Attention, with additional enhancements. It includes a parameteriz

217 Nov 25, 2022

An implementation of the Pay Attention when Required transformer

Pay Attention when Required (PAR) Transformer-XL An implementation of the Pay Attention when Required transformer from the paper: https://arxiv.org/pd

7 Aug 11, 2022

Fully featured implementation of Routing Transformer

Routing Transformer A fully featured implementation of Routing Transformer. The paper proposes using k-means to route similar queries / keys into the

246 Jan 2, 2023

Reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer: Self-Attention with Linear Complexity)

Linear Multihead Attention (Linformer) PyTorch Implementation of reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer:

58 Dec 23, 2022

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

DeBERTa: Decoding-enhanced BERT with Disentangled Attention This repository is the official implementation of DeBERTa: Decoding-enhanced BERT with Dis

1.2k Jan 3, 2023

Longformer: The Long-Document Transformer

Longformer Longformer and LongformerEncoderDecoder (LED) are pretrained transformer models for long documents. ***** New December 1st, 2020: Longforme

1.6k Dec 29, 2022

This repository contains the code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer Models by Reordering their Sublayers.

Improving Transformer Models by Reordering their Sublayers This repository contains the code for running the character-level Sandwich Transformers fro

53 Sep 26, 2022

Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"

BP-Transformer This repo contains the code for our paper BP-Transformer: Modeling Long-Range Context via Binary Partition Zihao Ye, Qipeng Guo, Quan G

119 Nov 14, 2022

Conditional Transformer Language Model for Controllable Generation

CTRL - A Conditional Transformer Language Model for Controllable Generation Authors: Nitish Shirish Keskar, Bryan McCann, Lav Varshney, Caiming Xiong,

1.7k Dec 28, 2022

Awesome Treasure of Transformers Models Collection

💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️

577 Jan 7, 2023

A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats

TG-MusicPlayer A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Py

4 Jul 30, 2022

[ACM MM 2021] Yes, "Attention is All You Need", for Exemplar based Colorization

Transformer for Image Colorization This is an implemention for Yes, "Attention Is All You Need", for Exemplar based Colorization, and the current soft

30 Dec 7, 2022

A self-hosted streaming platform with Discord authentication, auto-recording and more!

331 Dec 27, 2022

SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation

SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation SeqFormer SeqFormer: a Frustratingly Simple Model for Video Instance Segmentat

298 Dec 22, 2022

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2 Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexan

1 Dec 16, 2021

👻🟡 Download all Snapchat video & photo memories from a data export.

Snapchat "Memories" Fetcher In compliance with the California Consumer Privacy Act of 2018 (“CCPA”), businesses which collect and store user data must

18 Dec 26, 2022

Dynamic View Synthesis from Dynamic Monocular Video

Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus

139 Dec 28, 2022

Official Pytorch implementation for Deep Contextual Video Compression, NeurIPS 2021

Introduction Official Pytorch implementation for Deep Contextual Video Compression, NeurIPS 2021 Prerequisites Python 3.8 and conda, get Conda CUDA 11

51 Dec 3, 2022

YouTube Video Search Engine For Python

YouTube-Video-Search-Engine Introduction With the increasing demand for electronic devices, it is hard for people to choose the best products from mul

1 Dec 21, 2021

Video Matting Refinement For Python

Video-matting refinement Library (use pip to install) scikit-image numpy av matplotlib Run Static background python path_to_video.mp4 Moving backgroun

3 Jan 11, 2022

Dynamic View Synthesis from Dynamic Monocular Video

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a

2.3k Jan 1, 2023

Paper and Codes for “Embracing Single Stride 3D Object Detector with Sparse Transformer”

SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer

59 Dec 14, 2021

This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".

BiCAT This is our TensorFlow implementation for the paper: "BiCAT: Sequential Recommendation with Bidirectional Chronological Augmentation of Transfor

15 Dec 6, 2022

The code of "Dependency Learning for Legal Judgment Prediction with a Unified Text-to-Text Transformer".

Code data_preprocess.py: preprocess data for Dependent-T5. parameters.py: define parameters of Dependent-T5. train_tools.py: traning and evaluation co

1 Apr 21, 2022

BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o

47 Dec 26, 2022

nextdl - download videos from youtube.com or other video platforms

3 Feb 2, 2022

Rune - a video miniplayer made with Python.

1 Dec 13, 2021

(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

656 Dec 30, 2022

Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts

t5-japanese Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts. The following is a list of models that

1 Dec 13, 2021

A Vision Transformer approach that uses concatenated query and reference images to learn the relationship between query and reference images directly.

24 Dec 13, 2022

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation [arxiv] This is the official repository for CDTrans: Cross-domain Transformer for

238 Dec 22, 2022

PyTorch implementation of a collections of scalable Video Transformer Benchmarks.

PyTorch implementation of Video Transformer Benchmarks This repository is mainly built upon Pytorch and Pytorch-Lightning. We wish to maintain a colle

156 Jan 8, 2023

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

PyTorch implementation of Continuous Augmented Positional Embeddings (CAPE), by Likhomanenko et al. Enhance your Transformer positional embeddings with easy-to-use augmentations!

26 Dec 13, 2022

BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o

47 Dec 26, 2022

The Pytorch implementation for "Video-Text Pre-training with Learned Regions"

Region_Learner The Pytorch implementation for "Video-Text Pre-training with Learned Regions" (arxiv) We are still cleaning up the code further and pre

9 Dec 11, 2021

[NeurIPS 2021]: Are Transformers More Robust Than CNNs? (Pytorch implementation & checkpoints)

Are Transformers More Robust Than CNNs? Pytorch implementation for NeurIPS 2021 Paper: Are Transformers More Robust Than CNNs? Our implementation is b

145 Dec 1, 2022

Simple VLC-based media player that can play multiple videos at the same time

Screenshots About Simple VLC-based media player that can play multiple videos at the same time. You can play as many videos as you like, the only limi

161 Jan 5, 2023

Embracing Single Stride 3D Object Detector with Sparse Transformer

SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer

385 Dec 28, 2022

Implementation of the paper Recurrent Glimpse-based Decoder for Detection with Transformer.

REGO-Deformable DETR By Zhe Chen, Jing Zhang, and Dacheng Tao. This repository is the implementation of the paper Recurrent Glimpse-based Decoder for

33 Nov 30, 2022

A new video text spotting framework with Transformer

TransVTSpotter: End-to-end Video Text Spotter with Transformer Introduction A Multilingual, Open World Video Text Dataset and End-to-end Video Text Sp

67 Jan 3, 2023

Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency

Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency This is a official implementation of the CycleContrast introduced in

13 Nov 14, 2022

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (EMNLP Founding 2021)

Introduction K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce. Installation PyTor

21 Nov 16, 2022

Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)

Pytorch Code for VideoLT [Website][Paper] Updates [10/29/2021] Features uploaded to Google Drive, for access please send us an e-mail: zhangxing18 at

26 Sep 18, 2022

State-of-the-art NLP through transformer models in a modular design and consistent APIs.

Trapper (Transformers wRAPPER) Trapper is an NLP library that aims to make it easier to train transformer based models on downstream tasks. It wraps h

42 Sep 21, 2022

An API that allows you to get full information about TikTok videos

TikTok-API An API that allows you to get full information about TikTok videos without using any third party sources and only the TikTok API. ##API onl

13 Dec 20, 2021

Video Translation Into Text

2021/12/9 The project has been updated Added a home screen Just drag it onto the screen The final results \ 2021/12/9 项目已更新添加了主界面拖到即可最后结果 \ Using t

10 Mar 12, 2022

SimMIM: A Simple Framework for Masked Image Modeling

SimMIM By Zhenda Xie*, Zheng Zhang*, Yue Cao*, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai and Han Hu*. This repo is the official implementation of

181 Dec 10, 2021

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction. NeurIPS 2021.

59 Nov 25, 2022

An end to end ASR Transformer model training repo

END TO END ASR TRANSFORMER 本项目基于transformer 6*encoder+6*decoder的基本结构构造的端到端的语音识别系统 Model Instructions 1.数据准备: 自行下载数据，遵循文件结构如下： ├── data │ ├── train │

10 Jul 19, 2022

Fast and Context-Aware Framework for Space-Time Video Super-Resolution (VCIP 2021)

Fast and Context-Aware Framework for Space-Time Video Super-Resolution Preparation Dependencies PyTorch 1.2.0 CUDA 10.0 DCNv2 cd model/DCNv2 bash make

1 Mar 29, 2022

Godot RL Agents is a fully Open Source packages that allows video game creators

Godot RL Agents The Godot RL Agents is a fully Open Source packages that allows video game creators, AI researchers and hobbiest the opportunity to le

326 Dec 30, 2022

RaceBERT -- A transformer based model to predict race and ethnicty from names

RaceBERT -- A transformer based model to predict race and ethnicty from names Installation pip install racebert Using a virtual environment is highly

3 Nov 2, 2022

[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

SSVC The source code for paper [Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning] samples of the

7 Oct 26, 2022

Python Video-transformer Resources

Python video-transformer Libraries

Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"

Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021).

This project aims to explore the deployment of Swin-Transformer based on TensorRT, including the test results of FP16 and INT8.

OSLO: Open Source framework for Large-scale transformer Optimization

The code for paper "Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation" which is accepted by AAAI 2022

TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning

Video Stream: an Advanced Telegram Bot that's allow you to play Video & Music on Telegram Group Video Chat

MISSFormer: An Effective Medical Image Segmentation Transformer

Скрипт который выводит видео в консоль. Ничего лишнего)

TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

A Telegram bot to transcribe audio, video and image into text.

An Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chat Of Both Groups & Channels. Supports Live Streams, YouTube Videos & Telegram Media !!

Convert human motion from video to .bvh

Efficient 3D human pose estimation in video using 2D keypoint trajectories

Implementing Vision Transformer (ViT) in PyTorch

A python program to download one or multiple videos from YouTube.

Spatio-Temporal Entropy Model (STEM) for end-to-end leaned video compression.

Automatically remove the mosaics in images and videos, or add mosaics to them.

Mixed Transformer UNet for Medical Image Segmentation

[ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation

[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"

AudioDVP:Photorealistic Audio-driven Video Portraits

This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".

pyYotubemanager is full web automated bot capable of General tasks like:- Uploading a Video , Downloading , adding Title , Description , Listing types , adding Thumbnail

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

Wonkey - an open source programming language for the creation of cross-platform video games

This repository builds a basic vision transformer from scratch so that one beginner can understand the theory of vision transformer.

A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization

Official PyTorch implementation of SegFormer

Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE)

Model parallel transformers in JAX and Haiku

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

Edge-Augmented Graph Transformer

FastFormers - highly efficient transformer models for NLU

Transformer training code for sequential tasks

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.

Large-scale pretraining for dialogue

The code for two papers: Feedback Transformer and Expire-Span.

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention

An implementation of the Pay Attention when Required transformer

Fully featured implementation of Routing Transformer

Reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer: Self-Attention with Linear Complexity)

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

Longformer: The Long-Document Transformer

This repository contains the code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer Models by Reordering their Sublayers.

Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"

Conditional Transformer Language Model for Controllable Generation

Awesome Treasure of Transformers Models Collection

A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats

[ACM MM 2021] Yes, "Attention is All You Need", for Exemplar based Colorization

A self-hosted streaming platform with Discord authentication, auto-recording and more!

SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2

👻🟡 Download all Snapchat video & photo memories from a data export.

Dynamic View Synthesis from Dynamic Monocular Video

Official Pytorch implementation for Deep Contextual Video Compression, NeurIPS 2021

YouTube Video Search Engine For Python

Video Matting Refinement For Python

Dynamic View Synthesis from Dynamic Monocular Video

Paper and Codes for “Embracing Single Stride 3D Object Detector with Sparse Transformer”

This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".

The code of "Dependency Learning for Legal Judgment Prediction with a Unified Text-to-Text Transformer".

BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

nextdl - download videos from youtube.com or other video platforms

Rune - a video miniplayer made with Python.

(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts

A Vision Transformer approach that uses concatenated query and reference images to learn the relationship between query and reference images directly.

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

PyTorch implementation of a collections of scalable Video Transformer Benchmarks.

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

The Pytorch implementation for "Video-Text Pre-training with Learned Regions"

[NeurIPS 2021]: Are Transformers More Robust Than CNNs? (Pytorch implementation & checkpoints)