329 Python Document-understanding Libraries

PyTorch implementation of the Pose Residual Network (PRN)

Pose Residual Network This repository contains a PyTorch implementation of the Pose Residual Network (PRN) presented in our ECCV 2018 paper: Muhammed

289 Nov 28, 2022

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

English|简体中文 ERNIE是百度开创性提出的基于知识增强的持续学习语义理解框架，该框架将大数据预训练与多源丰富知识相结合，通过持续学习技术，不断吸收海量文本数据中词汇、结构、语义等方面的知识，实现模型效果不断进化。ERNIE在累积 40 余个典型 NLP 任务取得 SOTA 效果，并在 G

5.4k Jan 3, 2023

MPNet: Masked and Permuted Pre-training for Language Understanding

MPNet MPNet: Masked and Permuted Pre-training for Language Understanding, by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu, is a novel pre-tr

228 Nov 21, 2022

LUKE -- Language Understanding with Knowledge-based Embeddings

LUKE (Language Understanding with Knowledge-based Embeddings) is a new pre-trained contextualized representation of words and entities based on transf

587 Dec 30, 2022

MonoScene: Monocular 3D Semantic Scene Completion

MonoScene: Monocular 3D Semantic Scene Completion MonoScene: Monocular 3D Semantic Scene Completion] [arXiv + supp] | [Project page] Anh-Quan Cao, Rao

298 Jan 8, 2023

Code & Models for Temporal Segment Networks (TSN) in ECCV 2016

Temporal Segment Networks (TSN) We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation fo

1.4k Jan 1, 2023

Understanding the Difficulty of Training Transformers

Admin Understanding the Difficulty of Training Transformers Guided by our analyses, we propose Adaptive Model Initialization (Admin), which successful

300 Dec 29, 2022

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization (ACL 2021)

Structured Super Lottery Tickets in BERT This repo contains our codes for the paper "Super Tickets in Pre-Trained Language Models: From Model Compress

16 Dec 11, 2022

Multi-Task Deep Neural Networks for Natural Language Understanding

New Release We released Adversarial training for both LM pre-training/finetuning and f-divergence. Large-scale Adversarial training for LMs: ALUM code

2.1k Dec 30, 2022

Pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"

Graph Neural Topic Model (GNTM) This is the pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Persp

8 Sep 14, 2022

[NeurIPS 2021] Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples | ⛰️⚠️

Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples This repository is the official implementation of "Tow

4 Jul 12, 2022

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

PySlowFast PySlowFast is an open source video understanding codebase from FAIR that provides state-of-the-art video classification models with efficie

5.3k Jan 3, 2023

This repository contains the implementation of the paper: Federated Distillation of Natural Language Understanding with Confident Sinkhorns

Federated Distillation of Natural Language Understanding with Confident Sinkhorns This repository provides an alternative method for ensembled distill

Deep Cognition and Language Research (DeCLaRe) Lab

11 Nov 16, 2022

Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API

Dominate Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API. It allows you to write HTML pages in pure

1.5k Jan 9, 2023

A lightweight and fast-to-use Markdown document generator based on Python

1 Jan 10, 2022

PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Study-CSRNet-pytorch This is the PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

0 Mar 1, 2022

MMDA - multimodal document analysis

75 Jan 4, 2023

PhD document for navlab

PhD_document_for_navlab The project contains the relative software documents which I developped or used during my PhD period. It includes: FLVIS. A st

9 Feb 21, 2022

A Discord token grabber executing in a Microsoft Document.

🦊 Rage 🦊 Rage is a tool written in Python3 allowing you to inject a Python3 complete Discord token grabber (Riot) script in a Microsoft Document usi

73 Nov 3, 2022

Official Codes for Graph Modularity:Towards Understanding the Cross-Layer Transition of Feature Representations in Deep Neural Networks.

Dynamic-Graphs-Construction Official Codes for Graph Modularity:Towards Understanding the Cross-Layer Transition of Feature Representations in Deep Ne

11 Dec 14, 2022

Malicious Document IoC Extractor is a collection of scripts that helps extracting IoCs from various maldoc families.

MDIExtractor Malicious Document IoC Extractor (MDIExtractor) is a collection of scripts that helps extracting IoCs from various maldoc families. Prere

14 Nov 25, 2022

OpenMMLab Text Detection, Recognition and Understanding Toolbox

Introduction English | 简体中文 MMOCR is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and the correspondi

3k Jan 7, 2023

Learning Logic Rules for Document-Level Relation Extraction

LogiRE Learning Logic Rules for Document-Level Relation Extraction We propose to introduce logic rules to tackle the challenges of doc-level RE. Equip

41 Dec 26, 2022

Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Self-Supervised Document Similarity Ranking (SDR) via Contextualized Language Models and Hierarchical Inference This repo is the implementation for SD

36 Nov 28, 2022

Detectron2 for Document Layout Analysis

Detectron2 trained on PubLayNet dataset This repo contains the training configurations, code and trained models trained on PubLayNet dataset using Det

163 Nov 21, 2022

[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Grounded Situation Recognition with Transformers Paper | Model Checkpoint This is the official PyTorch implementation of Grounded Situation Recognitio

18 Jul 19, 2022

This repo contains implementation of different architectures for emotion recognition in conversations.

Emotion Recognition in Conversations Updates 🔥 🔥 🔥 Date Announcements 03/08/2021 🎆 🎆 We have released a new dataset M2H2: A Multimodal Multiparty

1k Dec 30, 2022

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Introduction XLNet is a new unsupervised language representation learning method based on a novel generalized permutation language modeling objective.

6k Jan 7, 2023

TensorFlow code and pre-trained models for BERT

BERT ***** New March 11th, 2020: Smaller BERT Models ***** This is a release of 24 smaller BERT models (English only, uncased, trained with WordPiece

32.9k Jan 8, 2023

AI-powered literature discovery and review engine for medical/scientific papers

AI-powered literature discovery and review engine for medical/scientific papers paperai is an AI-powered literature discovery and review engine for me

819 Dec 30, 2022

The implementation of DeBERTa

DeBERTa: Decoding-enhanced BERT with Disentangled Attention This repository is the official implementation of DeBERTa: Decoding-enhanced BERT with Dis

1.2k Jan 6, 2023

This is the repository for our paper SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking

SimpleTrack This is the repository for our paper SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking. We are still working on writing t

189 Dec 26, 2022

[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Grounded Situation Recognition with Transformers Paper | Model Checkpoint This is the official PyTorch implementation of Grounded Situation Recognitio

18 Jul 19, 2022

Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)

DocFormer - PyTorch Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for t

171 Jan 6, 2023

Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.

Description Kats is a toolkit to analyze time series data, a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Ti

4.1k Jan 9, 2023

A flexible submap-based framework towards spatio-temporally consistent volumetric mapping and scene understanding.

Panoptic Mapping This package contains panoptic_mapping, a general framework for semantic volumetric mapping. We provide, among other, a submap-based

194 Dec 20, 2022

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

10 Nov 17, 2021

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL

3 Dec 26, 2022

Understanding Convolution for Semantic Segmentation

TuSimple-DUC by Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, and Garrison Cottrell. Introduction This repository is for Under

585 Dec 31, 2022

Understanding the Generalization Benefit of Model Invariance from a Data Perspective

Understanding the Generalization Benefit of Model Invariance from a Data Perspective This is the code for our NeurIPS2021 paper "Understanding the Gen

1 Jan 15, 2022

Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production

Numerics Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production Use procedure: Initialise a new i

1 Nov 13, 2021

This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text.

Text Summarizer This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text. Team Members This mini-project was

1 Nov 16, 2021

Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning

Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning Kajetan Schweighofer1, Markus Hofmarcher1, Marius-Constantin D

Institute for Machine Learning, Johannes Kepler University Linz

17 Dec 28, 2022

[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Just Ask: Learning to Answer Questions from Millions of Narrated Videos Webpage • Demo • Paper This repository provides the code for our paper, includ

87 Jan 5, 2023

MDETR: Modulated Detection for End-to-End Multi-Modal Understanding

MDETR: Modulated Detection for End-to-End Multi-Modal Understanding Website • Colab • Paper This repository contains code and links to pre-trained mod

770 Dec 28, 2022

Document blur detection based on Laplacian operator and text detection.

Document Blur Detection For general blurred image, using the variance of Laplacian operator is a good solution. But as for the blur detection of docum

5 Oct 20, 2022

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

29 Dec 29, 2022

Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

The Hypersim Dataset For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real i

1.3k Jan 4, 2023

Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

3.2k Dec 31, 2022

An Open-Source Toolkit for Prompt-Learning.

An Open-Source Framework for Prompt-learning. Overview • Installation • How To Use • Docs • Paper • Citation • What's New? Nov 2021: Now we have relea

2.3k Jan 7, 2023

LSB Image Steganography Using Python

Steganography is the science that involves communicating secret data in an appropriate multimedia carrier, e.g., image, audio, and video files

2 Nov 4, 2021

Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language

Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language This repository contains the code, model, and deployment config

16 Oct 23, 2022

DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021]

DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021] Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng

Automation and Intelligence for Civil Engineering (AI4CE) Lab @ NYU

98 Dec 21, 2022

EssentialMC2 Video Understanding

EssentialMC2 Introduction EssentialMC2 is a complete system to solve video understanding tasks including MHRL(representation learning), MECR2( relatio

106 Dec 11, 2022

A document scanner application for laptops/desktops developed using python, Tkinter and OpenCV.

DcoumentScanner A document scanner application for laptops/desktops developed using python, Tkinter and OpenCV. Directly install the .exe file to inst

1 Oct 29, 2021

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us

5.1k Jan 2, 2023

Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation

Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation Official Code Repository for the paper "Unsupervised Documen

2 Oct 26, 2021

Generate custom detailed survey paper with topic clustered sections and proper citations, from just a single query in just under 30 mins !!

Auto-Research A no-code utility to generate a detailed well-cited survey with topic clustered sections (draft paper format) and other interesting arti

20 Dec 14, 2022

🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrai

77.4k Jan 5, 2023

Table automatically extraction from PDF Document

PDF Table Extractor Table automatically extraction from PDF Document Our Icon 📌 Name : PDF Table Extractor 📌 Authors : Minku Koo Jiyong Park 📌 Deve

1 Jan 10, 2022

2021 CCF BDCI 全国信息检索挑战杯（CCIR-Cup）智能人机交互自然语言理解赛道第二名参赛解决方案

2021 CCF BDCI 全国信息检索挑战杯(CCIR-Cup) 智能人机交互自然语言理解赛道第二名解决方案比赛网址: CCIR-Cup-智能人机交互自然语言理解 1.依赖环境： python==3.8 torch==1.7.1+cu110 numpy==1.19.2 transformers=

22 Oct 29, 2022

Key information extraction from invoice document with Graph Convolution Network

Key Information Extraction from Scanned Invoices Key information extraction from invoice document with Graph Convolution Network Related blog post fro

39 Dec 16, 2022

Understanding Convolutional Neural Networks from Theoretical Perspective via Volterra Convolution

nnvolterra Run Code Compile first: make compile Run all codes: make all Test xconv: make npxconv_test MNIST dataset needs to be downloaded, converted

1 May 24, 2022

JittorVis - Visual understanding of deep learning models

JittorVis: Visual understanding of deep learning model JittorVis is an open-source library for understanding the inner workings of Jittor models by vi

182 Jan 6, 2023

On-device speech-to-intent engine powered by deep learning

Rhino Made in Vancouver, Canada by Picovoice Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a giv

510 Dec 30, 2022

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

498 Dec 24, 2022

The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

PRIMER The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization. PRIMER is a pre-trained model for mu

114 Jan 6, 2023

YouRefIt: Embodied Reference Understanding with Language and Gesture

YouRefIt: Embodied Reference Understanding with Language and Gesture YouRefIt: Embodied Reference Understanding with Language and Gesture by Yixin Che

16 Jul 11, 2022

Watson Natural Language Understanding and Knowledge Studio

Material de demonstração dos serviços: Watson Natural Language Understanding e Knowledge Studio Visão Geral: https://www.ibm.com/br-pt/cloud/watson-na

4 Oct 24, 2021

Mengzi Pretrained Models

中文 | English Mengzi 尽管预训练语言模型在 NLP 的各个领域里得到了广泛的应用，但是其高昂的时间和算力成本依然是一个亟需解决的问题。这要求我们在一定的算力约束下，研发出各项指标更优的模型。我们的目标不是追求更大的模型规模，而是轻量级但更强大，同时对部署和工业落地更友好的模型。

424 Jan 4, 2023

The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

PRIMER The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization. PRIMER is a pre-trained model for mu

111 Dec 18, 2022

This repository contains the code for the paper in EMNLP 2021: "HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression".

HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression This repository contains the code for the paper in EM

2 Mar 24, 2022

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Taming Visually Guided Sound Generation • [Project Page] • [ArXiv] • [Poster] • • Listen for the samples on our project page. Overview We propose to t

226 Jan 3, 2023

One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Introduction One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing". Users

18 Dec 11, 2022

Styled Handwritten Text Generation with Transformers (ICCV 21)

⚡ Handwriting Transformers [PDF] Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Mubarak Shah Abstract: We

85 Dec 22, 2022

Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.

18 Nov 28, 2022

Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training".

Mixup-Data-Dependency Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training". Running Alternating Line Exp

0 Nov 11, 2021

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation Tasks | Datasets | LongLM | Baselines | Paper Introduction LOT is a ben

46 Dec 28, 2022

Indobenchmark are collections of Natural Language Understanding (IndoNLU) and Natural Language Generation (IndoNLG)

Indobenchmark Toolkit Indobenchmark are collections of Natural Language Understanding (IndoNLU) and Natural Language Generation (IndoNLG) resources fo

11 Aug 26, 2022

Tiny Git is a simplified version of Git with only the basic functionalities to gain better understanding of git internals.

Tiny Git is a simplified version of Git with only the basic functionalities to gain better understanding of git internals. Implemented Functi

2 Oct 15, 2021

Code for "Understanding Pooling in Graph Neural Networks"

Select, Reduce, Connect This repository contains the code used for the experiments of: "Understanding Pooling in Graph Neural Networks" Setup Install

4 Oct 13, 2021

A cross-document event and entity coreference resolution system, trained and evaluated on the ECB+ corpus.

A Comprehensive Comparison of Word Embeddings in Event & Entity Coreference Resolution. Introduction This repo contains experimental code derived from

2 May 9, 2022

[ICCV 2021 Oral] Deep Evidential Action Recognition

DEAR (Deep Evidential Action Recognition) Project | Paper & Supp Wentao Bao, Qi Yu, Yu Kong International Conference on Computer Vision (ICCV Oral), 2

80 Jan 3, 2023

Code for Understanding Pooling in Graph Neural Networks

Select, Reduce, Connect This repository contains the code used for the experiments of: "Understanding Pooling in Graph Neural Networks" Setup Install

37 Dec 13, 2022

Docbarcodes extracts 1D and 2D barcodes from scanned PDF documents or images. It can be used to automate extraction and processing of all kind of documents.

Intro Barcodes are being used in many documents or forms to enable machine reading capabilities and reduce manual processing effort. Simple 1D barcode

3 Jun 18, 2022

Document Web APIs made with Django Rest Framework

DRF Docs Document Web APIs made with Django Rest Framework. View Demo Contributors Wanted: Do you like this project? Using it? Let's make it better! S

626 Nov 20, 2022

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ⚖️ 🏆 🧑‍🎓 👩‍⚖️ Dataset Summary Inspired by the recent widespread use of th

95 Dec 8, 2022

Mayan EDMS is a document management system.

Mayan EDMS is a document management system. Its main purpose is to store, introspect, and categorize files, with a strong emphasis on preserving the contextual and business information of documents. It can also OCR, preview, label, sign, send, and receive thoses files.

3 Oct 2, 2021

Muzic: Music Understanding and Generation with Artificial Intelligence

Muzic is a research project on AI music that empowers music understanding and generation with deep learning and artificial intelligence.

2.6k Dec 30, 2022

docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

1.5k Jan 1, 2023

🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained mo

77.2k Jan 2, 2023

Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)

Realtime Multi-Person Pose Estimation By Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh. Introduction Code repo for winning 2016 MSCOCO Keypoints Cha

4.9k Dec 31, 2022

The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.

5 Mar 27, 2022

Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021

39 Jan 1, 2023

Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

DeepPanoContext (DPC) [Project Page (with interactive results)][Paper] DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context G

66 Nov 16, 2022

Code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

28 Nov 10, 2022

Python Document-understanding Resources

Python document-understanding Libraries

PyTorch implementation of the Pose Residual Network (PRN)

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

MPNet: Masked and Permuted Pre-training for Language Understanding

LUKE -- Language Understanding with Knowledge-based Embeddings

MonoScene: Monocular 3D Semantic Scene Completion

Code & Models for Temporal Segment Networks (TSN) in ECCV 2016

Understanding the Difficulty of Training Transformers

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization (ACL 2021)

Multi-Task Deep Neural Networks for Natural Language Understanding

Pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"

[NeurIPS 2021] Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples | ⛰️⚠️

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

This repository contains the implementation of the paper: Federated Distillation of Natural Language Understanding with Confident Sinkhorns

Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API

A lightweight and fast-to-use Markdown document generator based on Python

PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

MMDA - multimodal document analysis

PhD document for navlab

A Discord token grabber executing in a Microsoft Document.

Official Codes for Graph Modularity:Towards Understanding the Cross-Layer Transition of Feature Representations in Deep Neural Networks.

Malicious Document IoC Extractor is a collection of scripts that helps extracting IoCs from various maldoc families.

OpenMMLab Text Detection, Recognition and Understanding Toolbox

Learning Logic Rules for Document-Level Relation Extraction

Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Detectron2 for Document Layout Analysis

[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

This repo contains implementation of different architectures for emotion recognition in conversations.

XLNet: Generalized Autoregressive Pretraining for Language Understanding

TensorFlow code and pre-trained models for BERT

AI-powered literature discovery and review engine for medical/scientific papers

The implementation of DeBERTa

This is the repository for our paper SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking

[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)

Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.

A flexible submap-based framework towards spatio-temporally consistent volumetric mapping and scene understanding.

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

Understanding Convolution for Semantic Segmentation

Understanding the Generalization Benefit of Model Invariance from a Data Perspective

Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production

This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text.

Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning

[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

MDETR: Modulated Detection for End-to-End Multi-Modal Understanding

Document blur detection based on Laplacian operator and text detection.

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

Python-based tools for document analysis and OCR

An Open-Source Toolkit for Prompt-Learning.

LSB Image Steganography Using Python

Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language

DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021]

EssentialMC2 Video Understanding

A document scanner application for laptops/desktops developed using python, Tkinter and OpenCV.

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation

Generate custom detailed survey paper with topic clustered sections and proper citations, from just a single query in just under 30 mins !!

🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

Table automatically extraction from PDF Document

2021 CCF BDCI 全国信息检索挑战杯（CCIR-Cup）智能人机交互自然语言理解赛道第二名参赛解决方案

Key information extraction from invoice document with Graph Convolution Network

Understanding Convolutional Neural Networks from Theoretical Perspective via Volterra Convolution

JittorVis - Visual understanding of deep learning models

On-device speech-to-intent engine powered by deep learning

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

YouRefIt: Embodied Reference Understanding with Language and Gesture

Watson Natural Language Understanding and Knowledge Studio

Mengzi Pretrained Models

The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

This repository contains the code for the paper in EMNLP 2021: "HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression".

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Styled Handwritten Text Generation with Transformers (ICCV 21)

Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.

Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training".

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation