1617 Python Train-Test-Validation-Dataset-Generation Libraries

S3-plugin is a high performance PyTorch dataset library to efficiently access datasets stored in S3 buckets.

138 Jan 3, 2023

Test utility for validating OpenAPI documentation

DRF OpenAPI Tester This is a test utility to validate DRF Test Responses against OpenAPI 2 and 3 schema. It has built-in support for: OpenAPI 2/3 yaml

106 Jan 5, 2023

Sane and flexible OpenAPI 3 schema generation for Django REST framework.

drf-spectacular Sane and flexible OpenAPI 3.0 schema generation for Django REST framework. This project has 3 goals: Extract as much schema informatio

1.4k Jan 8, 2023

Automated generation of real Swagger/OpenAPI 2.0 schemas from Django REST Framework code.

drf-yasg - Yet another Swagger generator Generate real Swagger/OpenAPI 2.0 specifications from a Django Rest Framework API. Compatible with Django Res

3k Dec 31, 2022

🎻 Modularized exploit generation framework

Modularized exploit generation framework for x86_64 binaries Overview This project is still at early stage of development, so you might want to come b

30 Jan 17, 2022

My attempt to reverse the Discord nitro token generation function.

discord-theory-I PART: I My attempt to reverse the Discord nitro token generation function. The Nitro generation tools thing is common in Discord now,

29 Aug 14, 2022

Official implementation of "StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation" (SIGGRAPH 2021)

StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation This repository contains the official PyTorch implementation of the following

270 Dec 30, 2022

Unique image & metadata generation using weighted layer collections.

nft-generator-py nft-generator-py is a python based NFT generator which programatically generates unique images using weighted layer files. The progra

243 Dec 31, 2022

Demonstration that AWS IAM policy evaluation docs are incorrect

The flowchart from the AWS IAM policy evaluation documentation page, as of 2021-09-12, and dating back to at least 2018-12-27, is the following: The f

15 Oct 21, 2022

Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021

39 Jan 1, 2023

Code for the paper "Flexible Generation of Natural Language Deductions"

12 Nov 11, 2022

CDLA: A Chinese document layout analysis (CDLA) dataset

CDLA: A Chinese document layout analysis (CDLA) dataset 介绍 CDLA是一个中文文档版面分析数据集，面向中文文献类（论文）场景。包含以下10个label：正文标题图片图片标题表格表格标题页眉页脚注释公式 Text Title

84 Dec 28, 2022

[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

InsGen - Data-Efficient Instance Generation from Instance Discrimination Data-Efficient Instance Generation from Instance Discrimination Ceyuan Yang,

GenForce: May Generative Force Be with You

93 Dec 25, 2022

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

CPT This repository contains code and checkpoints for CPT. CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Gener

341 Dec 29, 2022

A daily updated JSON dataset of all the Open House London venues, events, and metadata

Open House London listings data All of it. Automatically scraped hourly with updates committed to git, autogenerated per-day CSV's, and autogenerated

4 Jan 1, 2022

A set of tools for converting a darknet dataset to COCO format working with YOLOX

darknet格式数据→COCO darknet训练数据目录结构（详情参见dataset/darknet）： darknet ├── class.names ├── gen_config.data ├── gen_train.txt ├── gen_valid.txt └── images

148 Jan 3, 2023

Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation (ICCV 2021)

Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation Home | PyTorch BigGAN Discovery | TensorFlow ProGAN Regulariza

54 Dec 30, 2022

ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models (ICCV 2021 Oral)

ILVR + ADM This is the implementation of ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models (ICCV 2021 Oral). This repository is h

225 Dec 28, 2022

Klara is a static analysis tools to automatic generate test case, based on SMT (z3) solver, with a powerful ast level inference system.

Automatic test case generation for python and static analysis library

253 Dec 21, 2022

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

18 Dec 9, 2022

This is a project for socks card label validation where the socks card is validated comparing with the correct socks card whose coordinates are stored in the database. When the test socks card is compared with the correct socks card(master socks card) the software checks whether both test and master socks card mathches or not.

Automation_in_socks_label_validation THEME: MACHINE LEARNING This is a project for socks card label validation where the socks card is validated compa

3 Apr 21, 2022

NLPIR tutorial: pretrain for IR. pre-train on raw textual corpus, fine-tune on MS MARCO Document Ranking

pretrain4ir_tutorial NLPIR tutorial: pretrain for IR. pre-train on raw textual corpus, fine-tune on MS MARCO Document Ranking 用作NLPIR实验室, Pre-training

12 Apr 7, 2022

Cookies is a project inspired by python cookiecutter but used for terraform generation.

Introduction Cookies is a project inspired by python cookiecutter but used for terraform generation. How to run your terraform After you download your

6 Mar 14, 2022

Find existing email addresses by nickname using API/SMTP checking methods without user notification. Please, don't hesitate to improve cat's job! 🐱🔎 📬

mailcat The only cat who can find existing email addresses by nickname. Usage First install requirements: pip3 install -r requirements.txt Then just

282 Dec 30, 2022

A multilingual version of MS MARCO passage ranking dataset

mMARCO A multilingual version of MS MARCO passage ranking dataset This repository presents a neural machine translation-based method for translating t

75 Dec 27, 2022

Experimental python optimistic rollup fraud-proof generation

Macula Experimental python optimistic rollup fraud-proof generation tech by @protolambda. Working on a python version for brevity and simplicity. See

30 Sep 1, 2022

Related resources for our EMNLP 2021 paper Plan-then-Generate: Controlled Data-to-Text Generation via Planning

Plan-then-Generate: Controlled Data-to-Text Generation via Planning Authors: Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, and Nigel Collier Code

61 Jan 3, 2023

Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019)

Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019) Introduction Official implementation of Dynamic Multi-scale Filters for Semant

23 Oct 21, 2022

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

EEND-vector clustering The EEND-vector clustering (End-to-End-Neural-Diarization-vector clustering) is a speaker diarization framework that integrates

45 Dec 26, 2022

Official Pytorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.

This repository is the official Pytorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.

101 Dec 30, 2022

load .txt to train YOLOX, same as Yolo others

YOLOX train your data you need generate data.txt like follow format (per line- one image). prepare one data.txt like this: img_path1 x1,y1,x2,y2,clas

18 Aug 18, 2022

Demonstrates how to divide a DL model into multiple IR model files (division) and introduce a simplest way to implement a custom layer works with OpenVINO IR models.

Demonstration of OpenVINO techniques - Model-division and a simplest-way to support custom layers Description: Model Optimizer in Intel(r) OpenVINO(tm

12 Nov 9, 2022

Code for lyric-section-to-comment generation based on huggingface transformers.

CommentGeneration Code for lyric-section-to-comment generation based on huggingface transformers. Migrate Guyu model and code (both 12-layers and 24-l

8 Sep 4, 2021

UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities

Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.)

7.6k Jan 1, 2023

A library for interacting with Path of Exile game and economy data, and a unique loot filter generation framework.

wraeblast A library for interfacing with Path of Exile game and economy data, and a set of item filters geared towards trade league players. Filter Ge

29 Aug 28, 2022

neural image generation

pixray Pixray is an image generation system. It combines previous ideas including: Perception Engines which uses image augmentation and iteratively op

398 Dec 17, 2022

The FinQA dataset from paper: FinQA: A Dataset of Numerical Reasoning over Financial Data

Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"

114 Dec 29, 2022

PIZZA - a task-oriented semantic parsing dataset

The PIZZA dataset continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents.

17 Dec 14, 2022

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

img2dataset Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine. Also supports

1.4k Jan 1, 2023

pytest-django allows you to test your Django project/applications with the pytest testing tool.

1.1k Dec 14, 2022

This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on table detection and table structure recognition.

WTW-Dataset This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on ICCV 2021. Here, you can download the

109 Dec 29, 2022

borb is a library for reading, creating and manipulating PDF files in python.

2.9k Jan 1, 2023

The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal and to generate far-field speech data using room impulse response data from BUT Speech@FIT Reverb Database.

Add_noise_and_rir_to_speech The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal

7 Oct 30, 2022

Train GPT-3 model on V100(16GB Mem) Using improved Transformer.

GPT-X using transformer pytorch

24 Sep 11, 2022

Python scripts for performing stereo depth estimation using the HITNET Tensorflow model.

HITNET-Stereo-Depth-estimation Python scripts for performing stereo depth estimation using the HITNET Tensorflow model from Google Research. Stereo de

76 Jan 2, 2023

Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,

91 Nov 22, 2022

Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments

Cross-Quality Labeled Faces in the Wild (XQLFW) Here, we release the database, evaluation protocol and code for the following paper: Cross Quality LFW

10 Dec 12, 2022

This repo uses a combination of logits and feature distillation method to teach the PSPNet model of ResNet18 backbone with the PSPNet model of ResNet50 backbone. All the models are trained and tested on the PASCAL-VOC2012 dataset.

PSPNet-logits and feature-distillation Introduction This repository is based on PSPNet and modified from semseg and Pixelwise_Knowledge_Distillation_P

6 Dec 1, 2022

Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming soon!

ToxiChat Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Install depen

11 Jan 1, 2023

A novel benchmark dataset for Monocular Layout prediction

AutoLay AutoLay: Benchmarking Monocular Layout Estimation Kaustubh Mani, N. Sai Shankar, J. Krishna Murthy, and K. Madhava Krishna Abstract In this pa

39 Apr 26, 2022

VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)

Preparation Please see dataset/README.md to get more details about our datasets-VIL100 Please see INSTALL.md to install environment and evaluation too

82 Dec 15, 2022

Unoffical implementation about Image Super-Resolution via Iterative Refinement by Pytorch

Image Super-Resolution via Iterative Refinement Paper | Project Brief This is a unoffical implementation about Image Super-Resolution via Iterative Re

2.5k Jan 2, 2023

Official implementation of particle-based models (GNS and DPI-Net) on the Physion dataset.

Physion: Evaluating Physical Prediction from Vision in Humans and Machines [paper] Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Y

18 Dec 19, 2022

Codes to pre-train Japanese T5 models

t5-japanese Codes to pre-train a T5 (Text-to-Text Transfer Transformer) model pre-trained on Japanese web texts. The model is available at https://hug

37 Dec 25, 2022

EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections Ruiqi Zhong, Kristy Lee*, Zheng Zhang*, Dan Klein EMN

42 Nov 3, 2022

Unofficial PyTorch Implementation of UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

UnivNet UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation This is an unofficial PyTorch

54 Aug 30, 2021

Official implementation of the MM'21 paper Constrained Graphic Layout Generation via Latent Optimization

[MM'21] Constrained Graphic Layout Generation via Latent Optimization This repository provides the official code for the paper "Constrained Graphic La

73 Dec 27, 2022

The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".

Deep Exemplar-based Video Colorization (Pytorch Implementation) Paper | Pretrained Model | Youtube video 🔥 | Colab demo Deep Exemplar-based Video Col

253 Dec 27, 2022

LLVIP: A Visible-infrared Paired Dataset for Low-light Vision

LLVIP: A Visible-infrared Paired Dataset for Low-light Vision Project | Arxiv | Abstract It is very challenging for various visual tasks such as image

377 Jan 7, 2023

NUANCED is a user-centric conversational recommendation dataset that contains 5.1k annotated dialogues and 26k high-quality user turns.

NUANCED: Natural Utterance Annotation for Nuanced Conversation with Estimated Distributions Overview NUANCED is a user-centric conversational recommen

18 Dec 28, 2021

GraphGT: Machine Learning Datasets for Graph Generation and Transformation

GraphGT: Machine Learning Datasets for Graph Generation and Transformation Dataset Website | Paper Installation Using pip To install the core environm

50 Aug 18, 2022

⚖️ A Statutory Article Retrieval Dataset in French.

A Statutory Article Retrieval Dataset in French This repository contains the Belgian Statutory Article Retrieval Dataset (BSARD), as well as the code

19 Nov 17, 2022

GSoC'2021 | TensorFlow implementation of Wav2Vec2

73 Nov 28, 2022

Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks"

Train longer, generalize better - Big batch training This is a code repository used to generate the results appearing in "Train longer, generalize bet

145 Sep 16, 2022

PyTorch implementation of SampleRNN: An Unconditional End-to-End Neural Audio Generation Model

samplernn-pytorch A PyTorch implementation of SampleRNN: An Unconditional End-to-End Neural Audio Generation Model. It's based on the reference implem

261 Dec 14, 2022

Generate a bunch of malicious pdf files with phone-home functionality. Can be used with Burp Collaborator

Malicious PDF Generator ☠️ Generate ten different malicious pdf files with phone-home functionality. Can be used with Burp Collaborator. Used for pene

1.9k Jan 1, 2023

WRENCH: Weak supeRvision bENCHmark

🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development

176 Dec 28, 2022

Generate vibrant and detailed images using only text.

CLIP Guided Diffusion From RiversHaveWings. Generate vibrant and detailed images using only text. See captions and more generations in the Gallery See

401 Dec 28, 2022

Simple tool/toolkit for evaluating NLG (Natural Language Generation) offering various automated metrics.

Simple tool/toolkit for evaluating NLG (Natural Language Generation) offering various automated metrics. Jury offers a smooth and easy-to-use interface. It uses datasets for underlying metric computation, and hence adding custom metric is easy as adopting datasets.Metric.

129 Jan 6, 2023

Music Source Separation; Train & Eval & Inference piplines and pretrained models we used for 2021 ISMIR MDX Challenge.

Music Source Separation with Channel-wise Subband Phase Aware ResUnet (CWS-PResUNet) Introduction This repo contains the pretrained Music Source Separ

100 Dec 25, 2022

A Survey of Natural Language Generation in Task-Oriented Dialogue System (TOD): Recent Advances and New Frontiers

132 Nov 25, 2022

PyTorch DepthNet Training on Still Box dataset

DepthNet training on Still Box Project page This code can replicate the results of our paper that was published in UAVg-17. If you use this repo in yo

115 Nov 21, 2022

PyTorch experiments with the Zalando fashion-mnist dataset

zalando-pytorch PyTorch experiments with the Zalando fashion-mnist dataset Project Organization ├── LICENSE ├── Makefile - Makefile with co

31 Sep 25, 2021

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models. Everything is pure Python and PyTorch based to keep it as simple and beginner-friendly, yet powerful as possible.

Digital Phonetics at the University of Stuttgart

247 Jan 5, 2023

YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset

YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics open-source research int

73 Dec 16, 2022

Korean Simple Contrastive Learning of Sentence Embeddings using SKT KoBERT and kakaobrain KorNLU dataset

KoSimCSE Korean Simple Contrastive Learning of Sentence Embeddings implementation using pytorch SimCSE Installation git clone https://github.com/BM-K/

34 Nov 24, 2022

pix2tex: Using a ViT to convert images of equations into LaTeX code.

The goal of this project is to create a learning based system that takes an image of a math formula and returns corresponding LaTeX code.

2.6k Dec 30, 2022

Progressive Coordinate Transforms for Monocular 3D Object Detection

Progressive Coordinate Transforms for Monocular 3D Object Detection This repository is the official implementation of PCT. Introduction In this paper,

58 Nov 6, 2022

A weakly-supervised scene graph generation codebase. The implementation of our CVPR2021 paper ``Linguistic Structures as Weak Supervision for Visual Scene Graph Generation''

README.md shall be finished soon. WSSGG 0 Overview 1 Installation 1.1 Faster-RCNN 1.2 Language Parser 1.3 GloVe Embeddings 2 Settings 2.1 VG-GT-Graph

35 Nov 20, 2022

A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video

45 Nov 29, 2022

PyTorch implementation of "Learn to Dance with AIST++: Music Conditioned 3D Dance Generation."

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation. Installation pip install -r requirements.txt Prepare Dataset bash data/scripts/pre

8 Sep 7, 2021

Vision-Language Transformer and Query Generation for Referring Segmentation (ICCV 2021)

Vision-Language Transformer and Query Generation for Referring Segmentation Please consider citing our paper in your publications if the project helps

143 Dec 23, 2022

Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment (ICCV2021)

Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment This is a pytorch project for the paper Seeing Dynamic Scene i

21 Nov 28, 2022

Learn meanings behind words is a key element in NLP. This project concentrates on the disambiguation of preposition senses. Therefore, we train a bert-transformer model and surpass the state-of-the-art.

New State-of-the-Art in Preposition Sense Disambiguation Supervisor: Prof. Dr. Alexander Mehler Alexander Henlein Institutions: Goethe University TTLa

4 Apr 6, 2022

Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

4.5k Jan 8, 2023

HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021)

Code for HDR Video Reconstruction HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021) Guanying Chen, Cha

64 Nov 19, 2022

This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX

The goal of Project CodeNet is to provide the AI-for-Code research community with a large scale, diverse, and high quality curated dataset to drive innovation in AI techniques.

1.2k Jan 4, 2023

We evaluate our method on different datasets (including ShapeNet, CUB-200-2011, and Pascal3D+) and achieve state-of-the-art results, outperforming all the other supervised and unsupervised methods and 3D representations, all in terms of performance, accuracy, and training time.

An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering Papers with code | Paper Nikola Zubić Pietro Lio University

213 Dec 27, 2022

FedScale: Benchmarking Model and System Performance of Federated Learning

FedScale: Benchmarking Model and System Performance of Federated Learning (Paper) This repository contains scripts and instructions of building FedSca

268 Jan 1, 2023

Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021)

Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021) Zeyu Wang, Sherry Qiu, Nicole Feng, Holly Rushmeier, Leonard McMill

23 Dec 9, 2022

Text-to-Image generation

Generate vivid Images for Any (Chinese) text CogView is a pretrained (4B-param) transformer for text-to-image generation in general domain. Read our p

1.3k Dec 29, 2022

Video Visual Relation Detection (VidVRD) tracklets generation. also for ACM MM Visual Relation Understanding Grand Challenge

VidVRD-tracklets This repository contains codes for Video Visual Relation Detection (VidVRD) tracklets generation based on MEGA and deepSORT. These tr

25 Dec 21, 2022

LightSeq is a high performance training and inference library for sequence processing and generation implemented in CUDA

LightSeq: A High Performance Library for Sequence Processing and Generation

2.5k Jan 6, 2023

Django-google-optimize is a Django application designed to make running server side Google Optimize A/B tests easy.

Django-google-optimize Django-google-optimize is a Django application designed to make running Google Optimize A/B tests easy. Here is a tutorial on t

39 Oct 25, 2022

Test utility for validating OpenAPI documentation

DRF OpenAPI Tester This is a test utility to validate DRF Test Responses against OpenAPI 2 and 3 schema. It has built-in support for: OpenAPI 2/3 yaml

103 Dec 21, 2022

Test django schema and data migrations, including migrations' order and best practices.

django-test-migrations Features Allows to test django schema and data migrations Allows to test both forward and rollback migrations Allows to test th

382 Dec 27, 2022

Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.

235 Dec 22, 2022

Python Train-Test-Validation-Dataset-Generation Resources

Python Train-Test-Validation-Dataset-Generation Libraries

S3-plugin is a high performance PyTorch dataset library to efficiently access datasets stored in S3 buckets.

Test utility for validating OpenAPI documentation

Sane and flexible OpenAPI 3 schema generation for Django REST framework.

Automated generation of real Swagger/OpenAPI 2.0 schemas from Django REST Framework code.

🎻 Modularized exploit generation framework

My attempt to reverse the Discord nitro token generation function.

Official implementation of "StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation" (SIGGRAPH 2021)

Unique image & metadata generation using weighted layer collections.

Demonstration that AWS IAM policy evaluation docs are incorrect

Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Code for the paper "Flexible Generation of Natural Language Deductions"

CDLA: A Chinese document layout analysis (CDLA) dataset

[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

A daily updated JSON dataset of all the Open House London venues, events, and metadata

A set of tools for converting a darknet dataset to COCO format working with YOLOX

Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation (ICCV 2021)

ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models (ICCV 2021 Oral)

Klara is a static analysis tools to automatic generate test case, based on SMT (z3) solver, with a powerful ast level inference system.

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

NLPIR tutorial: pretrain for IR. pre-train on raw textual corpus, fine-tune on MS MARCO Document Ranking

Cookies is a project inspired by python cookiecutter but used for terraform generation.

Find existing email addresses by nickname using API/SMTP checking methods without user notification. Please, don't hesitate to improve cat's job! 🐱🔎 📬

A multilingual version of MS MARCO passage ranking dataset

Experimental python optimistic rollup fraud-proof generation

Related resources for our EMNLP 2021 paper Plan-then-Generate: Controlled Data-to-Text Generation via Planning

Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019)

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

Official Pytorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.

load .txt to train YOLOX, same as Yolo others

Demonstrates how to divide a DL model into multiple IR model files (division) and introduce a simplest way to implement a custom layer works with OpenVINO IR models.

Code for lyric-section-to-comment generation based on huggingface transformers.

UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities

A library for interacting with Path of Exile game and economy data, and a unique loot filter generation framework.

neural image generation

The FinQA dataset from paper: FinQA: A Dataset of Numerical Reasoning over Financial Data

PIZZA - a task-oriented semantic parsing dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

pytest-django allows you to test your Django project/applications with the pytest testing tool.

This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on table detection and table structure recognition.

borb is a library for reading, creating and manipulating PDF files in python.

The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal and to generate far-field speech data using room impulse response data from BUT Speech@FIT Reverb Database.

Train GPT-3 model on V100(16GB Mem) Using improved Transformer.

Python scripts for performing stereo depth estimation using the HITNET Tensorflow model.

Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments

This repo uses a combination of logits and feature distillation method to teach the PSPNet model of ResNet18 backbone with the PSPNet model of ResNet50 backbone. All the models are trained and tested on the PASCAL-VOC2012 dataset.

Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming soon!

A novel benchmark dataset for Monocular Layout prediction

VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)

Unoffical implementation about Image Super-Resolution via Iterative Refinement by Pytorch

Official implementation of particle-based models (GNS and DPI-Net) on the Physion dataset.

Codes to pre-train Japanese T5 models

EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Unofficial PyTorch Implementation of UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

Official implementation of the MM'21 paper Constrained Graphic Layout Generation via Latent Optimization

The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".

LLVIP: A Visible-infrared Paired Dataset for Low-light Vision

NUANCED is a user-centric conversational recommendation dataset that contains 5.1k annotated dialogues and 26k high-quality user turns.

GraphGT: Machine Learning Datasets for Graph Generation and Transformation

⚖️ A Statutory Article Retrieval Dataset in French.

GSoC'2021 | TensorFlow implementation of Wav2Vec2

Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks"

PyTorch implementation of SampleRNN: An Unconditional End-to-End Neural Audio Generation Model

Generate a bunch of malicious pdf files with phone-home functionality. Can be used with Burp Collaborator

WRENCH: Weak supeRvision bENCHmark

Generate vibrant and detailed images using only text.

Simple tool/toolkit for evaluating NLG (Natural Language Generation) offering various automated metrics.

Music Source Separation; Train & Eval & Inference piplines and pretrained models we used for 2021 ISMIR MDX Challenge.

A Survey of Natural Language Generation in Task-Oriented Dialogue System (TOD): Recent Advances and New Frontiers

PyTorch DepthNet Training on Still Box dataset

PyTorch experiments with the Zalando fashion-mnist dataset

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models

YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset

Korean Simple Contrastive Learning of Sentence Embeddings using SKT KoBERT and kakaobrain KorNLU dataset

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Progressive Coordinate Transforms for Monocular 3D Object Detection

A weakly-supervised scene graph generation codebase. The implementation of our CVPR2021 paper ``Linguistic Structures as Weak Supervision for Visual Scene Graph Generation''