1592 Repositories
Python ocr-open-dataset Libraries
Multiple paper open-source codes of the Microsoft Research Asia DKI group
📫 Paper Code Collection (MSRA DKI Group) This repo hosts multiple open-source codes of the Microsoft Research Asia DKI Group. You could find the corr
WRENCH: Weak supeRvision bENCHmark
🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development
PyTorch implementation of CVPR 2020 paper (Reference-Based Sketch Image Colorization using Augmented-Self Reference and Dense Semantic Correspondence) and pre-trained model on ImageNet dataset
Reference-Based-Sketch-Image-Colorization-ImageNet This is a PyTorch implementation of CVPR 2020 paper (Reference-Based Sketch Image Colorization usin
Launchr is an open source SaaS starter kit, based on Django.
Launchr Launchr is an open source SaaS starter kit. About Launchr is a fully-equipped starter template, ready to start a SaaS web app. It implements t
Hand Gesture Volume Control | Open CV | Computer Vision
Gesture Volume Control Hand Gesture Volume Control | Open CV | Computer Vision Use gesture control to change the volume of a computer. First we look i
Distiller is an open-source Python package for neural network compression research.
Wiki and tutorials | Documentation | Getting Started | Algorithms | Design | FAQ Distiller is an open-source Python package for neural network compres
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
This is the open-source reference implementation of the SIGGRAPH 2021 paper Intersection-free Rigid Body Dynamics.
Robust, intersection-free, simulations of rigid bodies.
Open Resource Calculator Module for Python
Calculator Module for Python By White Night Install #
Ross Virtual Assistant is a programme which can play Music, search Wikipedia, open Websites and much more.
Ross-Virtual-Assistant Ross Virtual Assistant is a programme which can play Music, search Wikipedia, open Websites and much more. Installation Downloa
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining
The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining
A crowdsourced dataset of dialogues grounded in social contexts involving utilization of commonsense.
A crowdsourced dataset of dialogues grounded in social contexts involving utilization of commonsense.
In this repository we have tested 3 VQA models on the ImageCLEF-2019 dataset.
Med-VQA In this repository we have tested 3 VQA models on the ImageCLEF-2019 dataset. Two of these are made on top of Facebook AI Reasearch's Multi-Mo
A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.
AnimeGAN A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing. Randomly Generated Images The images are
Pixel-wise segmentation on VOC2012 dataset using pytorch.
PiWiSe Pixel-wise segmentation on the VOC2012 dataset using pytorch. FCN SegNet PSPNet UNet RefineNet For a more complete implementation of segmentati
The tool to make NLP datasets ready to use
chazutsu photo from Kaikado, traditional Japanese chazutsu maker chazutsu is the dataset downloader for NLP. import chazutsu r = chazutsu.data
Faker is a Python package that generates fake data for you.
Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in yo
Kornia is a open source differentiable computer vision library for PyTorch.
Open Source Differentiable Computer Vision Library
Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"
Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"
💛 Code and Dataset for our EMNLP 2021 paper: "Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes"
Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes Official PyTorch implementation and EmoCause evaluatio
S3-plugin is a high performance PyTorch dataset library to efficiently access datasets stored in S3 buckets.
S3-plugin is a high performance PyTorch dataset library to efficiently access datasets stored in S3 buckets.
Brawl Stars private server for version 30.242
Brawl Stars v30 Brawl Stars v30.242 server emulator written in Python. Requirements: Python 3.7 or higher pymongo dnspython colorama Running the serve
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.
Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.
Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021
This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.
This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.
Open Source CRM based on Django
Django-CRM Django CRM is opensource CRM developed on django framework. It has all the basic features of CRM to start with. We welcome code contributio
CDLA: A Chinese document layout analysis (CDLA) dataset
CDLA: A Chinese document layout analysis (CDLA) dataset 介绍 CDLA是一个中文文档版面分析数据集,面向中文文献类(论文)场景。包含以下10个label: 正文 标题 图片 图片标题 表格 表格标题 页眉 页脚 注释 公式 Text Title
Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica
Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica. It is free both as in "free beer" and as in "freedom".
A daily updated JSON dataset of all the Open House London venues, events, and metadata
Open House London listings data All of it. Automatically scraped hourly with updates committed to git, autogenerated per-day CSV's, and autogenerated
A set of tools for converting a darknet dataset to COCO format working with YOLOX
darknet格式数据→COCO darknet训练数据目录结构(详情参见dataset/darknet): darknet ├── class.names ├── gen_config.data ├── gen_train.txt ├── gen_valid.txt └── images
An open source API to validate the EU Covid Certificates / Green Certificates
Open Covid Certificate Validator This an open source API to validate EU Digital COVID Certificates. It receives a COVID certificate and validates it u
An open-source Python project series where beginners can contribute and practice coding.
Python Mini Projects A collection of easy Python small projects to help you improve your programming skills. Table Of Contents Aim Of The Project Cont
Experimental Brawl Stars v36.218 server emulator written in Python.
Brawl Stars v36 Experimental Brawl Stars v36.218 server emulator written in Python. Requirements: Python 3.7 or higher colorama Running the server In
Experimental Brawl Stars v37.222 server emulator written in Python.
Brawl Stars v37 Experimental Brawl Stars v37.222 server emulator written in Python. Requirements: Python 3.7 or higher colorama Running the server In
Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code
A Python framework for creating reproducible, maintainable and modular data science code.
A Screen Translator/OCR Translator made by using Python and Tesseract, the user interface are made using Tkinter. All code written in python.
About An OCR translator tool. Made by me by utilizing Tesseract, compiled to .exe using pyinstaller. I made this program to learn more about python. I
A multilingual version of MS MARCO passage ranking dataset
mMARCO A multilingual version of MS MARCO passage ranking dataset This repository presents a neural machine translation-based method for translating t
customer care chatbot made with Rasa Open Source.
Customer Care Bot Customer care bot for ecomm company which can solve faq and chitchat with users, can contact directly to team. 🛠 Features Basic E-c
A simple Python library to integrate with the Heron Data API
Heron Python This library provides easy access to the Heron Data API from applications written in Python. Documentation No language-specific docs are
step by step guide for beginners for getting started with open source
Step-by-Step Guide for beginners for getting started with Open-Source Here The Contribution Begins 💻 If you are a beginner then this repository is fo
The FinQA dataset from paper: FinQA: A Dataset of Numerical Reasoning over Financial Data
Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"
This repository contains Python Projects for Beginners as well as for Intermediate Developers built by Contributors.
Python Projects {Open Source} Introduction The repository was built with a tree-like structure in mind, it contains collections of Python Projects. Mo
PIZZA - a task-oriented semantic parsing dataset
The PIZZA dataset continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents.
A free and open-source discord webhook spammer.
Discord-Webhook-Spammer A free and open-source discord webhook spammer. Usage Depending on your python installation your commands may vary. Below are
An open source CMS, in python and integrable in Django
Python CMS based on the Django Framework
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
img2dataset Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine. Also supports
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on table detection and table structure recognition.
WTW-Dataset This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on ICCV 2021. Here, you can download the
The open source version of Tentro - A multipurpose Discord bot.
Welcome to Tentro 👋 A multipurpose Discord bot. 🏠 Homepage Install pip install -r requirements.txt Usage py Tentro.py Contributors 👤 Tentro Dev Tea
Generate text line images for training deep learning OCR model (e.g. CRNN)
Generate text line images for training deep learning OCR model (e.g. CRNN)
Open Source Tool - Cybersecurity Graph Database in Neo4j
GraphKer Open Source Tool - Cybersecurity Graph Database in Neo4j |G|r|a|p|h|K|e|r| { open source tool for a cybersecurity graph database in neo4j } W
💻 Open recent VS Code folders and files using Ulauncher
ulauncher-vscode-recent 💻 Open recent VS Code folders and files using Ulauncher. Quickly open recently-opened VS Code project directories and files.
The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal and to generate far-field speech data using room impulse response data from BUT Speech@FIT Reverb Database.
Add_noise_and_rir_to_speech The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal
Python scripts for performing stereo depth estimation using the HITNET Tensorflow model.
HITNET-Stereo-Depth-estimation Python scripts for performing stereo depth estimation using the HITNET Tensorflow model from Google Research. Stereo de
Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"
Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,
Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments
Cross-Quality Labeled Faces in the Wild (XQLFW) Here, we release the database, evaluation protocol and code for the following paper: Cross Quality LFW
Parametric open source reconstructions of Voron printed parts
The Parametric Voron This repository contains Fusion 360 reconstructions of various printed parts from the Voron printers
This is a Innexia Chat Bot Open Source Code 🤬
⚡ Innexia ⚡ A Powerful, Smart And Simple Chat Bot ... Written with Python... Available on Telegram as @InnexiaChatBot ❤️ Support ⭐️ Thanks to everyone
This repo uses a combination of logits and feature distillation method to teach the PSPNet model of ResNet18 backbone with the PSPNet model of ResNet50 backbone. All the models are trained and tested on the PASCAL-VOC2012 dataset.
PSPNet-logits and feature-distillation Introduction This repository is based on PSPNet and modified from semseg and Pixelwise_Knowledge_Distillation_P
A novel benchmark dataset for Monocular Layout prediction
AutoLay AutoLay: Benchmarking Monocular Layout Estimation Kaustubh Mani, N. Sai Shankar, J. Krishna Murthy, and K. Madhava Krishna Abstract In this pa
VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)
Preparation Please see dataset/README.md to get more details about our datasets-VIL100 Please see INSTALL.md to install environment and evaluation too
the code for paper "Energy-Based Open-World Uncertainty Modeling for Confidence Calibration"
EOW-Softmax This code is for the paper "Energy-Based Open-World Uncertainty Modeling for Confidence Calibration". Accepted by ICCV21. Usage Commnd exa
Learning Open-World Object Proposals without Learning to Classify
Learning Open-World Object Proposals without Learning to Classify Pytorch implementation for "Learning Open-World Object Proposals without Learning to
Check subdomains for Open S3 buckets
SuBuket v1.0 Check subdomains for Open S3 buckets Coded by kaiz3n Basically, this tool makes use of another tool (sublist3r) to fetch subdomains, and
Official implementation of particle-based models (GNS and DPI-Net) on the Physion dataset.
Physion: Evaluating Physical Prediction from Vision in Humans and Machines [paper] Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Y
Docker is an open platform for developing, shipping, and running applications OS-level virtualization to deliver software in packages called containers However, 'security' is a top request on Docker's public roadmap This project aims at vulnerability check for such docker containers. New contributions are accepted
Docker-Vulnerability-Check Docker is an open platform for developing, shipping, and running applications OS-level virtualization to deliver software i
EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections Ruiqi Zhong, Kristy Lee*, Zheng Zhang*, Dan Klein EMN
LLVIP: A Visible-infrared Paired Dataset for Low-light Vision
LLVIP: A Visible-infrared Paired Dataset for Low-light Vision Project | Arxiv | Abstract It is very challenging for various visual tasks such as image
NUANCED is a user-centric conversational recommendation dataset that contains 5.1k annotated dialogues and 26k high-quality user turns.
NUANCED: Natural Utterance Annotation for Nuanced Conversation with Estimated Distributions Overview NUANCED is a user-centric conversational recommen
⚖️ A Statutory Article Retrieval Dataset in French.
A Statutory Article Retrieval Dataset in French This repository contains the Belgian Statutory Article Retrieval Dataset (BSARD), as well as the code
An open source Jetson Nano baseboard and tools to design your own.
My Jetson Nano Baseboard This basic baseboard gives the user the foundation and the flexibility to design their own baseboard for the Jetson Nano. It
GSoC'2021 | TensorFlow implementation of Wav2Vec2
GSoC'2021 | TensorFlow implementation of Wav2Vec2
PyNIF3D is an open-source PyTorch-based library for research on neural implicit functions (NIF)-based 3D geometry representation.
PyNIF3D is an open-source PyTorch-based library for research on neural implicit functions (NIF)-based 3D geometry representation. It aims to accelerate research by providing a modular design that allows for easy extension and combination of NIF-related components, as well as readily available paper implementations and dataset loaders.
TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.
This project is a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.
A pytorch implementation of Reading Wikipedia to Answer Open-Domain Questions.
DrQA A pytorch implementation of the ACL 2017 paper Reading Wikipedia to Answer Open-Domain Questions (DrQA). Reading comprehension is a task to produ
WRENCH: Weak supeRvision bENCHmark
🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development
An open-source, low-cost, image-based weed detection device for fallow scenarios.
Welcome to the OpenWeedLocator (OWL) project, an opensource hardware and software green-on-brown weed detector that uses entirely off-the-shelf compon
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Speech-Backbones This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab. Grad-TTS Official implementation of the Grad-
Open-source Monocular Python HawkEye for Tennis
Tennis Tracking 🎾 Objectives Track the ball Detect court lines Detect the players To track the ball we used TrackNet - deep learning network for trac
🔍 IRIS: An open-source intelligence framework
IRIS is an open-source OSINT framework, consisting of modules to find information about a target by scraping sites and fetching data from APIs.
一个多语言支持、易使用的 OCR 项目。An easy-to-use OCR project with multilingual support.
AgentOCR 简介 AgentOCR 是一个基于 PaddleOCR 和 ONNXRuntime 项目开发的一个使用简单、调用方便的 OCR 项目 本项目目前包含 Python Package 【AgentOCR】 和 OCR 标注软件 【AgentOCRLabeling】 使用指南 Pytho
PyTorch DepthNet Training on Still Box dataset
DepthNet training on Still Box Project page This code can replicate the results of our paper that was published in UAVg-17. If you use this repo in yo
PyTorch experiments with the Zalando fashion-mnist dataset
zalando-pytorch PyTorch experiments with the Zalando fashion-mnist dataset Project Organization ├── LICENSE ├── Makefile - Makefile with co
YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset
YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics open-source research int
Korean Simple Contrastive Learning of Sentence Embeddings using SKT KoBERT and kakaobrain KorNLU dataset
KoSimCSE Korean Simple Contrastive Learning of Sentence Embeddings implementation using pytorch SimCSE Installation git clone https://github.com/BM-K/
pix2tex: Using a ViT to convert images of equations into LaTeX code.
The goal of this project is to create a learning based system that takes an image of a math formula and returns corresponding LaTeX code.
Progressive Coordinate Transforms for Monocular 3D Object Detection
Progressive Coordinate Transforms for Monocular 3D Object Detection This repository is the official implementation of PCT. Introduction In this paper,
A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"
VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video
FastReID is a research platform that implements state-of-the-art re-identification algorithms.
FastReID is a research platform that implements state-of-the-art re-identification algorithms.
OpenStickFirmware is open source software designed to handle any and all tasks required in a custom Fight Stick
OpenStickFirmware is open source software designed to handle any and all tasks required in a custom Fight Stick. It can handle being the brains of your entire stick, or just handling the bells and whistles while your Brook board talks to your console.
The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).
The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).
Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment (ICCV2021)
Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment This is a pytorch project for the paper Seeing Dynamic Scene i
Open source implementation of AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing
AceNAS This repo is the experiment code of AceNAS, and is not considered as an official release. We are working on integrating AceNAS as a built-in st
CredData is a set of files including credentials in open source projects
CredData is a set of files including credentials in open source projects. CredData includes suspicious lines with manual review results and more information such as credential types for each suspicious line. CredData can be used to develop new tools or improve existing tools. Furthermore, using the benchmark result of the CredData, users can choose a proper tool among open source credential scanning tools according to their use case.
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset
Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da
GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot
GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model -- based on GPT-3, called GPT-Codex -- that is fine-tuned on publicly available code from GitHub.
HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021)
Code for HDR Video Reconstruction HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021) Guanying Chen, Cha
This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX
The goal of Project CodeNet is to provide the AI-for-Code research community with a large scale, diverse, and high quality curated dataset to drive innovation in AI techniques.
We evaluate our method on different datasets (including ShapeNet, CUB-200-2011, and Pascal3D+) and achieve state-of-the-art results, outperforming all the other supervised and unsupervised methods and 3D representations, all in terms of performance, accuracy, and training time.
An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering Papers with code | Paper Nikola Zubić Pietro Lio University