566 Repositories
Python awesome-public-datasets Libraries
Codes for processing meeting summarization datasets AMI and ICSI.
Meeting Summarization Dataset Meeting plays an essential part in our daily life, which allows us to share information and collaborate with others. Wit
🖍️This is a feature-complete clone of the awesome Chalk (JavaScript) library.
Terminal string styling done right This is a feature-complete clone of the awesome Chalk (JavaScript) library. All credits go to Sindre Sorhus. Highli
Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"
Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning This is the Github repository of our paper, "Common S
[IJCAI'21] Deep Automatic Natural Image Matting
Deep Automatic Natural Image Matting [IJCAI-21] This is the official repository of the paper Deep Automatic Natural Image Matting. Introduction | Netw
atmaCup #11 の Public 4th / Pricvate 5th Solution のリポジトリです。
#11 atmaCup 2021-07-09 ~ 2020-07-21 に行われた #11 [初心者歓迎! / 画像編] atmaCup のリポジトリです。結果は Public 4th / Private 5th でした。 フレームワークは PyTorch で、実装は pytorch-image-m
Collect super-resolution related papers, data, repositories
Collect super-resolution related papers, data, repositories
A repository containing an introduction to Panel made to be support videos and talks.
👍 Awesome Panel - Introduction to Panel THIS REPO IS WORK IN PROGRESS. PRE-ALPHA Panel is a very powerful framework for exploratory data analysis and
A tiny, friendly, strong baseline code for Person-reID (based on pytorch).
Pytorch ReID Strong, Small, Friendly A tiny, friendly, strong baseline code for Person-reID (based on pytorch). Strong. It is consistent with the new
A curated list of amazingly awesome Cybersecurity datasets
A curated list of amazingly awesome Cybersecurity datasets
Awesome examples for my Sun Valley ttk theme!
Sun Valley ttk theme examples This is the examples repo for my stunning Sun Valley ttk theme! Be sure to start and atch this repo, because I will uplo
PyTorch implementation of popular datasets and models in remote sensing
PyTorch Remote Sensing (torchrs) (WIP) PyTorch implementation of popular datasets and models in remote sensing tasks (Change Detection, Image Super Re
Deduplicating Training Data Makes Language Models Better
Deduplicating Training Data Makes Language Models Better This repository contains code to deduplicate language model datasets as descrbed in the paper
A collection of research papers and software related to explainability in graph machine learning.
A collection of research papers and software related to explainability in graph machine learning.
A list of awesome PyTorch scholarship articles, guides, blogs, courses and other resources.
Awesome PyTorch Scholarship Resources A collection of awesome PyTorch and Python learning resources. Contributions are always welcome! Course Informat
A selection of State Of The Art research papers (and code) on human locomotion (pose + trajectory) prediction (forecasting)
A selection of State Of The Art research papers (and code) on human trajectory prediction (forecasting). Papers marked with [W] are workshop papers.
Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021)
Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets This is the official PyTorch implementation for the paper Rapid Neural A
Project page for the paper Semi-Supervised Raw-to-Raw Mapping 2021.
Project page for the paper Semi-Supervised Raw-to-Raw Mapping 2021.
A curated list of papers, code and resources pertaining to image composition
A curated list of resources including papers, datasets, and relevant links pertaining to image composition.
A Pytorch implementation of CVPR 2021 paper "RSG: A Simple but Effective Module for Learning Imbalanced Datasets"
RSG: A Simple but Effective Module for Learning Imbalanced Datasets (CVPR 2021) A Pytorch implementation of our CVPR 2021 paper "RSG: A Simple but Eff
chiarose(XCR) based on chia(XCH) source code fork, open source public chain
chia-rosechain 一个无耻的小活动 | A shameless little event 如果您喜欢这个项目,请点击star 将赠送您520朵玫瑰,可以去 facebook 留下您的(xcr)地址,和github用户名。 If you like this project, please
Meerkat provides fast and flexible data structures for working with complex machine learning datasets.
Meerkat makes it easier for ML practitioners to interact with high-dimensional, multi-modal data. It provides simple abstractions for data inspection, model evaluation and model training supported by efficient and robust IO under the hood.
Gamestonk Terminal is an awesome stock and crypto market terminal
Gamestonk Terminal is an awesome stock and crypto market terminal. A FOSS alternative to Bloomberg Terminal.
Hand gesture detection project with aweome UI implementation.
an awesome hand gesture detection project for you to be creative! Imagination is the limit to do with this project.
This a simple tool to query the awesome ippsec.rocks website from your terminal
ippsec-cli This a simple tool to query the awesome ippsec.rocks website from your terminal Installation and usage cd /opt git clone https://github.com
This is the official repository for evaluation on the NoW Benchmark Dataset. The goal of the NoW benchmark is to introduce a standard evaluation metric to measure the accuracy and robustness of 3D face reconstruction methods from a single image under variations in viewing angle, lighting, and common occlusions.
NoW Evaluation This is the official repository for evaluation on the NoW Benchmark Dataset. The goal of the NoW benchmark is to introduce a standard e
Preprocessed Datasets for our Multimodal NER paper
Unified Multimodal Transformer (UMT) for Multimodal Named Entity Recognition (MNER) Two MNER Datasets and Codes for our ACL'2020 paper: Improving Mult
a public repository helping ML/DL engineers and DS to beautify the notebook with minimal coding.
ml-helper-functions a public repository helping ML/DL engineers and DS to beautify the notebook with minimal coding.
Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"
PTR Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification" If you use the code, please cite the following paper: @art
Official PyTorch code for CVPR 2020 paper "Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision"
Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision https://arxiv.org/abs/2003.00393 Abstract Active learning (AL) aims to min
This script books automatically a slot on Doctolib in one of the public vaccination centers in Berlin.
BOOKING IN BERLINS VACCINATION CENTERS This python script books automatically a slot on Doctolib in one of the public vaccination centers in Berlin. T
A growing collection of search plugins for the qBittorrent, an awesome and opensource torrent client
qBittorrent Search Plugins This is a still growing collection of search plugins for qBittorent, an amazing and open source torrent client, maintained
Winning solution of the Indoor Location & Navigation Kaggle competition
This repository contains the code to generate the winning solution of the Kaggle competition on indoor location and navigation organized by Microsoft
A comprehensive reference for all topics related to building and maintaining microservices
This pandect (πανδέκτης is Ancient Greek for encyclopedia) was created to help you find and understand almost anything related to Microservices that i
A Curated Collection of Awesome Python Scripts
A Curated Collection of Awesome Python Scripts that will make you go wow. This repository will help you in getting those green squares. Hop in and enjoy the journey of open source. 🚀
Public implementation of "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression" from CoRL'21
Self-Supervised Reward Regression (SSRR) Codebase for CoRL 2021 paper "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression "
The toolkit to generate auto labeled datasets
Ozeu Ozeu is the toolkit to autolabal dataset for instance segmentation. You can generate datasets labaled with segmentation mask and bounding box fro
🔬 A curated list of awesome machine learning strategies & tools in financial market.
🔬 A curated list of awesome machine learning strategies & tools in financial market.
Official code repository of the paper Learning Associative Inference Using Fast Weight Memory by Schlag et al.
Learning Associative Inference Using Fast Weight Memory This repository contains the offical code for the paper Learning Associative Inference Using F
A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.
A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.
Datasets of Automatic Keyphrase Extraction
This repository contains 20 annotated datasets of Automatic Keyphrase Extraction made available by the research community. Following are the datasets and the original papers that proposed them. If you know more datasets, and want to contribute, please, notify me.
This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the time series forecasting research space.
TSForecasting This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the tim
A public available dataset for road boundary detection in aerial images
Topo-boundary This is the official github repo of paper Topo-boundary: A Benchmark Dataset on Topological Road-boundary Detection Using Aerial Images
covid question answering datasets and fine tuned models
Covid-QA Fine tuned models for question answering on Covid-19 data. Hosted Inference This model has been contributed to huggingface.Click here to see
Combines Bayesian analyses from many datasets.
PosteriorStacker Combines Bayesian analyses from many datasets. Introduction Method Tutorial Output plot and files Introduction Fitting a model to a d
BOOKSUM: A Collection of Datasets for Long-form Narrative Summarization
BOOKSUM: A Collection of Datasets for Long-form Narrative Summarization Authors: Wojciech Kryściński, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong,
A collection of Korean Text Datasets ready to use using Tensorflow-Datasets.
tfds-korean A collection of Korean Text Datasets ready to use using Tensorflow-Datasets. TensorFlow-Datasets를 이용한 한국어/한글 데이터셋 모음입니다. Dataset Catalog |
Cloud-native, data onboarding architecture for the Google Cloud Public Datasets program
Public Datasets Pipelines Cloud-native, data pipeline architecture for onboarding datasets to the Google Cloud Public Datasets Program. Overview Requi
A list of papers about point cloud based place recognition, also known as loop closure detection in SLAM (processing)
A list of papers about point cloud based place recognition, also known as loop closure detection in SLAM (processing)
Awesome anomaly detection in medical images
A curated list of awesome anomaly detection works in medical imaging, inspired by the other awesome-* initiatives.
Awesome Video Datasets
Awesome Video Datasets
flask extension for integration with the awesome pydantic package
flask extension for integration with the awesome pydantic package
Awesome Django Markdown Editor, supported for Bootstrap & Semantic-UI
martor Martor is a Markdown Editor plugin for Django, supported for Bootstrap & Semantic-UI. Features Live Preview Integrated with Ace Editor Supporte
Official implementation of "Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets" (CVPR2021)
Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets This is the official implementation of "Towards Good Pract
A curated list of awesome packages, articles, and other cool resources from the Wagtail community.
Awesome Wagtail A curated list of awesome packages, articles, and other cool resources from the Wagtail community. Wagtail is a Python CMS powered by
laTEX is awesome but we are lazy - groff with markdown syntax and inline code execution
pyGroff A wrapper for groff using python to have a nicer syntax for groff documents DOCUMENTATION Very similar to markdown. So if you know what that i
A curated list of awesome things related to Pydantic! 🌪️
Awesome Pydantic A curated list of awesome things related to Pydantic. These packages have not been vetted or approved by the pydantic team. Feel free
Task-based datasets, preprocessing, and evaluation for sequence models.
SeqIO: Task-based datasets, preprocessing, and evaluation for sequence models. SeqIO is a library for processing sequential data to be fed into downst
Code and datasets for the paper "Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction" (RA-L, 2021)
Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction This is the code for the paper Combining E
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Implementation of "Debiasing Item-to-Item Recommendations With Small Annotated Datasets" (RecSys '20)
Debiasing Item-to-Item Recommendations With Small Annotated Datasets This is the code for our RecSys '20 paper. Other materials can be found here: Ful
🏅 The Most Comprehensive List of Kaggle Solutions and Ideas 🏅
🏅 Collection of Kaggle Solutions and Ideas 🏅
A list of multi-task learning papers and projects.
A list of multi-task learning papers and projects.
[CVPR2021] DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datasets
DoDNet This repo holds the pytorch implementation of DoDNet: DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datase
This repository contains the code for "Generating Datasets with Pretrained Language Models".
Datasets from Instructions (DINO 🦕 ) This repository contains the code for Generating Datasets with Pretrained Language Models. The paper introduces
SSH-Restricted deploys an SSH compliance rule (AWS Config) with auto-remediation via AWS Lambda if SSH access is public.
SSH-Restricted SSH-Restricted deploys an SSH compliance rule with auto-remediation via AWS Lambda if SSH access is public. SSH-Auto-Restricted checks
Draw datasets from within Jupyter.
drawdata This small python app allows you to draw a dataset in a jupyter notebook. This should be very useful when teaching machine learning algorithm
Backend logic implementation for realworld with awesome FastAPI
Backend logic implementation for realworld with awesome FastAPI
Awesome-NLP-Research (ANLP)
Awesome-NLP-Research (ANLP)
A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
This tutorial's purpose is to introduce Pythonistas to methods for scaling their data science and machine learning work to larger datasets and larger models, using the tools and APIs they know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
User friendly Rasterio plugin to read raster datasets.
rio-tiler User friendly Rasterio plugin to read raster datasets. Documentation: https://cogeotiff.github.io/rio-tiler/ Source Code: https://github.com
Client library for interfacing with USGS datasets
USGS API USGS is a python module for interfacing with the US Geological Survey's API. It provides submodules to interact with various endpoints, and c
Summary statistics of geospatial raster datasets based on vector geometries.
rasterstats rasterstats is a Python module for summarizing geospatial raster datasets based on vector geometries. It includes functions for zonal stat
xeuledoc - Fetch information about a public Google document.
xeuledoc - Fetch information about a public Google document.
A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models
wav2vec-toolkit A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models This repository accompanies the
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language This repository contains UA-GEC data and an accompanying Python lib
Recommender System Papers
Included Conferences: SIGIR 2020, SIGKDD 2020, RecSys 2020, CIKM 2020, AAAI 2021, WSDM 2021, WWW 2021
A collection of resources on GAN Inversion.
This repo is a collection of resources on GAN inversion, as a supplement for our survey
A novel method to tune language models. Codes and datasets for paper ``GPT understands, too''.
P-tuning A novel method to tune language models. Codes and datasets for paper ``GPT understands, too''. How to use our code We have released the code
Rank 1st in the public leaderboard of ScanRefer (2021-03-18)
InstanceRefer InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring
Public repo of the bot
wiki-reddit-bot Public repo of u/wikipedia_answer_bot Tools Language: Python Libraries: praw (Reddit API) mediawikiapi (Wikipedia API) tenacity How it
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
imbalanced-learn imbalanced-learn is a python package offering a number of re-sampling techniques commonly used in datasets showing strong between-cla
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Petastorm Contents Petastorm Installation Generating a dataset Plain Python API Tensorflow API Pytorch API Spark Dataset Converter API Analyzing petas
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile
matrixprofile-ts matrixprofile-ts is a Python 2 and 3 library for evaluating time series data using the Matrix Profile algorithms developed by the Keo
Technical Analysis Library using Pandas and Numpy
Technical Analysis Library in Python It is a Technical Analysis library useful to do feature engineering from financial time series datasets (Open, Cl
Rasterio reads and writes geospatial raster datasets
Rasterio Rasterio reads and writes geospatial raster data. Geographic information systems use GeoTIFF and other formats to organize and store gridded,
Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample
Maths from examples - Learning advanced mathematical computations from examples This is the source code and data sets relevant to the paper Learning a
🏆 A ranked list of awesome python libraries for web development. Updated weekly.
Best-of Web Development with Python 🏆 A ranked list of awesome python libraries for web development. Updated weekly. This curated list contains 540 a
Fetch information about a public Google document.
xeuledoc Fetch information about any public Google document. It's working on : Google Docs Google Spreadsheets Google Slides Google Drawning Google My
A curated list of awesome synthetic data for text location and recognition
awesome-SynthText A curated list of awesome synthetic data for text location and recognition and OCR datasets. Text location SynthText SynthText_Chine
A curated list of promising OCR resources
Call for contributor(paper summary,dataset generation,algorithm implementation and any other useful resources) awesome-ocr A curated list of promising
A collection of resources (including the papers and datasets) of OCR (Optical Character Recognition).
OCR Resources This repository contains a collection of resources (including the papers and datasets) of OCR (Optical Character Recognition). Contents
Links to awesome OCR projects
Awesome OCR This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR). Contribution
A curated list of papers and resources for scene text detection and recognition
Awesome Scene Text A curated list of papers and resources for scene text detection and recognition The year when a paper was first published, includin
A curated list of resources for text detection/recognition (optical character recognition ) with deep learning methods.
awesome-deep-text-detection-recognition A curated list of awesome deep learning based papers on text detection and recognition. Text Detection Papers
A general list of resources to image text localization and recognition 场景文本位置感知与识别的论文资源与实现合集 シーンテキストの位置認識と識別のための論文リソースの要約
Scene Text Localization & Recognition Resources Read this institute-wise: English, 简体中文. Read this year-wise: English, 简体中文. Tags: [STL] (Scene Text L
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Dataset Cartography Code for the paper Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics at EMNLP 2020. This repository cont
RANZCR-CLiP 7th Place Solution
RANZCR-CLiP 7th Place Solution This repository is WIP. (18 Mar 2021) Installation git clone https://github.com/analokmaus/kaggle-ranzcr-clip-public.gi
Participants of Bertelsmann Technology Scholarship created an awesome list of resources and they want to share it with the world, if you find illegal resources please report to us and we will remove.
Participants of Bertelsmann Technology Scholarship created an awesome list of resources and they want to share it with the world, if you find illegal
A curated list of neural network pruning resources.
A curated list of neural network pruning and related resources. Inspired by awesome-deep-vision, awesome-adversarial-machine-learning, awesome-deep-learning-papers and Awesome-NAS.