3636 Repositories
Python data-efficient-gan-training Libraries
Continuously evaluated, functional, incremental, time-series forecasting
timemachines Autonomous, univariate, k-step ahead time-series forecasting functions assigned Elo ratings You can: Use some of the functionality of a s
Codes for building and training the neural network model described in Domain-informed neural networks for interaction localization within astroparticle experiments.
Domain-informed Neural Networks Codes for building and training the neural network model described in Domain-informed neural networks for interaction
Code for: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification
Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification Prerequisite PyTorch = 1.2.0 Python3 torch
This repository includes code of my study about Asynchronous in Frequency domain of GAN images.
Exploring the Asynchronous of the Frequency Spectra of GAN-generated Facial Images Binh M. Le & Simon S. Woo, "Exploring the Asynchronous of the Frequ
BLEND: A Fast, Memory-Efficient, and Accurate Mechanism to Find Fuzzy Seed Matches
BLEND is a mechanism that can efficiently find fuzzy seed matches between sequences to significantly improve the performance and accuracy while reducing the memory space usage of two important applications: 1) finding overlapping reads and 2) read mapping. Described by Firtina et al.
On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021))
PTvsBT On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021) Citation Please cite a
Ensembling Off-the-shelf Models for GAN Training
Vision-aided GAN video (3m) | website | paper Can the collective knowledge from a large bank of pretrained vision models be leveraged to improve GAN t
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021
Fine-grained Post-training for Multi-turn Response Selection Implements the model described in the following paper Fine-grained Post-training for Impr
ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data
ARKitScenes This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D
Training and Evaluation Code for Neural Volumes
Neural Volumes This repository contains training and evaluation code for the paper Neural Volumes. The method learns a 3D volumetric representation of
PyTorch implementation of our ICCV 2019 paper: Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis
Impersonator PyTorch implementation of our ICCV 2019 paper: Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer an
A DiY holiday project to demonstrate how you can send data from adafruitIO cloud to a balena edge device
holiday-star balena ❤️ adafruitIO Introduction A DiY holiday project to demonstrate how you can send data from adafruitIO cloud to a balena edge devic
A hangman game that I created. Thanks to Data Flair for giving me the code!
Hangman A hangman game that I created. Thanks to Data Flair for giving me the code! Run python3 hangman.py in a terminal if you have Python 3. Please
A Streamlit web-app for a data-science project that aims to evaluate if the answer to a question is helpful.
How useful is the aswer? A Streamlit web-app for a data-science project that aims to evaluate if the answer to a question is helpful. If you want to l
DataShare - Simple library for data sharing between scripts and public functions calling
DataShare - Simple library for data sharing between scripts and public functions calling. Installation. Install code, Delete LICENSE, README, readme.t
Ensembling Off-the-shelf Models for GAN Training
Data-Efficient GANs with DiffAugment project | paper | datasets | video | slides Generated using only 100 images of Obama, grumpy cats, pandas, the Br
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
About The ROOT system provides a set of OO frameworks with all the functionality needed to handle and analyze large amounts of data in a very efficien
Fit models to your data in Python with Sherpa.
Table of Contents Sherpa License How To Install Sherpa Using Anaconda Using pip Building from source History Release History Sherpa Sherpa is a modeli
Download candlestick data fast & easy for analysis
crypto-candlesticks 📈 The goal behind this project is to facilitate downloading cryptocurrency candlestick data fast & simple. Currently only the Bit
Label data using HuggingFace's transformers and automatically get a prediction service
Label Studio for Hugging Face's Transformers Website • Docs • Twitter • Join Slack Community Transfer learning for NLP models by annotating your textu
🐍 A hyper-fast Python module for reading/writing JSON data using Rust's serde-json.
A hyper-fast, safe Python module to read and write JSON data. Works as a drop-in replacement for Python's built-in json module. This is alpha software
Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.
Codes-for-Algorithms Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.
A model that attempts to learn and benefit from data collected on card counting.
A model that attempts to learn and benefit from data collected on card counting. A decision tree like model is built to win more often than loose and increase the bet of the player appropriately to come out winning as much money as possible.
Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method
Overcooked-AI We suppose to apply traditional offline reinforcement learning technique to multi-agent algorithm. In this repository, we implemented be
For radiometrically calibrating and PSF deconvolving IRIS data
irispreppy For radiometrically calibrating and PSF deconvolving IRIS data. I dislike how I need to own proprietary software (IDL) just to simply prepa
Data 25 Star Wars Project With Python
Data 25 Star Wars Project Instructions The character data in your MongoDB database has been pulled from https://swapi.tech/. As well as 'people', the
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
SEW (Squeezed and Efficient Wav2vec) The repo contains the code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speec
FastFormers - highly efficient transformer models for NLU
FastFormers FastFormers provides a set of recipes and methods to achieve highly efficient inference of Transformer models for Natural Language Underst
Transformer training code for sequential tasks
Sequential Transformer This is a code for training Transformers on sequential tasks such as language modeling. Unlike the original Transformer archite
Tools to download and cleanup Common Crawl data
cc_net Tools to download and clean Common Crawl as introduced in our paper CCNet. If you found these resources useful, please consider citing: @inproc
Pre-training with Extracted Gap-sentences for Abstractive SUmmarization Sequence-to-sequence models
PEGASUS library Pre-training with Extracted Gap-sentences for Abstractive SUmmarization Sequence-to-sequence models, or PEGASUS, uses self-supervised
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Hiring We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on NLP and large-scale pre-traine
Unsupervised Language Model Pre-training for French
FlauBERT and FLUE FlauBERT is a French BERT trained on a very large and heterogeneous French corpus. Models of different sizes are trained using the n
Large-scale pretraining for dialogue
A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT) This repository contains the source code and trained model for a large-
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Introduction Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduc
Code for EMNLP20 paper: "ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training"
ProphetNet-X This repo provides the code for reproducing the experiments in ProphetNet. In the paper, we propose a new pre-trained language model call
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
ELECTRA Introduction ELECTRA is a method for self-supervised language representation learning. It can be used to pre-train transformer networks using
A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon.
PokeGAN A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon. Dataset The model has been trained on dataset that includes 8
A dashboard for Raspberry Pi to display environmental weather data, rain radar, weather forecast, etc. written in Python
Weather Clock for Raspberry PI This project is a dashboard for Raspberry Pi to display environmental weather data, rain radar, weather forecast, etc.
Pytorch codes for Feature Transfer Learning for Face Recognition with Under-Represented Data
FTLNet_Pytorch Pytorch codes for Feature Transfer Learning for Face Recognition with Under-Represented Data 1. Introduction This repo is an unofficial
A Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Training Data》
RangeLoss Pytorch This is a Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Trai
Efficient and Scalable Physics-Informed Deep Learning and Scientific Machine Learning on top of Tensorflow for multi-worker distributed computing
Notice: Support for Python 3.6 will be dropped in v.0.2.1, please plan accordingly! Efficient and Scalable Physics-Informed Deep Learning Collocation-
DANet for Tabular data classification/ regression.
Deep Abstract Networks A pyTorch implementation for AAAI-2022 paper DANets: Deep Abstract Networks for Tabular Data Classification and Regression. Bri
A hackerank problems, solution repository
This is a repository for all hackerank challenges kindly note this is for learning purposes and if you wish to contribute, dont hesitate all submision
Using CNN to mimic the driver based on training data from Torcs
Behavioural-Cloning-in-autonomous-driving Using CNN to mimic the driver based on training data from Torcs. Approach First, the data was collected from
👻🟡 Download all Snapchat video & photo memories from a data export.
Snapchat "Memories" Fetcher In compliance with the California Consumer Privacy Act of 2018 (“CCPA”), businesses which collect and store user data must
High performance, editable, stylable datagrids in jupyter and jupyterlab
An ipywidgets wrapper of regular-table for Jupyter. Examples Two Billion Rows Notebook Click Events Notebook Edit Events Notebook Styling Notebook Pan
Tools for calculating and visualizing Elo-like ratings of MLB teams using Retosheet data
Overview This project uses historical baseball games data to calculate an Elo-like rating for MLB teams based on regular season match ups. The Elo rat
Converts a Bangla numeric string to literal words.
Bangla Number in Words Converts a Bangla numeric string to literal words. Install $ pip install banglanum2words Usage
Kaggle Competition using 15 numerical predictors to predict a continuous outcome.
Kaggle-Comp.-Data-Mining Kaggle Competition using 15 numerical predictors to predict a continuous outcome as part of a final project for a stats data
A public data repository for datasets created from TransLink GTFS data.
TransLink Spatial Data What: TransLink is the statutory public transit authority for the Metro Vancouver region. This GitHub repository is a collectio
Webtesting for course Data Structures & Algorithms
Selenium job to automate queries to check last posts of Module Data Structures & Algorithms Web-testing for course Data Structures & Algorithms Struct
Robbing the FED: Directly Obtaining Private Data in Federated Learning with Modified Models
Robbing the FED: Directly Obtaining Private Data in Federated Learning with Modified Models This repo contains a barebones implementation for the atta
A simple program for training and testing vit
Vit This is a simple program for training and testing vit. Key requirements: torch, torchvision and timm. Dataset I put 5 categories of the cub classi
The python SDK for Eto, the AI focused data platform for teams bringing AI models to production
Eto Labs Python SDK This is the python SDK for Eto, the AI focused data platform for teams bringing AI models to production. The python SDK makes it e
A script that publishes power usage data of iDrac enabled servers to an MQTT broker for integration into automation and power monitoring systems
iDracPowerMonitorMQTT This script publishes iDrac power draw data for iDrac 6 enabled servers to an MQTT broker. This can be used to integrate the pow
SPEAR: Semi suPErvised dAta progRamming
Semi-Supervised Data Programming for Data Efficient Machine Learning SPEAR is a library for data programming with semi-supervision. The package implem
PantheonRL is a package for training and testing multi-agent reinforcement learning environments.
PantheonRL is a package for training and testing multi-agent reinforcement learning environments. PantheonRL supports cross-play, fine-tuning, ad-hoc coordination, and more.
Official code base for the poster "On the use of Cortical Magnification and Saccades as Biological Proxies for Data Augmentation" published in NeurIPS 2021 Workshop (SVRHM)
Self-Supervised Learning (SimCLR) with Biological Plausible Image Augmentations Official code base for the poster "On the use of Cortical Magnificatio
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
HETU Documentation | Examples Hetu is a high-performance distributed deep learning system targeting trillions of parameters DL model training, develop
pandas: powerful Python data analysis toolkit
pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive.
A mutable set that remembers the order of its entries. One of Python's missing data types.
An OrderedSet is a mutable data structure that is a hybrid of a list and a set. It remembers the order of its entries, and every entry has an index number that can be looked up.
Estoult - a Python toolkit for data mapping with an integrated query builder for SQL databases
Estoult Estoult is a Python toolkit for data mapping with an integrated query builder for SQL databases. It currently supports MySQL, PostgreSQL, and
A real-time financial data streaming pipeline and visualization platform using Apache Kafka, Cassandra, and Bokeh.
Realtime Financial Market Data Visualization and Analysis Introduction This repo shows my project about real-time stock data pipeline. All the code is
Generate high quality pictures. GAN. Generative Adversarial Networks
ESRGAN generate high quality pictures. GAN. Generative Adversarial Networks """ Super-resolution of CelebA using Generative Adversarial Networks. The
Pyspark project that able to do joins on the spark data frames.
SPARK JOINS This project is to perform inner, all outer joins and semi joins. create_df.py: load_data.py : helps to put data into Spark data frames. d
yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data.
The yt Project yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data. yt supports structured, varia
MLReef is an open source ML-Ops platform that helps you collaborate, reproduce and share your Machine Learning work with thousands of other users.
The collaboration platform for Machine Learning MLReef is an open source ML-Ops platform that helps you collaborate, reproduce and share your Machine
GebPy is a Python-based, open source tool for the generation of geological data of minerals, rocks and complete lithological sequences.
GebPy is a Python-based, open source tool for the generation of geological data of minerals, rocks and complete lithological sequences. The data can be generated randomly or with respect to user-defined constraints, for example a specific element concentration within minerals and rocks or the order of units within a complete lithological profile.
A python interface for training Reinforcement Learning bots to battle on pokemon showdown
The pokemon showdown Python environment A Python interface to create battling pokemon agents. poke-env offers an easy-to-use interface for creating ru
Predicting Baseball Metric Clusters: Clustering Application in Python Using scikit-learn
Clustering Clustering Application in Python Using scikit-learn This repository contains the prediction of baseball metric clusters using MLB Statcast
PyGAD, a Python 3 library for building the genetic algorithm and training machine learning algorithms (Keras & PyTorch).
PyGAD: Genetic Algorithm in Python PyGAD is an open-source easy-to-use Python 3 library for building the genetic algorithm and optimizing machine lear
Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts
DataSelection-NMT Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts Quick update: The paper got accepted o
FinRL-Meta: A Universe for Data-Driven Financial Reinforcement Learning. 🔥
FinRL-Meta: A Universe of Market Environments. FinRL-Meta is a universe of market environments for data-driven financial reinforcement learning. Users
Centroid-UNet is deep neural network model to detect centroids from satellite images.
Centroid UNet - Locating Object Centroids in Aerial/Serial Images Introduction Centroid-UNet is deep neural network model to detect centroids from Aer
This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".
BiCAT This is our TensorFlow implementation for the paper: "BiCAT: Sequential Recommendation with Bidirectional Chronological Augmentation of Transfor
Pydantic model generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
datamodel-code-generator This code generator creates pydantic model from an openapi file and others. Help See documentation for more details. Supporte
A Django app that creates automatic web UIs for Python scripts.
Wooey is a simple web interface to run command line Python scripts. Think of it as an easy way to get your scripts up on the web for routine data anal
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
RDFLib RDFLib is a pure Python package for working with RDF. RDFLib contains most things you need to work with RDF, including: parsers and serializers
WebRTC and ORTC implementation for Python using asyncio
aiortc What is aiortc? aiortc is a library for Web Real-Time Communication (WebRTC) and Object Real-Time Communication (ORTC) in Python. It is built o
Titanic data analysis for python
Titanic-data-analysis This Repo is an analysis on Titanic_mod.csv This csv file contains some assumed data of the Titanic ship after sinking This full
The Python-Weather-App is a service that provides weather data
The Python-Weather-App is a service that provides weather data, including current weather data to the developers of web services and mobile applications.
Big data on k8s
# microsoft azure # https://docs.microsoft.com/en-us/cli/azure/install-azure-cli az account set --subscription [] az aks get-credentials --resource-g
A super lightweight Lagrangian model for calculating millions of trajectories using ERA5 data
Easy-ERA5-Trck Easy-ERA5-Trck Galleries Install Usage Repository Structure Module Files Version iteration Easy-ERA5-Trck is a super lightweight Lagran
The geospatial toolkit for redistricting data.
maup maup is the geospatial toolkit for redistricting data. The package streamlines the basic workflows that arise when working with blocks, precincts
Delta Sharing: An Open Protocol for Secure Data Sharing
Delta Sharing: An Open Protocol for Secure Data Sharing Delta Sharing is an open protocol for secure real-time exchange of large datasets, which enabl
PyIOmica (pyiomica) is a Python package for omics analyses.
PyIOmica (pyiomica) This repository contains PyIOmica, a Python package that provides bioinformatics utilities for analyzing (dynamic) omics datasets.
A part of HyRiver software stack for accessing hydrology data through web services
Package Description Status PyNHD Navigate and subset NHDPlus (MR and HR) using web services Py3DEP Access topographic data through National Map's 3DEP
Random dataframe and database table generator
Random database/dataframe generator Authored and maintained by Dr. Tirthajyoti Sarkar, Fremont, USA Introduction Often, beginners in SQL or data scien
A 3D Slicer Extension to view data from the flywheel heirarchy
flywheel-connect A 3D Slicer Extension to view, select, and download images from a Flywheel instance to 3D Slicer and storing Slicer outputs back to F
Streaming parser for multipart/form-data written in Python
Streaming multipart/form-data parser streaming_form_data provides a Python parser for parsing multipart/form-data input chunks (the encoding used when
A python module for retrieving and parsing WHOIS data
pythonwhois A WHOIS retrieval and parsing library for Python. Dependencies None! All you need is the Python standard library. Instructions The manual
Libextract: extract data from websites
Libextract is a statistics-enabled data extraction library that works on HTML and XML documents and written in Python
Persistent/Immutable/Functional data structures for Python
Pyrsistent Pyrsistent is a number of persistent collections (by some referred to as functional data structures). Persistent in the sense that they are
Persistent dict, backed by sqlite3 and pickle, multithread-safe.
sqlitedict -- persistent dict, backed-up by SQLite and pickle A lightweight wrapper around Python's sqlite3 database with a simple, Pythonic dict-like
A mutable set that remembers the order of its entries. One of Python's missing data types.
An OrderedSet is a mutable data structure that is a hybrid of a list and a set. It remembers the order of its entries, and every entry has an index nu
Python tree data library
Links Documentation PyPI GitHub Changelog Issues Contributors If you enjoy anytree Getting started Usage is simple. Construction from anytree impo
A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.
A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.
This is an implementation of PEP 557, Data Classes.
This is an implementation of PEP 557, Data Classes. It is a backport for Python 3.6. Because dataclasses will be included in Python 3.7, any discussio
D2LV: A Data-Driven and Local-Verification Approach for Image Copy Detection
Facebook AI Image Similarity Challenge: Matching Track —— Team: imgFp This is the source code of our 3rd place solution to matching track of Image Sim