3196 Repositories
Python data-processing Libraries
A query expression for extracting data from JSON.
JSONPATH A selector expression for extracting data from JSON. Quickstarts Installation Install the stable version from PYPI. pip install jsonpath-extr
Elasticsearch tool for easily collecting and batch inserting Python data and pandas DataFrames
ElasticBatch Elasticsearch buffer for collecting and batch inserting Python data and pandas DataFrames Overview ElasticBatch makes it easy to efficien
Python API for HotBits random data generator
HotBits Python API Python API for HotBits random data generator. Description This project is a random data generator. It uses the HotBits API web service
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Redash is designed to enable anyone, regardless of the level of technical sophistication, to harness the power of data big and small. SQL users levera
Python3 command-line tool for the inference of Boolean rules and pathway analysis on omics data
BONITA-Python3 BONITA was originally written in Python 2 and tested with Python 2-compatible packages. This version of the packages ports BONITA to Py
Turn any live video stream or locally stored video into a dataset of interesting samples for ML training, or any other type of analysis.
Sieve Video Data Collection Example Find samples that are interesting within hours of raw video, for free and completely automatically using Sieve API
Code for MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks
MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks This is the code for the paper: MentorNet: Learning Data-Driven Curriculum fo
Open source annotation tool for machine learning practitioners.
doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ
AKShare is an elegant and simple financial data interface library for Python, built for human beings
Tools and data for measuring the popularity & growth of various programming languages.
growth-data Tools and data for measuring the popularity & growth of various programming languages. Install the dependencies $ pip install -r requireme
Python Libraries with functions and constants related to electrical engineering.
ElectricPy Electrical-Engineering-for-Python Python Libraries with functions and constants related to electrical engineering. The functions and consta
Monitor the stability of a pandas or spark dataframe ⚙︎
Population Shift Monitoring popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets.
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
CleverCSV is a Python package for handling messy CSV files.
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
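For context on what "dialect detection" means, here is a minimal sketch using only the standard library's `csv.Sniffer`, which is the built-in baseline that CleverCSV describes itself as improving on. This is not CleverCSV's API; the helper name `sniff_and_read` and the sample data are hypothetical and exist only for illustration.

```python
import csv
import io


def sniff_and_read(text, candidates=";,\t|"):
    """Guess the dialect of `text` with the stdlib Sniffer, then parse all rows.

    `sniff_and_read` is a hypothetical helper, not part of csv or CleverCSV.
    """
    # Restrict the guess to a small set of plausible delimiter characters.
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


# Made-up sample: a semicolon-separated file that a comma-based reader would misparse.
sample = "name;city;age\nAda;London;36\nLinus;Helsinki;28\n"
delimiter, rows = sniff_and_read(sample)
print(delimiter, rows[1])
```

The stdlib sniffer works on clean inputs like this one; files with inconsistent quoting or mixed delimiters are the cases where a dedicated detector is likely to matter.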
This repository provides data for the VAW dataset as described in the CVPR 2021 paper titled "Learning to Predict Visual Attributes in the Wild"
Visual Attributes in the Wild (VAW) This repository provides data for the VAW dataset as described in the CVPR 2021 Paper: Learning to Predict Visual
A CLI tool to reduce the friction between data scientists by removing notebook metadata and gracefully resolving git conflicts.
databooks is a package for reducing the friction data scientists face while using Jupyter notebooks, by reducing the number of git conflicts between different notebooks and assisting in the resolution of the conflicts.
Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.
Lens by Credo AI - Responsible AI Assessment Framework Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data a
A Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library
gdsclient NOTE: This is a work in progress and many GDS features are known to be missing or not working properly. This repo hosts the sources for gdsc
All course materials for the Zero to Mastery Machine Learning and Data Science course.
Zero to Mastery Machine Learning Welcome! This repository contains all of the code, notebooks, images and other materials related to the Zero to Maste
Repository for the Resilia Educação game project.
Jogo da Segurança das Indústrias Acme (Acme Industries Security Game) Description This game is part of the delivery project for the first module of Resilia Educação, for the course
🔮 A useful set of scripts to dig into your Discord data package.
Discord DataExtractor 🔮 Discord DataExtractor is a set of scripts that allows you to dig into your Discord Data package. Repository guide ☕ Coffee_Ga
🍰 ConnectMP - An easy and efficient way to share data between Processes in Python.
ConnectMP - Taking Multi-Process Data Sharing to the moon 🚀 Contribute · Community · Documentation 🎫 Introduction : 🍤 ConnectMP is the easiest and
A Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library
gdsclient This repo hosts the sources for gdsclient, a Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library. g
This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summarization for 1500+ Language Pairs".
CrossSum This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summ
Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.
GLIDE This is the official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing w
Quick tutorial on orchest.io that shows how to build multiple deep learning models on your data with a single line of code using python
Deep AutoViML Pipeline for orchest.io Quickstart Build Deep Learning models with a single line of code: deep_autoviml Deep AutoViML helps you build te
📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation
Well-formed Limericks and Haikus with GPT2 📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation In collaboration with Matthew Korahais &
A desktop application developed in Python with PyQt5 to predict demand and help monitor and schedule brewing processes for Barnaby's Brewhouse.
brewhouse-management A desktop application developed in Python with PyQt5 to predict demand and help monitor and schedule brewing processes for Barnab
Data Platform with AWS CDK
Welcome to your CDK Python project! This is a blank project for Python development with CDK. The cdk.json file tells the CDK Toolkit how to execute yo
Get MODBUS data from Sofar (K-TLX) inverter through LSW-3 or LSE module
SOFAR Inverter + LSW-3/LSE Small utility to read data from SOFAR K-TLX inverters through the Solarman (LSW-3/LSE) datalogger. Two scripts to get inver
Data and analysis relating to the 5.8M Melbourne quake of 2021
quake2021 Data and analysis relating to the 5.8M Melbourne quake of 2021 Monash University Woodside Living Lab Building The building is located here T
Scrapegoat is a Python library for scraping websites based on their relevance to a given topic, irrespective of language, using Natural Language Processing
Scrapegoat is a Python library for scraping websites based on their relevance to a given topic, irrespective of language, using Natural Language Processing. It is mainly useful for non-English languages, where it helps retrieve accurate and relevant scraped text.
Automatic data visualization in atom with the nteract data-explorer
Data Explorer Interactively explore your data directly in atom with hydrogen! The nteract data-explorer provides automatic data visualization, so you
Tidy data structures, summaries, and visualisations for missing data
naniar naniar provides principled, tidy ways to summarise, visualise, and manipulate missing data with minimal deviations from the workflows in ggplot
Catch-all collection of generative art made using processing
Generative art with Processing.py Some art I have created for fun. Dependencies Processing for Python, see how to download/use here Packages contained
Streaming over lightweight data transformations
Description Data augmentation library for Deep Learning, which supports images, segmentation masks, labels and keypoints. Furthermore, SOLT is fast a
A knowledge base construction engine for richly formatted data
Fonduer is a Python package and framework for building knowledge base construction (KBC) applications from richly formatted data. Note that Fonduer is
computer vision, image processing and machine learning on the web browser or node.
Image processing and Machine learning labs computer vision, image processing and machine learning on the web browser or node note Fast Fourier Trans
Automatic labeling, conversion of different data set formats, sample size statistics, model cascade
Simple Gadget Collection for Object Detection Tasks Automatic image annotation Conversion between different annotation formats Obtain statistical info
This is a web crawler that collects email data from gmane.org and visualizes it in different ways.
crawler_to_visual_gmane Analyzing an EMAIL Archive from gmane and visualizing the data using the D3 JavaScript library. This is a set of tools that al
Uses browser-time to capture HAR data from the network, then analyzes the data with Python.
A custom qq-plot for two sample data comparison
QQ-Plot 2 Sample Just a gist to include the custom code to draw a qq-plot in python when dealing with a "two sample problem". This means when you try to
This script provides LIVE feedback for On-The-Fly data collection with RELION
README This script provides LIVE feedback for On-The-Fly data collection with RELION (very useful to explore already processed datasets too!) Creating
Fast methods to work with hydro- and topography data in pure Python.
PyFlwDir Intro PyFlwDir contains a series of methods to work with gridded DEM and flow direction datasets, which are key to many workflows in many ear
Pytorch library for seismic data augmentation
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data Overview Clustering analysis is widely utilized in single-cell RNA-seque
Turning images into '9-pan' palettes using KMeans clustering from sklearn.
img2palette Turning images into '9-pan' palettes using KMeans clustering from sklearn. Requirements We require: Pillow, for opening and processing ima
Highly decentralized and censorship-resistant way to store key data
Beacon coin Beacon coin is a Chia singleton coin that can store data that needs to be: always available censorship resistant versioned potentially imm
Convert Table data to approximate values with GUI
Table_Editor Convert Table data to approximate values with GUIs... usage - Import methods for extension Tables. Imported method supposed to have only
Python SDK for LUSID by FINBOURNE, a bi-temporal investment management data platform with portfolio accounting capabilities.
LUSID® Python SDK This is the Python SDK for LUSID by FINBOURNE, a bi-temporal investment management data platform with portfolio accounting capabilit
A simple way to build declarative and distributed data pipelines with Python
unipipeline simple way to build the declarative and distributed data pipelines. Why you should use it Declarative strict config Scaffolding Fully type
Python library for creating data pipelines with chain functional programming
PyFunctional Features PyFunctional makes creating data pipelines easy by using chained functional operators. Here are a few examples of what it can do
Automation that uses Github Actions, Google Drive API, YouTube Data API and youtube-dl together to feed BackJam app with new music
Evaluation of file formats in the context of geo-referenced 3D geometries.
Geo-referenced Geometry File Formats Classic geometry file formats as .obj, .off, .ply, .stl or .dae do not support the utilization of coordinate syst
Fast Python reader and editor for ASAM MDF / MF4 (Measurement Data Format) files
asammdf is a fast parser and editor for ASAM (Association for Standardization of Automation and Measuring Systems) MDF (Measurement Data Format) files
🌎 The Modern Declarative Data Flow Framework for the AI Empowered Generation.
🌎 JSONClasses JSONClasses is a declarative data flow pipeline and data graph framework. Official Website: https://www.jsonclasses.com Official Docume
A full pipeline AutoML tool for tabular data
HyperGBM Doc | 中文 We Are Hiring! Dear folks, we are offering challenging opportunities located in Beijing for both professionals and students who are k
Feature Store for Machine Learning
Overview Feast is an open source feature store for machine learning. Feast is the fastest path to productionizing analytic data for model training and
A distributed block-based data storage and compute engine
Nebula is an extremely-fast end-to-end interactive big data analytics solution. Nebula is designed as a high-performance columnar data storage and tabular OLAP engine.
AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation
AtlasNet [Project Page] [Paper] [Talk] AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation Thibault Groueix, Matthew Fisher, Vladimir
Use .csv files to record, play and evaluate motion capture data.
Purpose These scripts allow you to record mocap data to, and play from .csv files. This approach facilitates parsing of body movement data in statisti
Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.
Welcome to Healthsea ✨ Create better access to health with spaCy. Healthsea is a pipeline for analyzing user reviews to supplement products by extract
Python package for analyzing sensor-collected human motion data
Motion Reconstruction Code and Data for Skills from Videos (SFV)
Motion Reconstruction Code and Data for Skills from Videos (SFV) This repo contains the data and the code for motion reconstruction component of the S
demir.ai Dataset Operations
demir.ai Dataset Operations With this application, you can have the empty values (nan/null) deleted or filled before giving your dataset to machine le
pyo is a Python module written in C to help digital signal processing script creation.
Itchio Downloader Tool with python
Itchio Downloader Tool Install pip install git+https://github.com/emersont1/itchio Download All Games in library from account python -m itchio.downloa
Process RunGap output file of a workout and load data into Apple Numbers Spreadsheet and my website with API calls
BSD 3-Clause License Copyright (c) 2020, Mike Bromberek All rights reserved. ProcessWorkout Exercise data is exported in JSON format to iCloud using
Howell County, Missouri, COVID-19 data and (unofficial) estimates
COVID-19 in Howell County, Missouri This repository contains the daily data files used to generate my COVID-19 dashboard for Howell County, Missouri,
BERT, LDA, and TFIDF based keyword extraction in Python
BERT, LDA, and TFIDF based keyword extraction in Python kwx is a toolkit for multilingual keyword extraction based on Google's BERT and Latent Dirichl
Functions for easily making publication-quality figures with matplotlib.
Data-viz utils 📈 Functions for data visualization in matplotlib 📚 API Can be installed using pip install dvu and then imported with import dvu. You
BERT-based Financial Question Answering System
BERT-based Financial Question Answering System In this example, we use Jina, PyTorch, and Hugging Face transformers to build a production-ready BERT-b
Continuously evaluated, functional, incremental, time-series forecasting
timemachines Autonomous, univariate, k-step ahead time-series forecasting functions assigned Elo ratings You can: Use some of the functionality of a s
Code for: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification
Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification Prerequisite PyTorch = 1.2.0 Python3 torch
ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data
ARKitScenes This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D
Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources.
Illumination_Decomposition Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources. This code implements the
A DiY holiday project to demonstrate how you can send data from adafruitIO cloud to a balena edge device
holiday-star balena ❤️ adafruitIO Introduction A DiY holiday project to demonstrate how you can send data from adafruitIO cloud to a balena edge devic
A hangman game that I created. Thanks to Data Flair for giving me the code!
Hangman A hangman game that I created. Thanks to Data Flair for giving me the code! Run python3 hangman.py in a terminal if you have Python 3. Please
A Streamlit web-app for a data-science project that aims to evaluate if the answer to a question is helpful.
How useful is the answer? A Streamlit web-app for a data-science project that aims to evaluate if the answer to a question is helpful. If you want to l
DataShare - Simple library for data sharing between scripts and public functions calling
DataShare - Simple library for data sharing between scripts and public functions calling. Installation. Install code, Delete LICENSE, README, readme.t
User-friendly Voice Cloning Application
Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transferring speaker-specific audio featur
Ensembling Off-the-shelf Models for GAN Training
Data-Efficient GANs with DiffAugment project | paper | datasets | video | slides Generated using only 100 images of Obama, grumpy cats, pandas, the Br
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
About The ROOT system provides a set of OO frameworks with all the functionality needed to handle and analyze large amounts of data in a very efficien
Fit models to your data in Python with Sherpa.
Table of Contents Sherpa License How To Install Sherpa Using Anaconda Using pip Building from source History Release History Sherpa Sherpa is a modeli
Download candlestick data fast & easy for analysis
crypto-candlesticks 📈 The goal behind this project is to facilitate downloading cryptocurrency candlestick data fast & simple. Currently only the Bit
Label data using HuggingFace's transformers and automatically get a prediction service
Label Studio for Hugging Face's Transformers Website • Docs • Twitter • Join Slack Community Transfer learning for NLP models by annotating your textu
Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python
Synchrosqueezing is a powerful reassignment method that focuses time-frequency representations, and allows extraction of instantaneous amplitudes and frequencies
🐍 A hyper-fast Python module for reading/writing JSON data using Rust's serde-json.
A hyper-fast, safe Python module to read and write JSON data. Works as a drop-in replacement for Python's built-in json module. This is alpha software
Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.
Codes-for-Algorithms Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.
A model that attempts to learn and benefit from data collected on card counting.
A model that attempts to learn and benefit from data collected on card counting. A decision-tree-like model is built to win more often than it loses and to increase the player's bet appropriately, so as to come out winning as much money as possible.
Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method
Overcooked-AI We aim to apply traditional offline reinforcement learning techniques to multi-agent algorithms. In this repository, we implemented be
For radiometrically calibrating and PSF deconvolving IRIS data
irispreppy For radiometrically calibrating and PSF deconvolving IRIS data. I dislike how I need to own proprietary software (IDL) just to simply prepa
Data 25 Star Wars Project With Python
Data 25 Star Wars Project Instructions The character data in your MongoDB database has been pulled from https://swapi.tech/. As well as 'people', the
UniSpeech - Large Scale Self-Supervised Learning for Speech
UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202
Tools to download and cleanup Common Crawl data
cc_net Tools to download and clean Common Crawl as introduced in our paper CCNet. If you found these resources useful, please consider citing: @inproc
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models
Large-scale pretraining for dialogue
A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT) This repository contains the source code and trained model for a large-
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Introduction Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduc
Awesome Treasure of Transformers Models Collection
💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️
A dashboard for Raspberry Pi to display environmental weather data, rain radar, weather forecast, etc. written in Python
Weather Clock for Raspberry PI This project is a dashboard for Raspberry Pi to display environmental weather data, rain radar, weather forecast, etc.