4253 Repositories
Python simple-report-data-table-vuetify Libraries
Data Analysis Baseline Library
dabl The data analysis baseline library. "Mr Sanchez, are you a data scientist?" "I dabl, Mr president." Find more information on the website. State o
Topological Data Analysis for Python🐍
Scikit-TDA is a home for Topological Data Analysis Python libraries intended for non-topologists. This project aims to provide a curated library of TD
scikit-learn cross validators for iterative stratification of multilabel data
iterative-stratification iterative-stratification is a project that provides scikit-learn compatible cross validators with stratification for multilab
(AAAI' 20) A Python Toolbox for Machine Learning Model Combination
combo: A Python Toolbox for Machine Learning Model Combination Deployment & Documentation & Stats Build Status & Coverage & Maintainability & License
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Mlxtend (machine learning extensions) is a Python library of useful tools for the day-to-day data science tasks. Sebastian Raschka 2014-2021 Links Doc
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
imbalanced-learn imbalanced-learn is a python package offering a number of re-sampling techniques commonly used in datasets showing strong between-cla
A simplified framework and utilities for PyTorch
Here is Poutyne. Poutyne is a simplified framework for PyTorch and handles much of the boilerplating code needed to train neural networks. Use Poutyne
Tez is a super-simple and lightweight Trainer for PyTorch. It also comes with many utils that you can use to tackle over 90% of deep learning projects in PyTorch.
Tez: a simple pytorch trainer NOTE: Currently, we are not accepting any pull requests! All PRs will be closed. If you want a feature or something does
PyTorch implementation of TabNet paper : https://arxiv.org/pdf/1908.07442.pdf
README TabNet : Attentive Interpretable Tabular Learning This is a pyTorch implementation of Tabnet (Arik, S. O., & Pfister, T. (2019). TabNet: Attent
A collection of extensions and data-loaders for few-shot learning & meta-learning in PyTorch
Torchmeta A collection of extensions and data-loaders for few-shot learning & meta-learning in PyTorch. Torchmeta contains popular meta-learning bench
Library for faster pinned CPU - GPU transfer in Pytorch
SpeedTorch Faster pinned CPU tensor - GPU Pytorch variabe transfer and GPU tensor - GPU Pytorch variable transfer, in certain cases. Update 9-29-1
General purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usecases.
Vulkan Kompute The general purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabl
BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
A lightweight, GPU accelerated, SQL engine built on the RAPIDS.ai ecosystem. Get Started on app.blazingsql.com Getting Started | Documentation | Examp
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
NVIDIA DALI The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provi
📊 A simple command-line utility for querying and monitoring GPU status
gpustat Just less than nvidia-smi? NOTE: This works with NVIDIA Graphics Devices only, no AMD support as of now. Contributions are welcome! Self-Promo
🎛 Distributed machine learning made simple.
🎛 lazycluster Distributed machine learning made simple. Use your preferred distributed ML framework like a lazy engineer. Getting Started • Highlight
Distributed scikit-learn meta-estimators in PySpark
sk-dist: Distributed scikit-learn meta-estimators in PySpark What is it? sk-dist is a Python package for machine learning built on top of scikit-learn
Distributed Computing for AI Made Simple
Project Home Blog Documents Paper Media Coverage Join Fiber users email list [email protected] Fiber Distributed Computing for AI Made Simp
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. 10x Larger Models 10x Faster Trainin
BigDL: Distributed Deep Learning Framework for Apache Spark
BigDL: Distributed Deep Learning on Apache Spark What is BigDL? BigDL is a distributed deep learning library for Apache Spark; with BigDL, users can w
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Ray provides a simple, universal API for building distributed applications. Ray is packaged with the following libraries for accelerating machine lear
Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Now updated with Dask to handle millions of rows.
Auto_TS: Auto_TimeSeries Automatically build multiple Time Series models using a Single Line of Code. Now updated with Dask. Auto_timeseries is a comp
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile
matrixprofile-ts matrixprofile-ts is a Python 2 and 3 library for evaluating time series data using the Matrix Profile algorithms developed by the Keo
Python module for machine learning time series:
seglearn Seglearn is a python package for machine learning time series or sequences. It provides an integrated pipeline for segmentation, feature extr
STUMPY is a powerful and scalable Python library for computing a Matrix Profile, which can be used for a variety of time series data mining tasks
STUMPY STUMPY is a powerful and scalable library that efficiently computes something called the matrix profile, which can be used for a variety of tim
Real-time stream processing for python
Streamz Streamz helps you build pipelines to manage continuous streams of data. It is simple to use in simple cases, but also supports complex pipelin
A machine learning toolkit dedicated to time-series data
tslearn The machine learning toolkit for time series analysis in Python Section Description Installation Installing the dependencies and tslearn Getti
A unified framework for machine learning with time series
Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible
Automatic extraction of relevant features from time series:
tsfresh This repository contains the TSFRESH python package. The abbreviation stands for "Time Series Feature extraction based on scalable hypothesis
Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. With Qlib, you can easily try your ideas to create better Quant investment strategies.
Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technol
A python wrapper for Alpha Vantage API for financial data.
alpha_vantage Python module to get stock data/cryptocurrencies from the Alpha Vantage API Alpha Vantage delivers a free API for real time financial da
Yahoo! Finance market data downloader (+faster Pandas Datareader)
Yahoo! Finance market data downloader Ever since Yahoo! finance decommissioned their historical data API, many programs that relied on it to stop work
python toolbox for visualizing geographical data and making maps
geoplotlib is a python toolbox for visualizing geographical data and making maps data = read_csv('data/bus.csv') geoplotlib.dot(data) geoplotlib.show(
Use Mapbox GL JS to visualize data in a Python Jupyter notebook
Location Data Visualization library for Jupyter Notebooks Library documentation at https://mapbox-mapboxgl-jupyter.readthedocs-hosted.com/en/latest/.
Search and download Copernicus Sentinel satellite images
sentinelsat Sentinelsat makes searching, downloading and retrieving the metadata of Sentinel satellite images from the Copernicus Open Access Hub easy
Python package for earth-observing satellite data processing
Satpy The Satpy package is a python library for reading and manipulating meteorological remote sensing data and writing it to various image and data f
A package built to support working with spatial data using open source python
EarthPy EarthPy makes it easier to plot and manipulate spatial data in Python. Why EarthPy? Python is a generic programming language designed to suppo
Simple, concise geographical visualization in Python
Geographic visualizations for HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it? GeoViews is a Python library that ma
Documentation and samples for ArcGIS API for Python
ArcGIS API for Python ArcGIS API for Python is a Python library for working with maps and geospatial data, powered by web GIS. It provides simple and
Fiona reads and writes geographic data files
Fiona Fiona reads and writes geographic data files and thereby helps Python programmers integrate geographic information systems with other computer s
Python tools for geographic data
GeoPandas Python tools for geographic data Introduction GeoPandas is a project to add support for geographic data to pandas objects. It currently impl
Python Data. Leaflet.js Maps.
folium Python Data, Leaflet.js Maps folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js
WebGL2 powered geospatial visualization layers
deck.gl | Website WebGL2-powered, highly performant large-scale data visualization deck.gl is designed to simplify high-performance, WebGL-based visua
Apache Flink
Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flin
Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"
A Differentiable Recurrent Surface for Asynchronous Event-Based Data Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous
Simple utlity for sniffing decrypted HTTP/HTTPS traffic on a jailbroken iOS device into an HAR format.
Description iOS devices contain a hidden feature for sniffing decrypted HTTP/HTTPS traffic from all processes using the CFNetwork framework into an HA
Simple base for a telethon bot!
Telethon Bot Simple base used to make a Telegram Bot in telethon. Join @BotzHub! Note: The client, here, is named BotzHub. Fork and add your plugins t
Simple stock price analytics
mune · Mune is an open source python web application built to analyze stocks, named after Homma Munehisa. Currently, the forecasting component is powe
Data Recovery from your broken Android phone
Broken Phone Recovery a guide how to backup data from your locked android phone if you broke your screen (and more) you can skip some steps depending
"Very simple but works well" Computer Vision based ID verification solution provided by LibraX.
ID Verification by LibraX.ai This is the first free Identity verification in the market. LibraX.ai is an identity verification platform for developers
Simple and lightweight Spotify Overlay written in Python.
Simple Spotify Overlay This is a simple yet powerful Spotify Overlay. About I have been looking for something like this ever since I got Spotify. I th
A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well
ocrserver Simple OCR server, as a small working sample for gosseract. Try now here https://ocr-example.herokuapp.com/, and deploy your own now. Deploy
A curated list of awesome synthetic data for text location and recognition
awesome-SynthText A curated list of awesome synthetic data for text location and recognition and OCR datasets. Text location SynthText SynthText_Chine
A synthetic data generator for text recognition
TextRecognitionDataGenerator A synthetic data generator for text recognition What is it for? Generating text image samples to train an OCR software. N
Simple app for visual editing of Page XML files
Name nw-page-editor - Simple app for visual editing of Page XML files. Version: 2021.02.22 Description nw-page-editor is an application for viewing/ed
ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data
VistaOCR ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data Publications "How to Efficiently Increase Resolutio
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a
Text language identification using Wikipedia data
Text language identification using Wikipedia data The aim of this project is to provide high-quality language detection over all the web's languages.
Python library to extract tabular data from images and scanned PDFs
Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d
Extract tables from scanned image PDFs using Optical Character Recognition.
ocr-table This project aims to extract tables from scanned image PDFs using Optical Character Recognition. Install Requirements Tesseract OCR sudo apt
Camelot: PDF Table Extraction for Humans
Camelot: PDF Table Extraction for Humans Camelot is a Python library that makes it easy for anyone to extract tables from PDF files! Note: You can als
TableBank: A Benchmark Dataset for Table Detection and Recognition
TableBank TableBank is a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on th
Table recognition inside douments using neural networks
TableTrainNet A simple project for training and testing table recognition in documents. This project was developed to make a neural network which reco
Table Extraction Tool
Tree Structure - Table Extraction Fonduer has been successfully extended to perform information extraction from richly formatted data such as tables.
Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.
Table of Contents Overview Requirements Demo Modules Overview This python package contains modules to help with finding and extracting tabular data fr
Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images"
TableNet Unofficial implementation of ICDAR 2019 paper : TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from
Handwritten_Text_Recognition
Deep Learning framework for Line-level Handwritten Text Recognition Short presentation of our project Introduction Installation 2.a Install conda envi
A simple document layout analysis using Python-OpenCV
Run the application: python main.py *Note: For first time running the application, create a folder named "output". The application is a simple documen
Generic framework for historical document processing
dhSegment dhSegment is a tool for Historical Document Processing. Its generic approach allows to segment regions and extract content from different ty
A very tiny, very simple, and very secure file encryption tool.
Picocrypt is a very tiny (hence "Pico"), very simple, yet very secure file encryption tool. It uses the modern ChaCha20-Poly1305 cipher suite as well
Simple script to extract useful informations from the combo BloodHound + Neo4j
bloodhound-quickwin Simple script to extract useful informations from the combo BloodHound + Neo4j. Can help to choose a target. Prerequisites python3
DECAF: Deep Extreme Classification with Label Features
DECAF DECAF: Deep Extreme Classification with Label Features @InProceedings{Mittal21, author = "Mittal, A. and Dahiya, K. and Agrawal, S. and Sain
Pack up to 3MB of data into a tweetable PNG polyglot file.
tweetable-polyglot-png Pack up to 3MB of data into a tweetable PNG polyglot file. See it in action here: https://twitter.com/David3141593/status/13719
Participants of Bertelsmann Technology Scholarship created an awesome list of resources and they want to share it with the world, if you find illegal resources please report to us and we will remove.
Participants of Bertelsmann Technology Scholarship created an awesome list of resources and they want to share it with the world, if you find illegal
Simple progressbar for discord
⚙️ DiscordProgressbar 📂 Установка | Installation pip install discordbar 📚 Документация | Documentation 📞 Связаться со мной | Сontact with me 📜 Ли
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)
Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train
Unofficial implementation of "TTNet: Real-time temporal and spatial video analysis of table tennis" (CVPR 2020)
TTNet-Pytorch The implementation for the paper "TTNet: Real-time temporal and spatial video analysis of table tennis" An introduction of the project c
A community-driven python bot that aims to be as simple as possible to serve humans with their everyday tasks
JARVIS on Messenger Just A Rather Very Intelligent System, now on Messenger! Messenger is now used by 1.2 billion people every month. With the launch
ReText: Simple but powerful editor for Markdown and reStructuredText
Welcome to ReText! ReText is a simple but powerful editor for Markdown and reStructuredText markup languages. One can also add support for custom mark
A small, simple editor for beginner Python programmers. Written in Python and Qt5.
Mu - A Simple Python Code Editor Mu is a simple code editor for beginner programmers based on extensive feedback from teachers and learners. Having sa
Gaphor is the simple modeling tool
Gaphor Gaphor is a UML and SysML modeling application written in Python. It is designed to be easy to use, while still being powerful. Gaphor implemen
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)
Simple screen recorder
Kooha Simple screen recorder Description Kooha is a simple screen recorder built with GTK. It allows you to record your screen and also audio from you
Deep generative modeling for time-stamped heterogeneous data, enabling high-fidelity models for a large variety of spatio-temporal domains.
Neural Spatio-Temporal Point Processes [arxiv] Ricky T. Q. Chen, Brandon Amos, Maximilian Nickel Abstract. We propose a new class of parameterizations
Visualize Data From Stray Scanner https://keke.dev/blog/2021/03/10/Stray-Scanner.html
StrayVisualizer A set of scripts to work with data collected using Stray Scanner. Usage Installing Dependencies Install dependencies with pip -r requi
First Party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products (Google Ads, Campaign Manager, Google Analytics).
Megalista Sample integration code for onboarding offline/CRM data from BigQuery as custom audiences or offline conversions in Google Ads, Google Analy
TorchMetrics is a collection of 25+ PyTorch metrics implementations and an easy-to-use API to create custom metrics.
Machine learning metrics for distributed, scalable PyTorch applications.
Ralph is the CMDB / Asset Management system for data center and back office hardware.
Ralph Ralph is full-featured Asset Management, DCIM and CMDB system for data centers and back offices. Features: keep track of assets purchases and th
IP address management (IPAM) and data center infrastructure management (DCIM) tool.
NetBox is an IP address management (IPAM) and data center infrastructure management (DCIM) tool. Initially conceived by the network engineering team a
a full featured file system for online data storage
S3QL S3QL is a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack. S3QL effectively provi
An open source multi-tool for exploring and publishing data
Datasette An open source multi-tool for exploring and publishing data Datasette is a tool for exploring and publishing data. It helps people take data
A simple version control system built on top of Git
Gitless Gitless is a version control system built on top of Git, that is easy to learn and use: Simple commit workflow Track or untrack files to contr
🦉Data Version Control | Git for Data & Models
Website • Docs • Blog • Twitter • Chat (Community & Support) • Tutorial • Mailing List Data Version Control or DVC is an open-source tool for data sci
Finds Jobs on LinkedIn using web-scraping
Find Jobs on LinkedIn 📔 This program finds jobs by scraping on LinkedIn 👨💻 Relies on User Input. Accepts: Country, City, State 📑 Data about jobs
Feature extraction made simple with torchextractor
torchextractor: PyTorch Intermediate Feature Extraction Introduction Too many times some model definitions get remorselessly copy-pasted just because
Consistency Regularization for Adversarial Robustness
Consistency Regularization for Adversarial Robustness Official PyTorch implementation of Consistency Regularization for Adversarial Robustness by Jiho
Simple Dynamic Batching Inference
Simple Dynamic Batching Inference 解决了什么问题? 众所周知,Batch对于GPU上深度学习模型的运行效率影响很大。。。 是在Inference时。搜索、推荐等场景自带比较大的batch,问题不大。但更多场景面临的往往是稀碎的请求(比如图片服务里一次一张图)。 如果
All the essential resources and template code needed to understand and practice data structures and algorithms in python with few small projects to demonstrate their practical application.
Data Structures and Algorithms Python INDEX 1. Resources - Books Data Structures - Reema Thareja competitiveCoding Big-O Cheat Sheet DAA Syllabus Inte
Synthetic data for the people.
zpy: Synthetic data in Blender. Website • Install • Docs • Examples • CLI • Contribute • Licence Abstract Collecting, labeling, and cleaning data for
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.
Object Pose Estimation Demo This tutorial will go through the steps necessary to perform pose estimation with a UR3 robotic arm in Unity. You’ll gain