494 Repositories
Python shotgun-pipeline-toolkit Libraries
ETL pipeline on movie data using Python and postgreSQL
Movies-ETL ETL pipeline on movie data using Python and postgreSQL Overview This project consisted on a automated Extraction, Transformation and Load p
The object detection pipeline is based on Ultralytics YOLOv5
AYOLOv2 The main goal of this repository is to rewrite the object detection pipeline with a better code structure for better portability and adaptabil
jiant is an NLP toolkit
🚨 Update 🚨 : As of 2021/10/17, the jiant project is no longer being actively maintained. This means there will be no plans to add new models, tasks,
pypyr task-runner cli & api for automation pipelines.
pypyr task-runner cli & api for automation pipelines. Automate anything by combining commands, different scripts in different languages & applications into one pipeline process.
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Py2neo is a comprehensive toolkit for working with Neo4j from within Python applications or from the command line.
Py2neo v3 Py2neo is a client library and toolkit for working with Neo4j from within Python applications and from the command line. The core library ha
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines.
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines. It includes tools for downloading pipelines and their dependencies and tools for measuring their performace.
A toolkit to generate MR sequence diagrams
mrsd: a toolkit to generate MR sequence diagrams mrsd is a Python toolkit to generate MR sequence diagrams, as shown below for the basic FLASH sequenc
Spline is a tool that is capable of running locally as well as part of well known pipelines like Jenkins (Jenkinsfile), Travis CI (.travis.yml) or similar ones.
Welcome to spline - the pipeline tool Important note: Since change in my job I didn't had the chance to continue on this project. My main new project
Build an Amazon SageMaker Pipeline to Transform Raw Texts to A Knowledge Graph
Build an Amazon SageMaker Pipeline to Transform Raw Texts to A Knowledge Graph This repository provides a pipeline to create a knowledge graph from ra
MMRazor: a model compression toolkit for model slimming and AutoML
Documentation: https://mmrazor.readthedocs.io/ English | 简体中文 Introduction MMRazor is a model compression toolkit for model slimming and AutoML, which
A pipeline for making highlighted text stand-alone.
title emoji colorFrom colorTo sdk app_file pinned decontextualizer 📤 green gray streamlit main.py false Decontextualizer As a second step in improvin
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.
Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m
Basic yet complete Machine Learning pipeline for NLP tasks
Basic yet complete Machine Learning pipeline for NLP tasks This repository accompanies the article on building basic yet complete ML pipelines for sol
Metasploit Multi Purpose Exploiting Toolkit For Termux
MSF-EXPLOIT MSF-ANDRO is a Metasploit Multi Purpose Exploiting Toolkit For Termux . Only a Basic Script , Still in Development . FEATURES : Install Me
simple way to build the declarative and destributed data pipelines with python
unipipeline simple way to build the declarative and distributed data pipelines. Why you should use it Declarative strict config Scaffolding Fully type
Python library for creating data pipelines with chain functional programming
PyFunctional Features PyFunctional makes creating data pipelines easy by using chained functional operators. Here are a few examples of what it can do
A PyTorch-based model pruning toolkit for pre-trained language models
English | 中文说明 TextPruner是一个为预训练语言模型设计的模型裁剪工具包,通过轻量、快速的裁剪方法对模型进行结构化剪枝,从而实现压缩模型体积、提升模型速度。 其他相关资源: 知识蒸馏工具TextBrewer:https://github.com/airaria/TextBrewe
A full pipeline AutoML tool for tabular data
HyperGBM Doc | 中文 We Are Hiring! Dear folks,we are offering challenging opportunities located in Beijing for both professionals and students who are k
Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit
STORM Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit [Install Instructions] [Paper] [Website] This package contains code
Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.
Welcome to Healthsea ✨ Create better access to health with spaCy. Healthsea is a pipeline for analyzing user reviews to supplement products by extract
Transform a Google Drive server into a VFX pipeline ready server
Google Drive VFX Server VFX Pipeline About The Project Quick tutorial to setup a Google Drive Server for multiple machines access, and VFX Pipeline on
ONT Analysis Toolkit (OAT)
A toolkit for monitoring ONT MinION sequencing, followed by data analysis, for viral genomes amplified with tiled amplicon sequencing.
Malaya-Speech is a Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow.
Malaya-Speech is a Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow. Documentation Proper documentation is available at
Allele-specific pipeline for unbiased read mapping(WIP), QTL discovery(WIP), and allelic-imbalance analysis
WASP2 (Currently in pre-development): Allele-specific pipeline for unbiased read mapping(WIP), QTL discovery(WIP), and allelic-imbalance analysis Requ
A toolkit for document-level event extraction, containing some SOTA model implementations
❤️ A Toolkit for Document-level Event Extraction with & without Triggers Hi, there 👋 . Thanks for your stay in this repo. This project aims at buildi
A computational block to solve entity alignment over textual attributes in a knowledge graph creation pipeline.
How to apply? Create your config.ini file following the example provided in config.ini Choose one of the options below to run: Run with Python3 pip in
pandas: powerful Python data analysis toolkit
pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive.
SimpleITK is an image analysis toolkit with a large number of components supporting general filtering operations, image segmentation and registration
SimpleITK is an image analysis toolkit with a large number of components supporting general filtering operations, image segmentation and registration
MIT version of the PyMca XRF Toolkit
PyMca This is the MIT version of the PyMca XRF Toolkit. Please read the LICENSE file for details. Installation Ready-to-use packages are available for
Estoult - a Python toolkit for data mapping with an integrated query builder for SQL databases
Estoult Estoult is a Python toolkit for data mapping with an integrated query builder for SQL databases. It currently supports MySQL, PostgreSQL, and
A real-time financial data streaming pipeline and visualization platform using Apache Kafka, Cassandra, and Bokeh.
Realtime Financial Market Data Visualization and Analysis Introduction This repo shows my project about real-time stock data pipeline. All the code is
A toolkit for document-level event extraction, containing some SOTA model implementations
Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker Source code for ACL-IJCNLP 2021 Long paper: Document-le
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.
Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m
Multiview Dataset Toolkit
Multiview Dataset Toolkit Using multi-view cameras is a natural way to obtain a complete point cloud. However, there is to date only one multi-view 3D
Dnspython is a DNS toolkit for Python.
dnspython is a DNS toolkit for Python. It supports almost all record types. It can be used for queries, zone transfers, and dynamic updates. It supports TSIG authenticated messages and EDNS0.
RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.
Randomised controlled trial abstract result tabulator RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into
The geospatial toolkit for redistricting data.
maup maup is the geospatial toolkit for redistricting data. The package streamlines the basic workflows that arise when working with blocks, precincts
Intercepting proxy + analysis toolkit for Second Life compatible virtual worlds
Hippolyzer Hippolyzer is a revival of Linden Lab's PyOGP library targeting modern Python 3, with a focus on debugging issues in Second Life-compatible
wxPython's Project Phoenix. A new implementation of wxPython, better, stronger, faster than he was before.
wxPython Project Phoenix Introduction Welcome to wxPython's Project Phoenix! Phoenix is the improved next-generation wxPython, "better, stronger, fast
Microsoft Azure provides a wide number of services for managing and storing data
Microsoft Azure provides a wide number of services for managing and storing data. One product is Microsoft Azure SQL. Which gives us the capability to create and manage instances of SQL Servers hosted in the cloud. This project, demonstrates how to use these services to manage data we collect from different sources.
An efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning"
MMGEN-FaceStylor English | 简体中文 Introduction This repo is an efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits
Python Toolkit containing different Cyber Attacks Tools
Helikopter Python Toolkit containing different Cyber Attacks Tools. Tools in Helikopter Toolkit 1. FattyNigger (PYTHON WORM) 2. Taxes (PYTHON PASS EXT
The sarge package provides a wrapper for subprocess which provides command pipeline functionality.
Overview The sarge package provides a wrapper for subprocess which provides command pipeline functionality. This package leverages subprocess to provi
Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.
Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.
Logan is a toolkit for building standalone Django applications
Logan Logan is a toolkit for running standalone Django applications. It provides you with tools to create a CLI runner, manage settings, and the abili
MapReader: A computer vision pipeline for the semantic exploration of maps at scale
MapReader A computer vision pipeline for the semantic exploration of maps at scale MapReader is an end-to-end computer vision (CV) pipeline designed b
This repository contains full machine learning pipeline of the Zillow Houses competition on Kaggle platform.
Zillow-Houses This repository contains full machine learning pipeline of the Zillow Houses competition on Kaggle platform. Pipeline is consists of 10
Machine Learning Toolkit for Kubernetes
Kubeflow the cloud-native platform for machine learning operations - pipelines, training and deployment. Documentation Please refer to the official do
QML: A Python Toolkit for Quantum Machine Learning
QML is a Python2/3-compatible toolkit for representation learning of properties of molecules and solids.
Demonstrate a Dataflow pipeline that saves data from an API into BigQuery table
Overview dataflow-mvp provides a basic example pipeline that pulls data from an API and writes it to a BigQuery table using GCP's Dataflow (i.e., Apac
PyTorch toolkit for biomedical imaging
farabio is a minimal PyTorch toolkit for out-of-the-box deep learning support in biomedical imaging. For further information, see Wikis and Docs.
Pipeline for training LSA models using Scikit-Learn.
Latent Semantic Analysis Pipeline for training LSA models using Scikit-Learn. Usage Instead of writing custom code for latent semantic analysis, you j
Toolkit for developing and maintaining ML models
modelkit Python framework for production ML systems. modelkit is a minimalist yet powerful MLOps library for Python, built for people who want to depl
Python toolkit for defining+simulating+visualizing+analyzing attractors, dynamical systems, iterated function systems, roulette curves, and more
Attractors A small module that provides functions and classes for very efficient simulation and rendering of iterated function systems; dynamical syst
A kedro-plugin to serve Kedro Pipelines as API
General informations Software repository Latest release Total downloads Pypi Code health Branch Tests Coverage Links Documentation Deployment Activity
A unified 3D Transformer Pipeline for visual synthesis
Overview This is the official repo for the paper: NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion. NÜWA is a unified multimodal p
Agile Threat Modeling Toolkit
Threagile is an open-source toolkit for agile threat modeling:
MapReader: A computer vision pipeline for the semantic exploration of maps at scale
MapReader A computer vision pipeline for the semantic exploration of maps at scale MapReader is an end-to-end computer vision (CV) pipeline designed b
Pmapper is a super-resolution and deconvolution toolkit for python 3.6+
pmapper pmapper is a super-resolution and deconvolution toolkit for python 3.6+. PMAP stands for Poisson Maximum A-Posteriori, a highly flexible and a
PowerApps-docstring is a console based, pipeline ready application that automatically generates user and technical documentation for Power Apps.
powerapps-docstring PowerApps-docstring is a console based, pipeline ready application that automatically generates user and technical documentation f
Modern Denial-of-service ToolKit for python
💣 Impulse Modern Denial-of-service ToolKit 💻 Main window 📡 Methods: Method Target Description SMS PHONE Sends a massive amount of SMS messages and
APIlocal_dbAWS_RDS - Disclaimer! All data used is for educational purposes only.
APIlocal_dbAWS_RDS Disclaimer! All data used is for educational purposes only. ETL pipeline diagram. Aim of project By creating a fully working pipe
Production First and Production Ready End-to-End Keyword Spotting Toolkit
WeKws Production First and Production Ready End-to-End Keyword Spotting Toolkit. The goal of this toolkit it to... Small footprint keyword spotting (K
A procedural Blender pipeline for photorealistic training image generation
BlenderProc2 A procedural Blender pipeline for photorealistic rendering. Documentation | Tutorials | Examples | ArXiv paper | Workshop paper Features
This is a Docker-based pipeline for preparing sextractor-ready multiwavelength images
Pipeline for creating NB422-detected (ODI) catalog The repository contains a Docker-based pipeline for preprocessing observational data. The pipeline
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
Pyserini Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. Retrieval using sparse re
Pipeline to convert a haploid assembly into diploid
HapDup (haplotype duplicator) is a pipeline to convert a haploid long read assembly into a dual diploid assembly. The reconstructed haplotypes
Scikit-Learn useful pre-defined Pipelines Hub
Scikit-Pipes Scikit-Learn useful pre-defined Pipelines Hub Usage: Install scikit-pipes It's advised to install sklearn-genetic using a virtual env, in
📚 Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.
papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks. Papermill lets you: parameterize notebooks execute notebooks This
A DSL for data-driven computational pipelines
"Dataflow variables are spectacularly expressive in concurrent programming" Henri E. Bal , Jennifer G. Steiner , Andrew S. Tanenbaum Quick overview Ne
A unified 3D Transformer Pipeline for visual synthesis
Overview This is the official repo for the paper: "NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion". NÜWA is a unified multimodal
🧪 Cutting-edge experimental spaCy components and features
spacy-experimental: Cutting-edge experimental spaCy components and features This package includes experimental components and features for spaCy v3.x,
MCML is a toolkit for semi-supervised dimensionality reduction and quantitative analysis of Multi-Class, Multi-Label data
MCML is a toolkit for semi-supervised dimensionality reduction and quantitative analysis of Multi-Class, Multi-Label data. We demonstrate its use
Crypto Stats and Tweets Data Pipeline using Airflow
Crypto Stats and Tweets Data Pipeline using Airflow Introduction Project Overview This project was brought upon through Udacity's nanodegree program.
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.
A Powerful Serverless Analysis Toolkit That Takes Trial And Error Out of Machine Learning Projects
KXY: A Seemless API to 10x The Productivity of Machine Learning Engineers Documentation https://www.kxy.ai/reference/ Installation From PyPi: pip inst
This toolkit provides codes to download and pre-process the SLUE datasets, train the baseline models, and evaluate SLUE tasks.
slue-toolkit We introduce Spoken Language Understanding Evaluation (SLUE) benchmark. This toolkit provides codes to download and pre-process the SLUE
DA2Lite is an automated model compression toolkit for PyTorch.
DA2Lite (Deep Architecture to Lite) is a toolkit to compress and accelerate deep network models. ⭐ Star us on GitHub — it helps!! Frameworks & Librari
BookNLP, a natural language processing pipeline for books
BookNLP BookNLP is a natural language processing pipeline that scales to books and other long documents (in English), including: Part-of-speech taggin
CertPy is a high level toolkit for generating x509 (e.g. SSL/TLS/HTTPS) certificates in Python.
CertPy CertPy is a high level toolkit for generating x509 (e.g. SSL/TLS/HTTPS) certificates in Python. Certificate “profiles” are implemented as Pytho
A toolkit for web reconnaissance, it's fast and easy to use.
A toolkit for web reconnaissance, it's fast and easy to use. File Structure httpsuite/ main.py init.py db/ db.py init.py subdomains_db directories_db
Microscopy Image Cytometry Toolkit
Cytokit Cytokit is a collection of tools for quantifying and analyzing properties of individual cells in large fluorescent microscopy datasets with a
AI Toolkit for Healthcare Imaging
Medical Open Network for AI MONAI is a PyTorch-based, open-source framework for deep learning in healthcare imaging, part of PyTorch Ecosystem. Its am
Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.
Predicting Yelp Review Quality Table of Contents Introduction Motivation Goal and Central Questions The Data Data Storage and ETL EDA Data Pipeline Da
Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.
Trading Tesla with Machine Learning and Sentiment Analysis An interactive program to train a Random Forest Classifier to predict Tesla daily prices us
Full automated data pipeline using docker images
Create postgres tables from CSV files This first section is only relate to creating tables from CSV files using postgres container alone. Just one of
Building house price data pipelines with Apache Beam and Spark on GCP
This project contains the process from building a web crawler to extract the raw data of house price to create ETL pipelines using Google Could Platform services.
Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.
Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.
A simple python application for running a CI pipeline locally
A simple python application for running a CI pipeline locally This app currently supports GitLab CI scripts
RedFish API Toolkit
RedFish API Toolkit RedFish API Toolkit Build Status GitHub Actions Travis CI Requirements requirements.txt requirements-dev.txt Dependencies Document
Project issue to website data transformation toolkit
braintransform Project issue to website data transformation toolkit. Introduction The purpose of these scripts is to be able to dynamically generate t
Two phase pipeline + StreamlitTwo phase pipeline + Streamlit
Two phase pipeline + Streamlit This is an example project that demonstrates how to create a pipeline that consists of two phases of execution. In betw
A data preprocessing and feature engineering script for a machine learning pipeline is prepared.
FEATURE ENGINEERING Business Problem: A data preprocessing and feature engineering script for a machine learning pipeline needs to be prepared. It is
Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared
Feature-Engineering Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared. When the dataset
A simple python application for running a CI pipeline locally This app currently supports GitLab CI scripts
🏃 Simple Local CI Runner 🏃 A simple python application for running a CI pipeline locally This app currently supports GitLab CI scripts ⚙️ Setup Inst
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
Non-Metric Space Library (NMSLIB) Important Notes NMSLIB is generic but fast, see the results of ANN benchmarks. A standalone implementation of our fa
A deep-learning pipeline for segmentation of ambiguous microscopic images.
Welcome to Official repository of deepflash2 - a deep-learning pipeline for segmentation of ambiguous microscopic images. Quick Start in 30 seconds se
An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
NNI Doc | 简体中文 NNI (Neural Network Intelligence) is a lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture
A CV toolkit for my papers.
PyTorch-Encoding created by Hang Zhang Documentation Please visit the Docs for detail instructions of installation and usage. Please visit the link to