3168 Repositories
Python data-annotation-tools Libraries
Self-attentive task GAN for space domain awareness data augmentation.
SATGAN TODO: update the article URL once published. Article about this implementation The self-attentive task generative adversarial network (SATGAN) le
This library is an ongoing effort to enable data exchange between Java/Scala and Python
PyJava This library is an ongoing effort to enable data exchange between Java/Scala and Python
A number of methods for performing natural language processing on live data derived from Twitter
neo is a great tool for binary exploitation
neo is a great tool for binary exploitation. Instead of running several tasks across many tools and windows, you can now automate them in one tool, in one session. Enjoy it
A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes".
CoAtNet Overview This is a PyTorch implementation of CoAtNet specified in "CoAtNet: Marrying Convolution and Attention for All Data Sizes", arXiv 2021
A simple CLI productivity tool to quickly display the syntax of a desired piece of code
Iforgor Iforgor is a customisable and easy to use command line tool to manage code samples. It's a good way to quickly get your hand on syntax you don
Project issue to website data transformation toolkit
braintransform Project issue to website data transformation toolkit. Introduction The purpose of these scripts is to be able to dynamically generate t
A simple IP-tracking tool written in Python
Spyrod-v2 A simple IP-tracking tool written in Python Installation apt install git -y cd $HOME git clone https://github.com/Euronymou5
Simulation code and tutorial for BBHnet training data
Simulation Dataset for BBHnet NOTE: OLD README, UPDATE IN PROGRESS We generate simulation dataset to train BBHnet, our deep learning framework for det
Libraries, tools and tasks created and used at DeepMind Robotics.
dm_robotics: Libraries, tools, and tasks created and used for Robotics research at DeepMind. Package overview Package Summary Transformations Rigid bo
Greenery - tools for parsing and manipulating regular expressions
Clip a Bing Maps background as an RGB GeoTIFF image using the center point from the vector data of a shapefile and a Bing Maps zoom level
Clip a Bing Maps background as an RGB GeoTIFF image using the center point from the vector data of a shapefile and a Bing Maps zoom level. Also rasterize the shapefile vectors as a corresponding label image.
PyTorch Kafka Dataset: A definition of a dataset to get training data from Kafka.
GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data
GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data By Shuchang Zhou, Taihong Xiao, Yi Yang, Dieqiao Feng, Qinyao He, W
Code and data for paper "Deep Photo Style Transfer"
deep-photo-styletransfer Code and data for paper "Deep Photo Style Transfer" Disclaimer This software is published for academic and non-commercial use
A data preprocessing and feature engineering script for a machine learning pipeline is prepared.
FEATURE ENGINEERING Business Problem: A data preprocessing and feature engineering script for a machine learning pipeline needs to be prepared. It is
A data preprocessing and variable engineering script needs to be prepared for a machine learning pipeline
Feature-Engineering A data preprocessing and variable engineering script needs to be prepared for a machine learning pipeline. When the dataset
This repository, forked from NVlabs, uses our data. (Differentiable rasterization applied to 3D model simplification tasks)
nvdiffmodeling [origin_code] Differentiable rasterization applied to 3D model simplification tasks, as described in the paper: Appearance-Driven Autom
Extract room types, doors, neighbouring rooms, room corners and bounding boxes, and generate a graph from the rplan dataset
Housegan-data-reader House-GAN++ (data-reader) Code and instructions for converting rplan dataset (raster images) to housegan++ data format. House-GAN
A Semi-Intelligent ChatBot filled with statistical and economical data for the Premier League.
MONEYBALL - ChatBot Module: 4006CEM, Class: B, Group: 5 Contributors: Jonas Djondo Roshan Kc Cole Samson Daniel Rodrigues Ihteshaam Naseer Kind remind
Data Utilities e.g. for importing files to onetask
Use this repository to easily convert your source files (csv, txt, excel, json, html) into record-oriented JSON files that can be uploaded into onetask.
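As a rough illustration of what "record-oriented JSON" can mean for a CSV source, here is a minimal sketch using only the Python standard library. It is not the repository's own code: the function name `csv_to_records` and the sample data are hypothetical, and it assumes the first CSV row is a header.

```python
import csv
import io
import json
from typing import Dict, List


def csv_to_records(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample input; every value stays a string, as csv yields them.
sample_csv = "id,text\n1,hello\n2,world\n"
records = csv_to_records(sample_csv)

# Serialize the list of records as a JSON array of objects.
records_json = json.dumps(records, indent=2)
```

Each CSV row becomes one JSON object, so the output is a JSON array with one record per row.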
Finding Label and Model Errors in Perception Data With Learned Observation Assertions
Finding Label and Model Errors in Perception Data With Learned Observation Assertions This is the project page for Finding Label and Model Errors in P
Powerful and efficient Computer Vision Annotation Tool (CVAT)
Computer Vision Annotation Tool (CVAT) CVAT is a free, online, interactive video and image annotation tool for computer vision. It is being used by our
🐦 Quickly annotate data from the comfort of your Jupyter notebook
🐦 pigeon - Quickly annotate data on Jupyter Pigeon is a simple widget that lets you quickly annotate a dataset of unlabeled examples from the comfort
This is the code used in the paper "Entity Embeddings of Categorical Variables".
This is the code used in the paper "Entity Embeddings of Categorical Variables". If you want to get the original version of the code used for the Kagg
A scikit-learn-compatible module for estimating prediction intervals.
MAPIE - Model Agnostic Prediction Interval Estimator MAPIE allows you to easily estimate prediction intervals (or prediction sets) using your favourit
PyClustering is a Python, C++ data mining library.
pyclustering is a Python, C++ data mining library (clustering algorithm, oscillatory networks, neural networks). The library provides Python and C++ implementations (C++ pyclustering library) of each algorithm or model. C++ pyclustering library is a part of pyclustering and supported for Linux, Windows and MacOS operating systems.
The Fundamental Clustering Problems Suite (FCPS) summarises 54 state-of-the-art clustering algorithms, common cluster challenges, and estimations of the number of clusters, as well as tests for cluster tendency.
FCPS Fundamental Clustering Problems Suite The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning pu
t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology.
tree-SNE t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology. Building on recent advances in s
Subpopulation detection in high-dimensional single-cell data
PhenoGraph for Python3 PhenoGraph is a clustering method designed for high-dimensional single-cell data. It works by creating a graph ("network") repr
Kats is a kit to analyze time series data: a lightweight, easy-to-use, generalizable, and extendable framework for time series analysis, from understanding key statistics and characteristics and detecting change points and anomalies to forecasting future trends.
Description Kats is a toolkit to analyze time series data, a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Ti
Contains an implementation (sklearn API) of the algorithm proposed in "GENDIS: GEnetic DIscovery of Shapelets" and code to reproduce all experiments.
GENDIS GENetic DIscovery of Shapelets In the time series classification domain, shapelets are small subseries that are discriminative for a certain cl
Calling Julia from Python - an experiment on data loading
Calling Julia from Python - an experiment on data loading See the slides. TLDR After reading Patrick's blog post, we decided to try to replace C++ wit
Converts Tagtog annotations into a CSV dataset
tagtog_relation_extraction Converts Tagtog annotations into a CSV dataset How to Use On Tagtog 1. Go to Project Downloads 2. Download all documents,
🛠️ Tools for Transformers compression using Lightning ⚡
Bert-squeeze is a repository aiming to provide code to reduce the size of Transformer-based models or decrease their latency at inference time.
The audio-video synchronization of MKV Container Format is exploited to achieve data hiding
The audio-video synchronization of MKV Container Format is exploited to achieve data hiding, where the hidden data can be utilized for various management purposes, including hyper-linking, annotation, and authentication
Urban Big Data Centre Housing Sensor Project
Housing Sensor Project The Urban Big Data Centre is conducting a study of indoor environmental data in Scottish houses. We are using Raspberry Pi devi
Pyfunctools is a module that provides functions, methods and classes that help in the creation of projects in python
Pyfunctools Pyfunctools is a module that provides functions, methods and classes that help in the creation of projects in python, bringing functional
[NeurIPS 2021] Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data (NeurIPS 2021) This repository will provide the official PyTorch implementa
Scrutinizing XAI with linear ground-truth data
This repository contains all the experiments presented in the corresponding paper: "Scrutinizing XAI using linear ground-truth data with suppressor va
Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.
counterfactual-tpp This is a repository containing code and real data for the paper Counterfactual Temporal Point Processes. Pre-requisites This code
Code and data accompanying our SVRHM'21 paper.
Code and data accompanying our SVRHM'21 paper. Requires tensorflow 1.13, python 3.7, scikit-learn, and pytorch 1.6.0 to be installed. Python scripts i
Automated detection of anomalous exoplanet transits in light curve data.
Automatically detecting anomalous exoplanet transits This repository contains the source code for the paper "Automatically detecting anomalous exoplan
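DataCLUE: the first data-centric AI benchmark in China (with model analysis report)
DataCLUE: A Benchmark Suite for Data-centric NLP You can get the English version of README. Data-centric AI benchmark (DataCLUE) Contents guide Section Description Introduction Introduces the data-centric AI benchmark (DataCLUE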
Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings
Text2Music Emotion Embedding Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings Reference Emotion Embedding Spaces for Matching
A collection of online resources to help you on your Tech journey.
Everything Tech Resources & Projects About The Project Coming from an engineering background and looking to up skill yourself on a new field can be di
SASE : Self-Adaptive noise distribution network for Speech Enhancement with heterogeneous data of Cross-Silo Federated learning
SASE : Self-Adaptive noise distribution network for Speech Enhancement with heterogeneous data of Cross-Silo Federated learning We propose a SASE mode
Hobby Project. A Python Library to create and generate static web pages using just python.
PyWeb 🕸️ 🐍 Current Release: 0.1 A Hobby Project 🤓 PyWeb is a small Library to generate customized static web pages using python. Aimed for new deve
FLIR/DJI IR Camera Data Parser, Python Version
FLIR/DJI IR Camera Data Parser, Python Version Parse infrared camera data as NumPy data. Usage Clone this repository and cd thermal_parser. Run pip
Tools for downloading and processing numerical weather predictions
NWP Tools for downloading and processing numerical weather predictions At the moment, this code is focused on downloading historical UKV NWPs produced
Scikit learn library models to account for data and concept drift.
liquid_scikit_learn Scikit learn library models to account for data and concept drift. This python library focuses on solving data drift and concept d
Tools for analyzing Java JVM gc log files
gc_log This package consists of two separate utilities useful for : gc_log_visualizer.py regionsize.py GC Log Visualizer This was updated to run under
Python plugin/extra to load data files from an external source (such as AWS S3) to a local directory
Data Loader Plugin - Python Table of Content (ToC) Data Loader Plugin - Python Table of Content (ToC) Overview References Python module Python virtual
Service for working with open data of the State Duma of the Russian Federation
Service for working with open data of the State Duma of the Russian Federation Source data is extracted from the State Duma API using Apache Nifi and lands in the storage Clickh
MASS (Mueen's Algorithm for Similarity Search) - a python 2 and 3 compatible library used for searching time series sub-sequences under z-normalized Euclidean distance for similarity.
Introduction MASS allows you to search a time series for a subquery resulting in an array of distances. This array of distances enables you to identif
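To make the "array of distances" concrete, here is a naive sketch of a z-normalized Euclidean distance profile in plain Python. It is not MASS itself and not this library's API: MASS computes the same quantity far more efficiently, while this version uses a simple sliding window. The names `z_normalize` and `distance_profile` and the sample data are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def z_normalize(values: Sequence[float]) -> List[float]:
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant window has no shape; map it to all zeros.
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def distance_profile(series: Sequence[float], query: Sequence[float]) -> List[float]:
    """Naive distance of the query to every window of the series.

    Both the query and each window are z-normalized before comparison.
    """
    m = len(query)
    q = z_normalize(query)
    profile = []
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(window, q))))
    return profile


# Hypothetical data: the window [10, 20, 30] has the same shape as the query.
series = [0, 1, 2, 3, 10, 20, 30, 1, 0]
query = [1, 2, 3]
profile = distance_profile(series, query)
best_start = min(range(len(profile)), key=profile.__getitem__)
```

Because of the z-normalization, windows that differ from the query only in offset and scale get a distance close to zero; the smallest entries of `profile` mark the best-matching subsequences.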
ObsPy: A Python Toolbox for seismology/seismological observatories.
ObsPy is an open-source project dedicated to providing a Python framework for processing seismological data. It provides parsers for common file formats
sktime companion package for deep learning based on TensorFlow
NOTE: sktime-dl is currently being updated to work correctly with sktime 0.6, and will be fully relaunched over the summer. The plan is Refactor and
Luminaire is a python package that provides ML driven solutions for monitoring time series data.
A hands-off Anomaly Detection Library Table of contents What is Luminaire Quick Start Time Series Outlier Detection Workflow Anomaly Detection for Hig
Time Series Cross-Validation -- an extension for scikit-learn
TSCV: Time Series Cross-Validation This repository is a scikit-learn extension for time series cross-validation. It introduces gaps between the traini
Python library to download market data via Bloomberg, Eikon, Quandl, Yahoo etc.
findatapy findatapy creates an easy to use Python API to download market data from many sources including Quandl, Bloomberg, Yahoo, Google etc. using
Deep Survival Machines - Fully Parametric Survival Regression
Package: dsm Python package dsm provides an API to train the Deep Survival Machines and associated models for problems in survival analysis. The under
A framework for using LSTMs to detect anomalies in multivariate time series data. Includes spacecraft anomaly data and experiments from the Mars Science Laboratory and SMAP missions.
Telemanom (v2.0) v2.0 updates: Vectorized operations via numpy Object-oriented restructure, improved organization Merge branches into single branch fo
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
DoWhy | An end-to-end library for causal inference Amit Sharma, Emre Kiciman Introducing DoWhy and the 4 steps of causal inference | Microsoft Researc
Responsible Machine Learning with Python
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
LOFO (Leave One Feature Out) Importance calculates the importances of a set of features based on a metric of choice,
LOFO (Leave One Feature Out) Importance calculates the importances of a set of features based on a metric of choice, for a model of choice, by iteratively removing each feature from the set, and evaluating the performance of the model, with a validation scheme of choice, based on the chosen metric.
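The procedure described above can be sketched in a few lines. This is a simplified illustration, not the LOFO library's API: `lofo_importance` and `toy_evaluate` are hypothetical names, and the `evaluate` callback stands in for "train the chosen model on these features and return the chosen metric under the chosen validation scheme". The sketch assumes a higher score is better and defines importance as the drop in score when a feature is removed.

```python
from typing import Callable, Dict, List, Sequence


def lofo_importance(
    features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Leave-one-feature-out importance.

    `evaluate` receives a list of feature names and returns a validation
    score (higher is better). Importance is the baseline score with all
    features minus the score with that one feature left out.
    """
    baseline = evaluate(list(features))
    importances = {}
    for feature in features:
        reduced = [f for f in features if f != feature]
        importances[feature] = baseline - evaluate(reduced)
    return importances


# Toy stand-in for model training and validation (an assumption for the
# example): each feature adds a fixed amount to the score.
TOY_CONTRIBUTION = {"age": 0.25, "income": 0.5, "noise": 0.0}


def toy_evaluate(subset: List[str]) -> float:
    return sum(TOY_CONTRIBUTION[f] for f in subset)


result = lofo_importance(list(TOY_CONTRIBUTION), toy_evaluate)
```

In a real setting `evaluate` would retrain the model for each reduced feature set, so the cost grows linearly with the number of features; a feature with importance near zero or below is a candidate for removal.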
DrWhy is the collection of tools for eXplainable AI (XAI). It's based on shared principles and simple grammar for exploration, explanation and visualisation of predictive models.
Responsible Machine Learning With Great Power Comes Great Responsibility. Voltaire (well, maybe) How to develop machine learning models in a responsib
moDel Agnostic Language for Exploration and eXplanation
moDel Agnostic Language for Exploration and eXplanation Overview Unverified black box model is the path to the failure. Opaqueness leads to distrust.
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
NNI Doc | 简体中文 NNI (Neural Network Intelligence) is a lightweight but powerful toolkit to help users automate Feature Engineering, Neural Architecture
🌊 River is a Python library for online machine learning.
River is a Python library for online machine learning. It is the result of a merger between creme and scikit-multiflow. River's ambition is to be the go-to library for doing machine learning on streaming data.
A tool that updates all your project's Python dependency files through Pull Requests on GitHub/GitLab.
A tool that updates all your project's Python dependency files through Pull Requests on GitHub/GitLab. About This repo contains the bot that is runnin
Hangar is version control for tensor data. Commit, branch, merge, revert, and collaborate in the data-defined software era.
Overview docs tests package Hangar is version control for tensor data. Commit, branch, merge, revert, and collaborate in the data-defined software era
Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
sklearn-porter Transpile trained scikit-learn estimators to C, Java, JavaScript and others. It's recommended for limited embedded systems and critical
ModelChimp is an experiment tracker for Deep Learning and Machine Learning experiments.
ModelChimp What is ModelChimp? ModelChimp is an experiment tracker for Deep Learning and Machine Learning experiments. ModelChimp provides the followi
An orchestration platform for the development, production, and observation of data assets.
Dagster An orchestration platform for the development, production, and observation of data assets. Dagster lets you define jobs in terms of the data f
Metaflow is a human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects
Metaflow Metaflow is a human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects. Metaflow
Handle, manipulate, and convert data with units in Python
unyt A package for handling numpy arrays with units. Often writing code that deals with data that has units can be confusing. A function might return
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
List of Data Science Cheatsheets to rule the world
Data Science Cheatsheets List of Data Science Cheatsheets to rule the world. Table of Contents Business Science Business Science Problem Framework Dat
Fast and customizable vulnerability scanner For JIRA written in Python
Fast and customizable vulnerability scanner For JIRA. 🤔 What is this? Jira-Lens 🔍 is a Python Based vulnerability Scanner for JIRA. Jira is a propri
A Kitti Road Segmentation model implemented in tensorflow.
KittiSeg KittiSeg performs segmentation of roads by utilizing an FCN based model. The model achieved first place on the Kitti Road Detection Benchmark
PyTorch implementation of Federated Learning with Non-IID Data, and federated learning algorithms, including FedAvg, FedProx.
Federated Learning with Non-IID Data This is an implementation of the following paper: Yue Zhao, Meng Li, Liangzhen Lai, Naveen Suda, Damon Civin, Vik
Identifies the faulty wafer before it can be used for the fabrication of integrated circuits and, in photovoltaics, to manufacture solar cells.
Identifies the faulty wafer before it can be used for the fabrication of integrated circuits and, in photovoltaics, to manufacture solar cells. The project retrains itself after every prediction, making it more robust and generalized over time.
A package to fetch Sentinel-2 satellite data from Google.
Sentinel 2 Data Fetcher Installation Create a Virtual Environment and activate it. python3 -m venv venv . venv/bin/activate Install the Package via pi
Very simple encoding scheme that will encode data as a series of OwOs or UwUs.
OwO Encoder Very simple encoding scheme that will encode data as a series of OwOs or UwUs. The encoder is a simple state machine. Still needs a decode
Python based framework for Automatic AI for Regression and Classification over numerical data.
Python based framework for Automatic AI for Regression and Classification over numerical data. Performs model search, hyper-parameter tuning, and high-quality Jupyter Notebook code generation.
Sentiment analysis on streaming twitter data using Spark Structured Streaming & Python
Sentiment analysis on streaming twitter data using Spark Structured Streaming & Python This project is a good starting point for those who have little
A meta plugin for processing timelapse data timepoint by timepoint in napari
napari-time-slicer A meta plugin for processing timelapse data timepoint by timepoint. It enables a list of napari plugins to process 2D+t or 3D+t dat
Data and code for the paper "Importance of Kernel Bandwidth in Quantum Machine Learning"
Reproducibility materials for "Importance of Kernel Bandwidth in Quantum Machine Learning" Repo structure: code contains Python scripts used to genera
Understanding the Generalization Benefit of Model Invariance from a Data Perspective
Understanding the Generalization Benefit of Model Invariance from a Data Perspective This is the code for our NeurIPS2021 paper "Understanding the Gen
RMTD: Robust Moving Target Defence Against False Data Injection Attacks in Power Grids
RMTD: Robust Moving Target Defence Against False Data Injection Attacks in Power Grids Real-time detection performance. This repo contains the code an
Phishing-Crack tools to punish friends
Phishing-Crack Phishing Tool Version 1.0.0 Created By temirovazat A Phishing Tool With PHP and Python3 Features Fake Instagram Phishing Page Fake Face
Important dataframe statistics with a single command
quick_eda Receiving dataframe statistics with one command Project description A python package for Data Scientists, Students, ML Engineers and anyone
AWS Lambda - Parsing Cloudwatch Data and sending the response via email.
AWS Lambda - Parsing Cloudwatch Data and sending the response via email. Author: Evan Erickson Language: Python Backend: AWS / Serverless / AWS Lambda
Data Scientist in Simple Stock Analysis of PT Bukalapak.com Tbk for Long Term Investment
Data Scientist in Simple Stock Analysis of PT Bukalapak.com Tbk for Long Term Investment Brief explanation of PT Bukalapak.com Tbk Bukalapak was found
App to get data from popular Polish pages with job offers
Job board parser I wrote a simple app to get data from popular pages with job offers, because I wanted to know immediately if there is some new offe
Created a COVID data pipeline using PySpark and MySQL that collects a data stream from an API, processes it, and stores it in a MySQL database.
Single machine, multiple cards training; mixed-precision training; DALI data loader.
Template Script Category Description Category script comparison script train.py, loader.py for single-machine-multiple-cards training train_DP.py, tra
A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.
Early version for manipulating geolocation data through a REST API.
Backend for obtaining the data (beta) Description The server is designed to receive and store data sent as JSON by an applicatio