3255 Python Probabilistic-data-analysis Libraries

An open source utility for creating publication quality LaTex figures generated from OpenFOAM data files.

foamTEX An open source utility for creating publication quality LaTex figures generated from OpenFOAM data files. Explore the docs » Report Bug · Requ

1 Dec 19, 2021

Data Applications Project

DBMS project- Hotel Franchise Data and application project By TEAM Kurukunda Bhargavi Pamulapati Pallavi Greeshma Amaraneni What is this project about

1 Nov 28, 2021

Sheet Data Image/PDF-to-CSV Converter

5 Nov 22, 2021

Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,

906 Dec 30, 2022

A single model for shaping, creating, accessing, storing data within a Database

'db' within pydantic - A single model for shaping, creating, accessing, storing data within a Database Key Features Integrated Redis Caching Support A

178 Dec 16, 2022

Image classification for projects and researches

This is a tool to help you quickly solve classification problems including: data analysis, training, report results and model explanation.

2 Dec 27, 2021

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble

datasketch: Big Data Looks Small datasketch gives you probabilistic data structures that can process and search very large amount of data super fast,

1.9k Jan 7, 2023

Single cell current best practices tutorial case study for the paper:Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

Scripts for "Current best-practices in single-cell RNA-seq: a tutorial" This repository is complementary to the publication: M.D. Luecken, F.J. Theis,

968 Dec 28, 2022

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

AugMix Introduction We propose AugMix, a data processing technique that mixes augmented images and enforces consistent embeddings of the augmented ima

876 Dec 17, 2022

How to use TensorLayer

How to use TensorLayer While research in Deep Learning continues to improve the world, we use a bunch of tricks to implement algorithms with TensorLay

349 Dec 7, 2022

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •

21.1k Dec 29, 2022

This project uses Youtube data API's to do youtube tags analysis based on viewCount, comments etc.

Youtube video details analyser Steps to run this project Please set the AuthKey which you can fetch from google developer console and paste it in the

1 Nov 21, 2021

zeus is a Python implementation of the Ensemble Slice Sampling method.

zeus is a Python implementation of the Ensemble Slice Sampling method. Fast & Robust Bayesian Inference, Efficient Markov Chain Monte Carlo (MCMC), Bl

197 Dec 4, 2022

Source code and notebooks to reproduce experiments and benchmarks on Bias Faces in the Wild (BFW).

Face Recognition: Too Bias, or Not Too Bias? Robinson, Joseph P., Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner. "Face recognition:

41 Dec 12, 2022

Python module for performing linear regression for data with measurement errors and intrinsic scatter

Linear regression for data with measurement errors and intrinsic scatter (BCES) Python module for performing robust linear regression on (X,Y) data po

56 Sep 27, 2022

Projeto: Machine Learning: Linguagens de Programacao 2004-2001

Projeto: Machine Learning: Linguagens de Programacao 2004-2001 Projeto de Data Science e Machine Learning de análise de linguagens de programação de 2

0 Jun 29, 2021

Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.

Predicting Yelp Review Quality Table of Contents Introduction Motivation Goal and Central Questions The Data Data Storage and ETL EDA Data Pipeline Da

3 Nov 27, 2022

Tutorials, assignments, and competitions for MIT Deep Learning related courses.

MIT Deep Learning This repository is a collection of tutorials for MIT Deep Learning courses. More added as courses progress. Tutorial: Deep Learning

9.5k Jan 7, 2023

Python solutions to solve practical business problems.

Python Business Analytics Also instead of "watching" you can join the link-letter, it's already being sent out to about 90 people and you are free to

357 Dec 26, 2022

Python Data Science Handbook: full text in Jupyter Notebooks

Python Data Science Handbook This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks. How to Use th

36.9k Dec 28, 2022

Design and build a wrapper for the Open Weather API current weather data service

Design and build a wrapper for the Open Weather API current weather data service that returns a city's temperature, with caching, also allowing for the temperature of the latest queried cities that are still validly cached to be retrieved.

1 Jun 27, 2022

Optimal Randomized Canonical Correlation Analysis

ORCCA Optimal Randomized Canonical Correlation Analysis This project is for the python version of ORCCA algorithm. It depends on Numpy for matrix calc

1 Nov 21, 2021

Empyrial is a Python-based open-source quantitative investment library dedicated to financial institutions and retail investors

By Investors, For Investors. Want to read this in Chinese? Click here Empyrial is a Python-based open-source quantitative investment library dedicated

640 Dec 31, 2022

Clinica is a software platform for clinical research studies involving patients with neurological and psychiatric diseases and the acquisition of multimodal data

Clinica Software platform for clinical neuroimaging studies Homepage | Documentation | Paper | Forum | See also: AD-ML, AD-DL ClinicaDL About The Proj

165 Dec 29, 2022

MRQy is a quality assurance and checking tool for quantitative assessment of magnetic resonance imaging (MRI) data.

Front-end View Backend View Table of Contents Description Prerequisites Running Basic Information Measurements User Interface Feedback and usage Descr

Center for Computational Imaging and Personalized Diagnostics

58 Dec 2, 2022

Python Automated Machine Learning library for tabular data.

Simple but powerful Automated Machine Learning library for tabular data. It uses efficient in-memory SAP HANA algorithms to automate routine Data Scie

47 Dec 17, 2022

Data Version Control or DVC is an open-source tool for data science and machine learning projects

Continuous Machine Learning project integration with DVC Data Version Control or DVC is an open-source tool for data science and machine learning proj

2 Jul 29, 2021

A machine learning model for analyzing text for user sentiment and determine whether its a positive, neutral, or negative review.

Sentiment Analysis on Yelp's Dataset Author: Roberto Sanchez, Talent Path: D1 Group Docker Deployment: Deployment of this application can be found her

0 Aug 4, 2021

Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.

Trading Tesla with Machine Learning and Sentiment Analysis An interactive program to train a Random Forest Classifier to predict Tesla daily prices us

31 Nov 17, 2022

Introducing neural networks to predict stock prices

IntroNeuralNetworks in Python: A Template Project IntroNeuralNetworks is a project that introduces neural networks and illustrates an example of how o

637 Jan 4, 2023

PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group

14 Nov 13, 2022

An interactive dashboard for visualisation, integration and classification of data using Active Learning.

AstronomicAL An interactive dashboard for visualisation, integration and classification of data using Active Learning. AstronomicAL is a human-in-the-

45 Nov 28, 2022

Full automated data pipeline using docker images

Create postgres tables from CSV files This first section is only relate to creating tables from CSV files using postgres container alone. Just one of

1 Nov 21, 2021

For specific function. For my own convenience. Remind owner to share data to another DITO user.

1 Dec 14, 2021

Building house price data pipelines with Apache Beam and Spark on GCP

This project contains the process from building a web crawler to extract the raw data of house price to create ETL pipelines using Google Could Platform services.

1 Nov 22, 2021

topic modeling on unstructured data in Space news articles retrieved from the Guardian (UK) newspaper using API

NLP Space News Topic Modeling Photos by nasa.gov (1, 2, 3, 4, 5) and extremetech.com Table of Contents Project Idea Data acquisition Primary data sour

1 Jan 3, 2022

Real-time analysis of intracranial neurophysiology recordings.

py_neuromodulation Click this button to run the "Tutorial ML with py_neuro" notebooks: The py_neuromodulation toolbox allows for real time capable pro

Interventional Cognitive Neuromodulation - Neumann Lab Berlin

15 Nov 3, 2022

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

10 May 15, 2022

This project uses unsupervised machine learning to identify correlations between daily inoculation rates in the USA and twitter sentiment in regards to COVID-19.

4 Oct 15, 2022

Accelerating model creation and evaluation.

EmeraldML A machine learning library for streamlining the process of (1) cleaning and splitting data, (2) training, optimizing, and testing various mo

0 Dec 6, 2021

This Repository consists of my solutions in Python 3 to various problems in Data Structures and Algorithms

Problems and it's solutions. Problem solving, a great Speed comes with a good Accuracy. The more Accurate you can write code, the more Speed you will

1.3k Jan 1, 2023

Conversational text Analysis using various NLP techniques

159 Jan 6, 2023

Analyse japanese ebooks using MeCab to determine the difficulty level for japanese learners

japanese-ebook-analysis This aim of this project is to make analysing the contents of a japanese ebook easy and streamline the process for non-technic

14 Jul 23, 2022

BESS: Balanced Evolutionary Semi-Stacking for Disease Detection via Partially Labeled Imbalanced Tongue Data

Balanced-Evolutionary-Semi-Stacking Code for the paper ''BESS: Balanced Evolutionary Semi-Stacking for Disease Detection via Partially Labeled Imbalan

0 Jan 16, 2022

go-cqhttp API typing annoations, return data models and utils for nonebot

6 Jan 4, 2023

Herramienta para transferir eventos de Sucuri WAF hacia Azure Data Tables.

Transfiere eventos de Sucuri hacia Azure Data Tables Script para transferir eventos del Sucuri Web Application Firewall (WAF) hacia Azure Data Tables,

1 Dec 22, 2021

An easy-to-use library for emulating code in minidump files.

dumpulator Note: This is a work-in-progress prototype, please treat it as such. An easy-to-use library for emulating code in minidump files. Example T

362 Dec 31, 2022

FakeDataGen is a Full Valid Fake Data Generator.

FakeDataGen is a Full Valid Fake Data Generator. This tool helps you to create fake accounts (in Spanish format) with fully valid data. Within this in

64 Dec 12, 2022

LinkML based SPARQL template library and execution engine

sparqlfun LinkML based SPARQL template library and execution engine modularized core library of SPARQL templates generic templates using common vocabs

6 Oct 10, 2022

Convert your JSON data to a valid Python object to allow accessing keys with the member access operator(.)

JSONObjectMapper Allows you to transform JSON data into an object whose members can be queried using the member access operator. Unlike json.dumps in

4 Jul 20, 2022

A benchmark of data-centric tasks from across the machine learning lifecycle.

61 Dec 28, 2022

Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.

counterfactual-tpp This is a repository containing code and real data for the paper Counterfactual Temporal Point Processes. Pre-requisites This code

11 Dec 9, 2022

Official PyTorch implementation for "Low Precision Decentralized Distributed Training with Heterogenous Data"

Low Precision Decentralized Training with Heterogenous Data Official PyTorch implementation for "Low Precision Decentralized Distributed Training with

0 Nov 23, 2021

Machine Learning Privacy Meter: A tool to quantify the privacy risks of machine learning models with respect to inference attacks, notably membership inference attacks

ML Privacy Meter Machine learning is playing a central role in automated decision making in a wide range of organization and service providers. The da

Data Privacy and Trustworthy Machine Learning Research Lab

357 Jan 6, 2023

Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA results for single-image motion deblurring, image deraining, image denoising (synthetic and real data), and dual-pixel defocus deblurring.

Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,

906 Dec 30, 2022

Self-attentive task GAN for space domain awareness data augmentation.

SATGAN TODO: update the article URL once published. Article about this implemention The self-attentive task generative adversarial network (SATGAN) le

2 Mar 24, 2022

A multi-platform GUI for bit-based analysis, processing, and visualization

529 Dec 19, 2022

Render tokei's output to interactive sunburst chart.

134 Dec 15, 2022

This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python

PyJava This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python

6 Oct 17, 2022

A number of methods in order to perform Natural Language Processing on live data derived from Twitter

1 Nov 24, 2021

A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes".

CoAtNet Overview This is a PyTorch implementation of CoAtNet specified in "CoAtNet: Marrying Convolution and Attention for All Data Sizes", arXiv 2021

268 Jan 7, 2023

Project issue to website data transformation toolkit

braintransform Project issue to website data transformation toolkit. Introduction The purpose of these scripts is to be able to dynamically generate t

1 Nov 19, 2021

PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short-Term Transformer for Online Action Detection".

Long Short-Term Transformer for Online Action Detection Introduction This is a PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short

77 Dec 16, 2022

Simulation code and tutorial for BBHnet training data

Simulation Dataset for BBHnet NOTE: OLD README, UPDATE IN PROGRESS We generate simulation dataset to train BBHnet, our deep learning framework for det

0 May 31, 2022

Clip Bing Maps backgound as RGB geotif image using center-point from vector data of a shapefile and Bing Maps zoom

Clip Bing Maps backgound as RGB geotif image using center-point from vector data of a shapefile and Bing Maps zoom. Also, rasterize shapefile vectors as corresponding label image.

2 Nov 22, 2021

PyTorch Kafka Dataset: A definition of a dataset to get training data from Kafka.

7 Aug 1, 2022

GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data

GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data By Shuchang Zhou, Taihong Xiao, Yi Yang, Dieqiao Feng, Qinyao He, W

141 Apr 16, 2021

Training, generation, and analysis code for Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics

Location-Aware Generative Adversarial Networks (LAGAN) for Physics Synthesis This repository contains all the code used in L. de Oliveira (@lukedeo),

57 Oct 22, 2022

Code and data for paper "Deep Photo Style Transfer"

deep-photo-styletransfer Code and data for paper "Deep Photo Style Transfer" Disclaimer This software is published for academic and non-commercial use

9.9k Dec 29, 2022

A data preprocessing and feature engineering script for a machine learning pipeline is prepared.

FEATURE ENGINEERING Business Problem: A data preprocessing and feature engineering script for a machine learning pipeline needs to be prepared. It is

7 Dec 18, 2021

Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared

Feature-Engineering Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared. When the dataset

5 Apr 21, 2022

Sentiment Analysis Project using Count Vectorizer and TF-IDF Vectorizer

Sentiment Analysis Project This project contains two sentiment analysis programs for Hotel Reviews using a Hotel Reviews dataset from Datafiniti. The

0 Mar 28, 2022

The repository forked from NVlabs uses our data. (Differentiable rasterization applied to 3D model simplification tasks)

nvdiffmodeling [origin_code] Differentiable rasterization applied to 3D model simplification tasks, as described in the paper: Appearance-Driven Autom

2 Oct 31, 2022

Extract rooms type, door, neibour rooms, rooms corners nad bounding boxes, and generate graph from rplan dataset

Housegan-data-reader House-GAN++ (data-reader) Code and instructions for converting rplan dataset (raster images) to housegan++ data format. House-GAN

13 Nov 24, 2022

A Semi-Intelligent ChatBot filled with statistical and economical data for the Premier League.

MONEYBALL - ChatBot Module: 4006CEM, Class: B, Group: 5 Contributors: Jonas Djondo Roshan Kc Cole Samson Daniel Rodrigues Ihteshaam Naseer Kind remind

1 Nov 18, 2021

Data Utilities e.g. for importing files to onetask

Use this repository to easily convert your source files (csv, txt, excel, json, html) into record-oriented JSON files that can be uploaded into onetask.

1 Jul 18, 2022

Finding Label and Model Errors in Perception Data With Learned Observation Assertions

Finding Label and Model Errors in Perception Data With Learned Observation Assertions This is the project page for Finding Label and Model Errors in P

17 Oct 14, 2022

🐦 Quickly annotate data from the comfort of your Jupyter notebook

🐦 pigeon - Quickly annotate data on Jupyter Pigeon is a simple widget that lets you quickly annotate a dataset of unlabeled examples from the comfort

647 Jan 5, 2023

This is the code used in the paper "Entity Embeddings of Categorical Variables".

This is the code used in the paper "Entity Embeddings of Categorical Variables". If you want to get the original version of the code used for the Kagg

845 Nov 29, 2022

A scikit-learn-compatible module for estimating prediction intervals.

MAPIE - Model Agnostic Prediction Interval Estimator MAPIE allows you to easily estimate prediction intervals (or prediction sets) using your favourit

588 Jan 4, 2023

A high performance implementation of HDBSCAN clustering.

HDBSCAN HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates

2.3k Jan 2, 2023

PyClustering is a Python, C++ data mining library.

pyclustering is a Python, C++ data mining library (clustering algorithm, oscillatory networks, neural networks). The library provides Python and C++ implementations (C++ pyclustering library) of each algorithm or model. C++ pyclustering library is a part of pyclustering and supported for Linux, Windows and MacOS operating systems.

1k Jan 5, 2023

The Fundamental Clustering Problems Suite (FCPS) summaries 54 state-of-the-art clustering algorithms, common cluster challenges and estimations of the number of clusters as well as the testing for cluster tendency.

FCPS Fundamental Clustering Problems Suite The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning pu

9 Nov 27, 2022

t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology.

tree-SNE t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology. Building on recent advances in s

61 Nov 21, 2022

Subpopulation detection in high-dimensional single-cell data

PhenoGraph for Python3 PhenoGraph is a clustering method designed for high-dimensional single-cell data. It works by creating a graph ("network") repr

42 Sep 5, 2022

Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.

Description Kats is a toolkit to analyze time series data, a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Ti

4.1k Jan 9, 2023

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

pmdarima Pmdarima (originally pyramid-arima, for the anagram of 'py' + 'arima') is a statistical library designed to fill the void in Python's time se

1.3k Dec 22, 2022

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

pmdarima Pmdarima (originally pyramid-arima, for the anagram of 'py' + 'arima') is a statistical library designed to fill the void in Python's time se

1.3k Jan 6, 2023

Contains an implementation (sklearn API) of the algorithm proposed in "GENDIS: GEnetic DIscovery of Shapelets" and code to reproduce all experiments.

GENDIS GENetic DIscovery of Shapelets In the time series classification domain, shapelets are small subseries that are discriminative for a certain cl

90 Oct 28, 2022

Calling Julia from Python - an experiment on data loading

Calling Julia from Python - an experiment on data loading See the slides. TLDR After reading Patrick's blog post, we decided to try to replace C++ wit

8 Jun 7, 2022

An experimental Python-to-C transpiler and domain specific language for embedded high-performance computing

562 Dec 28, 2022

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding, where the hidden data can be utilized for various management purposes, including hyper-linking, annotation, and authentication

1 Nov 17, 2021

Dynamical movement primitives (DMPs), probabilistic movement primitives (ProMPs), spatially coupled bimanual DMPs.

Movement Primitives Movement primitives are a common group of policy representations in robotics. There are many different types and variations. This

63 Jan 6, 2023

Urban Big Data Centre Housing Sensor Project

Housing Sensor Project The Urban Big Data Centre is conducting a study of indoor environmental data in Scottish houses. We are using Raspberry Pi devi

2 Dec 13, 2021

[NeurIPS 2021] Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data

Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data (NeurIPS 2021) This repository will provide the official PyTorch implementa

238 Nov 25, 2022

A deep-learning pipeline for segmentation of ambiguous microscopic images.

Welcome to Official repository of deepflash2 - a deep-learning pipeline for segmentation of ambiguous microscopic images. Quick Start in 30 seconds se

39 Dec 19, 2022

Full-featured Decision Trees and Random Forests learner.

CID3 This is a full-featured Decision Trees and Random Forests learner. It can save trees or forests to disk for later use. It is possible to query tr

3 Aug 15, 2022

Repository containing detailed experiments related to the paper "Memotion Analysis through the Lens of Joint Embedding".

Memotion Analysis Through The Lens Of Joint Embedding This repository contains the experiments conducted as described in the paper 'Memotion Analysis

1 Mar 16, 2022

This repository is maintained for the scientific paper tittled " Study of keyword extraction techniques for Electric Double Layer Capacitor domain using text similarity indexes: An experimental analysis "

kwd-extraction-study This repository is maintained for the scientific paper tittled " Study of keyword extraction techniques for Electric Double Layer

1 Dec 5, 2022

Stochastic gradient descent with model building

Stochastic Model Building (SMB) This repository includes a new fast and robust stochastic optimization algorithm for training deep learning models. Th

22 Jan 19, 2022

Python Probabilistic-data-analysis Resources

Python probabilistic-data-analysis Libraries

An open source utility for creating publication quality LaTex figures generated from OpenFOAM data files.

Data Applications Project

Sheet Data Image/PDF-to-CSV Converter

Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

A single model for shaping, creating, accessing, storing data within a Database

Image classification for projects and researches

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble

Single cell current best practices tutorial case study for the paper:Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

How to use TensorLayer

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

This project uses Youtube data API's to do youtube tags analysis based on viewCount, comments etc.

zeus is a Python implementation of the Ensemble Slice Sampling method.

Source code and notebooks to reproduce experiments and benchmarks on Bias Faces in the Wild (BFW).

Python module for performing linear regression for data with measurement errors and intrinsic scatter

Projeto: Machine Learning: Linguagens de Programacao 2004-2001

Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.

Tutorials, assignments, and competitions for MIT Deep Learning related courses.

Python solutions to solve practical business problems.

Python Data Science Handbook: full text in Jupyter Notebooks

Design and build a wrapper for the Open Weather API current weather data service

Optimal Randomized Canonical Correlation Analysis

Empyrial is a Python-based open-source quantitative investment library dedicated to financial institutions and retail investors

Clinica is a software platform for clinical research studies involving patients with neurological and psychiatric diseases and the acquisition of multimodal data

MRQy is a quality assurance and checking tool for quantitative assessment of magnetic resonance imaging (MRI) data.

Python Automated Machine Learning library for tabular data.

Data Version Control or DVC is an open-source tool for data science and machine learning projects

A machine learning model for analyzing text for user sentiment and determine whether its a positive, neutral, or negative review.

Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.

Introducing neural networks to predict stock prices

PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

An interactive dashboard for visualisation, integration and classification of data using Active Learning.

Full automated data pipeline using docker images

For specific function. For my own convenience. Remind owner to share data to another DITO user.

Building house price data pipelines with Apache Beam and Spark on GCP

topic modeling on unstructured data in Space news articles retrieved from the Guardian (UK) newspaper using API

Real-time analysis of intracranial neurophysiology recordings.

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

This project uses unsupervised machine learning to identify correlations between daily inoculation rates in the USA and twitter sentiment in regards to COVID-19.

Accelerating model creation and evaluation.

This Repository consists of my solutions in Python 3 to various problems in Data Structures and Algorithms

Conversational text Analysis using various NLP techniques

Analyse japanese ebooks using MeCab to determine the difficulty level for japanese learners

BESS: Balanced Evolutionary Semi-Stacking for Disease Detection via Partially Labeled Imbalanced Tongue Data

go-cqhttp API typing annoations, return data models and utils for nonebot

Herramienta para transferir eventos de Sucuri WAF hacia Azure Data Tables.

An easy-to-use library for emulating code in minidump files.

FakeDataGen is a Full Valid Fake Data Generator.

LinkML based SPARQL template library and execution engine

Convert your JSON data to a valid Python object to allow accessing keys with the member access operator(.)

A benchmark of data-centric tasks from across the machine learning lifecycle.

Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.

Official PyTorch implementation for "Low Precision Decentralized Distributed Training with Heterogenous Data"

Machine Learning Privacy Meter: A tool to quantify the privacy risks of machine learning models with respect to inference attacks, notably membership inference attacks

Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA results for single-image motion deblurring, image deraining, image denoising (synthetic and real data), and dual-pixel defocus deblurring.

Self-attentive task GAN for space domain awareness data augmentation.

A multi-platform GUI for bit-based analysis, processing, and visualization

Render tokei's output to interactive sunburst chart.

This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python

A number of methods in order to perform Natural Language Processing on live data derived from Twitter

A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes".

Project issue to website data transformation toolkit

PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short-Term Transformer for Online Action Detection".

Simulation code and tutorial for BBHnet training data

Clip Bing Maps backgound as RGB geotif image using center-point from vector data of a shapefile and Bing Maps zoom

PyTorch Kafka Dataset: A definition of a dataset to get training data from Kafka.

GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data

Training, generation, and analysis code for Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics

Code and data for paper "Deep Photo Style Transfer"

A data preprocessing and feature engineering script for a machine learning pipeline is prepared.

Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared

Sentiment Analysis Project using Count Vectorizer and TF-IDF Vectorizer

The repository forked from NVlabs uses our data. (Differentiable rasterization applied to 3D model simplification tasks)

Extract rooms type, door, neibour rooms, rooms corners nad bounding boxes, and generate graph from rplan dataset

A Semi-Intelligent ChatBot filled with statistical and economical data for the Premier League.

Data Utilities e.g. for importing files to onetask

Finding Label and Model Errors in Perception Data With Learned Observation Assertions

🐦 Quickly annotate data from the comfort of your Jupyter notebook