560 Repositories
Python series-datasets Libraries
The Spectral Diagram (SD) is a new tool for the comparison of time series in the frequency domain
The Spectral Diagram (SD) is a new tool for the comparison of time series in the frequency domain. The SD provides a novel way to display the coherence function, power, amplitude, phase, and skill score of discrete frequencies of two time series. Each SD summarises these quantities in a single plot for multiple targeted frequencies.
Code and datasets for TPAMI 2021
SkeletonNet This repository contains the code and the ShapeNetV1-Surface-Skeleton, ShapNetV1-SkeletalVolume, and 2D image datasets ShapeNetRendering. Plea
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning This repository contains the code for our ICCV 202
ObjTables: Tools for creating and reusing high-quality spreadsheets
ObjTables: Tools for creating and reusing high-quality spreadsheets ObjTables is a toolkit which makes it easy to use spreadsheets (e.g., XLSX workboo
Replication Package for "An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Datasets"
Replication Package for "An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Data
Code for "Long Range Probabilistic Forecasting in Time-Series using High Order Statistics"
Long Range Probabilistic Forecasting in Time-Series using High Order Statistics This is the code produced as part of the paper Long Range Probabilisti
Source code for the paper: Variance-Aware Machine Translation Test Sets (NeurIPS 2021 Datasets and Benchmarks Track)
Variance-Aware-MT-Test-Sets Variance-Aware Machine Translation Test Sets License See LICENSE. We follow the same data licensing plan as the WMT
Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning
Understanding the Effects of Datasets Characteristics on Offline Reinforcement Learning Kajetan Schweighofer1, Markus Hofmarcher1, Marius-Constantin D
Advanced Python series: data classes, OOP, Python
Working With Pydantic - Built-in Data Process ========================== Normal way to process data (reading JSON file): the normal principle, it's f
An implementation of a discriminant function over a normal distribution to help classify datasets.
CS4044D Machine Learning Assignment 1 By Dev Sony, B180297CS The question, report and source code can be found here. Github Repo Solution 1 Based on t
face2comics by Sxela (Alex Spirin) - face2comics datasets
This is a paired face to comics dataset, which can be used to train pix2pix or similar networks.
A non-linear, non-parametric Machine Learning method capable of modeling complex datasets
Fast Symbolic Regression Symbolic Regression is a non-linear, non-parametric Machine Learning method capable of modeling complex data sets. fastsr aim
Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach
Introduction Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach Datasets: WebFG-496
Python library that extracts market data from Bacen (time series)
Pybacen This library was developed for economic analysis in the Brazilian scenario (Investments, micro and macroeconomic indicators) Installation Inst
An unofficial personal implementation of UM-Adapt, specifically to tackle joint estimation of panoptic segmentation and depth prediction for autonomous driving datasets.
Semisupervised Multitask Learning This repository is an unofficial and slightly modified implementation of UM-Adapt[1] using PyTorch. This code primar
Python tools for querying and manipulating BIDS datasets.
PyBIDS is a Python library to centralize interactions with datasets conforming to the BIDS (Brain Imaging Data Structure) format.
LSTM Neural Networks for Spectroscopic Studies of Type Ia Supernovae
Package Description The difficulties in acquiring spectroscopic data have been a major challenge for supernova surveys. snlstm is developed to provide
RLDS stands for Reinforcement Learning Datasets
RLDS RLDS stands for Reinforcement Learning Datasets and it is an ecosystem of tools to store, retrieve and manipulate episodic data in the context of
A method to perform unsupervised cross-region adaptation of crop classifiers trained with satellite image time series.
TimeMatch Official source code of TimeMatch: Unsupervised Cross-region Adaptation by Temporal Shift Estimation by Joachim Nyborg, Charlotte Pelletier,
[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets
[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets Introduction This repo contains the source code accompanying the paper: Well-tuned Sim
A collection of Scikit-Learn compatible time series transformers and tools.
tsfeast A collection of Scikit-Learn compatible time series transformers and tools. Installation Create a virtual environment and install: From PyPi p
Companion repo of the UCC 2021 paper "Predictive Auto-scaling with OpenStack Monasca"
Predictive Auto-scaling with OpenStack Monasca Giacomo Lanciano*, Filippo Galli, Tommaso Cucinotta, Davide Bacciu, Andrea Passarella 2021 IEEE/ACM 14t
This repository has datasets containing information on Uber pickups in NYC from April 2014 to September 2014 and from January to June 2015. Data analysis, visualization, and some insights are gathered here.
uber-pickups-analysis Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city Information about data set The dataset contain
NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.
Active Learning demo using two small datasets
ActiveLearningDemo How to run step one put the dataset folder and use command below to split the dataset to the required structure run utils.py For ea
Time Series Forecasting with Temporal Fusion Transformer in Pytorch
Forecasting with the Temporal Fusion Transformer Multi-horizon forecasting often contains a complex mix of inputs – including static (i.e. time-invari
STS Benchmark comprises a selection of the English datasets used in the STS tasks organized in the context of SemEval between 2012 and 2017. The selection of datasets includes text from image captions, news headlines, and user forums.
stsb_multi_mt_en STS Benchmark comprises a selection of the English datasets used in the STS tasks organized in the context of SemEval between 2012 an
Instant search for and access to many datasets in Pyspark.
SparkDataset Provides instant access to many datasets right from Pyspark (in Spark DataFrame structure). Drop a star if you like the project. 😃 Motiv
Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation
deeptime Releases: Installation via conda recommended. conda install -c conda-forge deeptime pip install deeptime Documentation: deeptime-ml.github.io
TAug :: Time Series Data Augmentation using Deep Generative Models
TAug :: Time Series Data Augmentation using Deep Generative Models Note!!! The package is under development, so be careful about using it in production! Fea
This repository is the official implementation of Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models
Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models Link to paper Abstract We study prediction of future out
A tutorial for people to create synthetic data replicas from source healthcare datasets
Synthetic-Data-Replica-for-Healthcare Description What is this? A tailored hands-on tutorial showing how to use Python to create synthetic data replic
Official Datasets and Implementation from our Paper "Video Class Agnostic Segmentation in Autonomous Driving".
Video Class Agnostic Segmentation [Method Paper] [Benchmark Paper] [Project] [Demo] Official Datasets and Implementation from our Paper "Video Class A
Price forecasting of SGB and IRFC Bonds and comparing their returns
Project_Bonds Project Title : Price forecasting of SGB and IRFC Bonds and comparing their returns. Introduction of the Project The 2008-09 global fina
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
Autoformer (NeurIPS 2021) Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting Time series forecasting is a c
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data
We propose a new approach to detecting anomalies in mobile robot data. We analyze each dataset separately with two clustering methods, hierarchical and k-means. Two sub-methods are used to produce anomaly scores; we then merge the two scores into a single merged anomaly score.
Dynamic causal Bayesian optimisation
Dynamic Causal Bayesian Optimization This is a Python implementation of Dynamic Causal Bayesian Optimization as presented at NeurIPS 2021. Abstract Th
YOLOv5 Series Multi-backbone, Pruning and quantization Compression Tool Box.
YOLOv5-Compression Update News Requirements Environment setup pip install -r requirements.txt Evaluation metric Visdrone Model mAP mAP@50 Parameters(M) GFLOPs FPS@
HW 2: Visualizing interesting datasets
HW 2: Visualizing interesting datasets Check out the project instructions here! Mean Earnings per Hour for Males and Females My first graph uses data
Asterisk is a framework to generate high-quality training datasets at scale
PyTorch Autoencoders - Implementing a Variational Autoencoder (VAE) Series in Pytorch.
PyTorch Autoencoders Implementing a Variational Autoencoder (VAE) Series in Pytorch. Inspired by this repository Model List check model paper conferen
A dataset handling library for computer vision datasets in LOST format
HM02: Visualizing Interesting Datasets
HM02: Visualizing Interesting Datasets This is a homework assignment for CSCI 40 class at Claremont McKenna College. Go to the project page to learn m
This repository contains code demonstrating the methods outlined in Path Signature Area-Based Causal Discovery in Coupled Time Series presented at Causal Analysis Workshop 2021.
signed-area-causal-inference This repository contains code demonstrating the methods outlined in Path Signature Area-Based Causal Discovery in Coupled
One-Stop Destination for codes of all Data Structures & Algorithms
CodingSimplified_GK This repository is aimed at creating a One stop Destination of codes of all Data structures and Algorithms along with basic explai
Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."
Spacetimeformer Multivariate Forecasting This repository contains the code for the paper, "Long-Range Transformers for Dynamic Spatiotemporal Forecast
Multivariate Time Series Transformer, public version
Multivariate Time Series Transformer Framework This code corresponds to the paper: George Zerveas et al. A Transformer-based Framework for Multivariat
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
Efficient Training of Visual Transformers with Small Datasets
Official code for "Efficient Training of Visual Transformers with Small Datasets", NeurIPS 2021.
Repo for "Physion: Evaluating Physical Prediction from Vision in Humans and Machines" submission to NeurIPS 2021 (Datasets & Benchmarks track)
Physion: Evaluating Physical Prediction from Vision in Humans and Machines This repo contains code and data to reproduce the results in our paper, Phy
Warren - Stock Price Predictor
Web app to predict closing stock prices in real time using Facebook's Prophet time series algorithm with a multi-variate, single-step time series forecasting strategy.
an elegant datasets factory
rawbuilder an elegant datasets factory Free software: MIT license Documentation: https://rawbuilder.readthedocs.io. Features Schema oriented datasets
Extremely simple and fast extreme multi-class and multi-label classifiers.
napkinXC napkinXC is an extremely simple and fast library for extreme multi-class and multi-label classification that focuses on implementing various m
Compares various time-series feature sets on computational performance, within-set structure, and between-set relationships.
feature-set-comp Compares various time-series feature sets on computational performance, within-set structure, and between-set relationships. Reposito
Code and datasets for the paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction"
KnowPrompt Code and datasets for our paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction" Requireme
Pytorch implementation for "Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets" (ECCV 2020 Spotlight)
Distribution-Balanced Loss [Paper] The implementation of our paper Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets (
QuakeLabeler is a Python package to create and manage your seismic training data, processes, and visualization in a single place — so you can focus on building the next big thing.
QuakeLabeler Quake Labeler was born from the need for seismologists and developers who are not AI specialists to easily, quickly, and independently bu
Nixtla is an open-source time series forecasting library.
Nixtla Nixtla is an open-source time series forecasting library. We are helping data scientists and developers to have access to open source state-of-
PyEmits, a python package for easy manipulation in time-series data.
PyEmits, a python package for easy manipulation in time-series data. Time-series data is very common in real life. Engineering FSI industry (Financial
The "breathing k-means" algorithm with datasets and example notebooks
The Breathing K-Means Algorithm (with examples) The Breathing K-Means is an approximation algorithm for the k-means problem that (on average) is bette
Codebase for Time-series Generative Adversarial Networks (TimeGAN)
Pytorch implementation of the paper Time-series Generative Adversarial Networks
TimeGAN-pytorch Pytorch implementation of the paper Time-series Generative Adversarial Networks presented at NeurIPS'19. Jinsung Yoon, Daniel Jarrett
A rule learning algorithm for the deduction of syndrome definitions from time series data.
README This project provides a rule learning algorithm for the deduction of syndrome definitions from time series data. Large parts of the algorithm a
Glue is a python project to link visualizations of scientific datasets across many files.
Glue Glue is a python project to link visualizations of scientific datasets across many files. Click on the image for a quick demo: Features Interacti
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Tensor2Tensor Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and ac
The first online catalogue for Arabic NLP datasets.
Masader The first online catalogue for Arabic NLP datasets. This catalogue contains 200 datasets with more than 25 metadata annotations for each datas
An implementation of the [Hierarchical (Sig-Wasserstein) GAN] algorithm for large dimensional Time Series Generation
Hierarchical GAN for large dimensional financial market data Implementation This repository is an implementation of the [Hierarchical (Sig-Wasserstein
MaD GUI is a basis for graphical annotation and computational analysis of time series data.
MaD GUI Machine Learning and Data Analytics Graphical User Interface MaD GUI is a basis for graphical annotation and computational analysis of time se
Benchmark datasets, data loaders, and evaluators for graph machine learning
Overview The Open Graph Benchmark (OGB) is a collection of benchmark datasets, data loaders, and evaluators for graph machine learning. Datasets cover
Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers.
ConditionalQA Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers. Disclaimer This dataset
PSML: A Multi-scale Time-series Dataset for Machine Learning in Decarbonized Energy Grids
PSML: A Multi-scale Time-series Dataset for Machine Learning in Decarbonized Energy Grids The electric grid is a key enabling infrastructure for the a
Dynamical Wasserstein Barycenters for Time Series Modeling
Dynamical Wasserstein Barycenters for Time Series Modeling This is the code for the Dynamical Wasserstein Barycenter model published in Neurip
Python Package for DataHerb: create, search, and load datasets.
The Python Package for DataHerb A DataHerb Core Service to Create and Load Datasets.
HyperSpy is an open source Python library for the interactive analysis of multidimensional datasets
HyperSpy is an open source Python library for the interactive analysis of multidimensional datasets that can be described as multidimensional arrays o
TCube generates rich and fluent narratives that describe the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.
TCube: Domain-Agnostic Neural Time series Narration This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narrat
Raindrop strategy for Irregular time series
Graph-Guided Network For Irregularly Sampled Multivariate Time Series Overview This repository contains processed datasets and implementation code for
Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database.
MIMIC-III Benchmarks Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database. Currently, the benchmark data
Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch with a familiar Pandas-compatible API.
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
A set of examples around hub for creating and processing datasets
Examples for Hub - Dataset Format for AI A repository showcasing examples of using Hub Uploading Dataset Places365 Colab Tutorials Notebook Link Getti
TorchXRayVision: A library of chest X-ray datasets and models.
torchxrayvision A library for chest X-ray datasets and models. Including pre-trained models. ( 🎬 promo video about the project) Motivation: While the
A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approaches for achieving this in this repo.
multitask-learning-transformers A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You
C++ Implementation of PyTorch Tutorials for Everyone
C++ Implementation of PyTorch Tutorials for Everyone OS (Compiler)\LibTorch 1.9.0 macOS (clang 10.0, 11.0, 12.0) Linux (gcc 8, 9, 10, 11) Windows (msv
The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting".
IGMTF The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting". Requirements The framework
Merlion: A Machine Learning Framework for Time Series Intelligence
Merlion: A Machine Learning Library for Time Series Table of Contents Introduction Installation Documentation Getting Started Anomaly Detection Foreca
Create image datasets quickly and easily using Reddit
Reddit-Image-Scraper Reddit Reddit is an American social news aggregation, web content rating, and discussion website. Reddit has been divided by topi
Merlion: A Machine Learning Framework for Time Series Intelligence
Merlion is a Python library for time series intelligence. It provides an end-to-end machine learning framework that includes loading and transforming data, building and training models, post-processing model outputs, and evaluating model performance. I
The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.
SCINet This is the original PyTorch implementation of the following work: Time Series is a Special Sequence: Forecasting with Sample Convolution and I
A Tensorflow based library for Time Series Modelling with Gaussian Processes
Markovflow Documentation | Tutorials | API reference | Slack What does Markovflow do? Markovflow is a Python library for time-series analysis via prob
The tool to make NLP datasets ready to use
chazutsu photo from Kaikado, traditional Japanese chazutsu maker chazutsu is the dataset downloader for NLP. import chazutsu r = chazutsu.data
TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.
S3-plugin is a high performance PyTorch dataset library to efficiently access datasets stored in S3 buckets.
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless. This is the official Roboflow python package that interfaces with the Roboflow API.
An open-source Python project series where beginners can contribute and practice coding.
Python Mini Projects A collection of easy Python small projects to help you improve your programming skills. Table Of Contents Aim Of The Project Cont
Here, I find the Fibonacci Series using python
Fibonacci-Series-using-python Here, I find the Fibonacci Series using python Requirements No Special Requirements Contribution I have strong belief on
Calculates carbon footprint based on the fuel mix and discharge profile at the selected utility. Can create graphs and tabular output for the fuel mix based on an input file containing a series of power draw values over a period of time.
carbon-footprint-calculator Conda distribution ~/anaconda3/bin/conda install anaconda-client conda-build ~/anaconda3/bin/conda config --set anaconda_u
Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.
time-series-kafka-demo Mock stream producer for time series data using Kafka. I walk through this tutorial and others here on GitHub and on my Medium
Simple integer-valued time series bit packing
Smahat encodes a sequence of integer values using a fixed number of bits per value (the same for all values), chosen to be minimal for the data range. For example, a series of boolean values needs only one bit per value, a series of integer percentages needs 7 bits per value, and so on.
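The bit counts quoted above follow from the size of the value range: a range of 0..1 needs 1 bit and a range of 0..100 needs 7 bits. The sketch below illustrates that idea of fixed-width packing in plain Python. It is not Smahat's API or storage format; the function names `bits_needed`, `pack`, and `unpack` are hypothetical, and the big-endian layout is an assumption made for the example.

```python
def bits_needed(lo: int, hi: int) -> int:
    """Smallest fixed bit width that can hold every integer in [lo, hi]."""
    span = hi - lo  # values are stored as offsets from lo
    return max(1, span.bit_length())


def pack(values, lo: int, hi: int) -> bytes:
    """Pack integers into bytes using the same minimal width for each value."""
    width = bits_needed(lo, hi)
    acc = 0
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"{v} is outside the range [{lo}, {hi}]")
        acc = (acc << width) | (v - lo)  # append the offset at the low end
    n_bytes = (len(values) * width + 7) // 8  # round up to whole bytes
    return acc.to_bytes(n_bytes, "big")


def unpack(data: bytes, count: int, lo: int, hi: int) -> list:
    """Recover `count` integers written by pack() with the same range."""
    width = bits_needed(lo, hi)
    acc = int.from_bytes(data, "big")
    mask = (1 << width) - 1
    out = []
    for i in range(count):
        shift = (count - 1 - i) * width  # first value sits in the highest bits
        out.append(((acc >> shift) & mask) + lo)
    return out


percentages = [0, 7, 50, 100]
packed = pack(percentages, 0, 100)
restored = unpack(packed, len(percentages), 0, 100)
```

Here four percentages take 4 x 7 = 28 bits, which fit in 4 bytes. The number of values has to be stored or known separately, since the padding bits in the byte string are indistinguishable from data.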
Source code from thenewboston Discord Bot with Python tutorial series.
Project Setup Follow the steps below to set up the project on your environment. Local Development Create a virtual environment with Python 3.7 or high
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Combating Embedding Barrier in Multilingual Models for Low-Resource Language Understanding".
BanglaBERT This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced i