Code for "Long Range Probabilistic Forecasting in Time-Series using High Order Statistics"

Last update: Dec 6, 2022


Overview

This repository contains the code produced as part of the paper "Long Range Probabilistic Forecasting in Time-Series using High Order Statistics" by Prathamesh Deshpande and Sunita Sarawagi (arXiv:2111.03394v1).

How to work with Command Line Arguments?

If an optional argument is not passed, its value is taken from the configuration specified in main.py (selected by dataset_name and model_name).
If a valid value is passed on the command line, the code uses it and ignores the value assigned in the configuration.
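
The fallback rule above can be sketched with `argparse`. This is a minimal illustration, not the repository's actual parser: the `--name` flag spelling, the `CONFIG_DEFAULTS` values, and the helper names `build_parser` and `resolve_args` are all assumptions made for the example.

```python
import argparse

# Hypothetical per-(dataset, model) defaults; the real values live in main.py.
CONFIG_DEFAULTS = {"N_input": 336, "N_output": 168, "epochs": 20, "learning_rate": 1e-3}


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("dataset_name", type=str)
    # Sentinel defaults (-1, -1.0) mean "not passed on the command line".
    parser.add_argument("--N_input", type=int, default=-1)
    parser.add_argument("--N_output", type=int, default=-1)
    parser.add_argument("--epochs", type=int, default=-1)
    parser.add_argument("--learning_rate", type=float, default=-1.0)
    return parser


def resolve_args(args, config):
    """Keep valid command-line values; fall back to the config otherwise."""
    resolved = dict(vars(args))
    for name, config_value in config.items():
        cli_value = resolved.get(name)
        if cli_value is None or cli_value <= 0:
            resolved[name] = config_value
    return resolved


# Only --epochs is passed, so every other numeric setting comes from the config.
args = build_parser().parse_args(["ett", "--epochs", "5"])
settings = resolve_args(args, CONFIG_DEFAULTS)
print(settings)
```

Using an invalid sentinel as the parser default is what lets the code tell "not passed" apart from a real value, which matches the -1 and None defaults listed in the argument table.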

Command Line Arguments Information

| Argument name | Type | Valid assignments | Default |
|---|---|---|---|
| dataset_name | str | azure, ett, etthourly, Solar, taxi30min, Traffic911 | positional argument |
| saved_models_dir | str | - | None |
| output_dir | str | - | None |
| N_input | int | >0 | -1 |
| N_output | int | >0 | -1 |
| epochs | int | >0 | -1 |
| normalize | str | same, zscore_per_series, gaussian_copula, log | None |
| learning_rate | float | >0 | -1.0 |
| hidden_size | int | >0 | -1 |
| num_grulstm_layers | int | >0 | -1 |
| batch_size | int | >0 | -1 |
| v_dim | int | >0 | -1 |
| t2v_type | str | local, idx, mdh_lincomb, mdh_parti | None |
| K_list | [int, ..., int] | [>0, ..., >0] | [] |
| device | str | - | None |

Datasets

All the datasets can be found here.

Place the dataset files/directories in the data directory before running the code.

Output files

Targets and Forecasts

The following output files are stored in the `<output_dir>/<dataset_name>/` directory.

| File name | Description |
|---|---|
| inputs.npy | Test input values, size: `number of time-series x N_input` |
| targets.npy | Test target/ground-truth values, size: `number of time-series x N_output` |
| `<model_name>_pred_mu.npy` | Mean forecast values, size: `number of time-series x number of time-steps` |
| `<model_name>_pred_std.npy` | Standard deviation of forecast values, size: `number of time-series x number of time-steps` |
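
As a rough sketch of how these files might be consumed, the snippet below loads the arrays with NumPy and builds a per-step interval from the mean and standard deviation. Treating each step as Gaussian is an assumption made for illustration, and `load_forecasts`, `gaussian_interval`, and the `demo_model` name are hypothetical; the demo writes synthetic arrays to a temporary directory so it runs without the real outputs.

```python
import os
import tempfile

import numpy as np


def load_forecasts(output_dir, dataset_name, model_name):
    """Load targets plus one model's mean and std forecasts."""
    base = os.path.join(output_dir, dataset_name)
    targets = np.load(os.path.join(base, "targets.npy"))
    mu = np.load(os.path.join(base, f"{model_name}_pred_mu.npy"))
    std = np.load(os.path.join(base, f"{model_name}_pred_std.npy"))
    return targets, mu, std


def gaussian_interval(mu, std, z=1.96):
    """Symmetric interval around the mean (assumes Gaussian marginals)."""
    return mu - z * std, mu + z * std


# Synthetic stand-in for a real output directory.
with tempfile.TemporaryDirectory() as output_dir:
    base = os.path.join(output_dir, "ett")
    os.makedirs(base)
    rng = np.random.default_rng(0)
    np.save(os.path.join(base, "targets.npy"), rng.normal(size=(4, 6)))
    np.save(os.path.join(base, "demo_model_pred_mu.npy"), np.zeros((4, 6)))
    np.save(os.path.join(base, "demo_model_pred_std.npy"), np.ones((4, 6)))
    targets, mu, std = load_forecasts(output_dir, "ett", "demo_model")

lower, upper = gaussian_interval(mu, std)
# Fraction of target values that fall inside the interval.
coverage = float(np.mean((targets >= lower) & (targets <= upper)))
print(mu.shape, coverage)
```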

Metrics

All the evaluation metrics on test data are stored in <output_dir>/results_<dataset_name>.json in the following format:

{
  <model_name1>: 
    {
      'crps':<crps>,
      'mae':<mae>,
      'mse':<mse>,
      'smape':<smape>,
      'dtw':<dtw>,
      'tdi':<tdi>,
    }
  <model_name2>: 
    {
      'crps':<crps>,
      'mae':<mae>,
      'mse':<mse>,
      'smape':<smape>,
      'dtw':<dtw>,
      'tdi':<tdi>,
    }
    .
    .
    .
}

Here <model_name1>, <model_name2>, ... are different models under consideration.
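
A small sketch of comparing models from such a results file follows. The model names and numbers are synthetic, `rank_models` is a hypothetical helper, and the ranking assumes that a lower value is better for the chosen metric, which holds for error measures such as CRPS, MAE, and MSE.

```python
import json


def rank_models(results, metric):
    """Return model names sorted from lowest to highest value of `metric`."""
    return sorted(results, key=lambda name: results[name][metric])


# Synthetic content shaped like results_<dataset_name>.json.
results_text = json.dumps({
    "model_a": {"crps": 0.31, "mae": 0.44, "mse": 0.52},
    "model_b": {"crps": 0.27, "mae": 0.46, "mse": 0.49},
})
results = json.loads(results_text)

for metric in ("crps", "mae", "mse"):
    print(metric, rank_models(results, metric))
```

With a real run, `json.load` on the opened results file would replace the synthetic string.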

Comments

Forecast method
As per my understanding of your paper ('Long Range Probabilistic Forecasting in Time-Series using High Order Statistics'), you take multiple (2) averages of the original series, and the Informer model is then trained independently on these averages as well as on the original series. I have a couple of questions here.

How many Informer models are trained? That is, if we have 3 time series (2 aggregate and 1 original), is one Informer model trained on all 3 series independently, or does each series get its own Informer model?

Once the model is trained, a new multivariate Gaussian is defined with a mean and covariance. How exactly is this mean calculated?
opened by HrudayR 3
Errors running script.sh
This project looks exciting, but I'm having trouble running it. I don't see a requirements.txt or environment.yml, so I did my best to create one.

It looks like there might be some missing files from the project too:

In base_model.py:

from models import transformer_manual_attn, transformer_dual_attn
ImportError: cannot import name 'transformer_manual_attn' from 'models' (unknown location)

Thanks for the help
opened by gdevos010 1
During training, use dilate_loss for early stopping
When the base model is DILATE, compute dilate_loss(..) on validation data during training.

Use a separate early-stopping metric for each model, chosen according to suitability.
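
One way to keep the stopping logic independent of the metric is a small patience counter that accepts any scalar validation value, so a DILATE model could feed it the validation dilate_loss while an MSE model feeds it validation MSE. The `EarlyStopper` class and the loss values below are hypothetical and only illustrate the idea.

```python
class EarlyStopper:
    """Stop when the validation metric has not improved for `patience` epochs."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_metric):
        """Record one validation value; return True when training should stop."""
        if val_metric < self.best:
            self.best = val_metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation values; in practice these would come from the
# model-specific metric evaluated on validation data after each epoch.
val_history = [1.0, 0.8, 0.9, 0.85, 0.95]
stopper = EarlyStopper(patience=2)
stopped_at = None
for epoch, value in enumerate(val_history):
    if stopper.step(value):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```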

TODOS
opened by pratham16cse 1
Need better way to set stride while creating batches for training
Possible options:

Manually set stride for each dataset such that the model can see maximum possible patterns of input-output sequences in the training data.

Add stochasticity to the stride. For example, if the stride is s, shift the window by a draw from U(s-r, s+r), where r is the width of the random jitter.
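
The second option can be sketched as follows. The function name `window_starts` and its parameters are hypothetical, and the shift is drawn from the integers in [s-r, s+r], a discrete stand-in for U(s-r, s+r).

```python
import random


def window_starts(series_len, n_input, n_output, stride, jitter=0, seed=None):
    """Start indices of training windows of length n_input + n_output."""
    rng = random.Random(seed)
    window = n_input + n_output
    starts, pos = [], 0
    while pos + window <= series_len:
        starts.append(pos)
        # Shift by an integer drawn uniformly from [stride - jitter, stride + jitter],
        # never moving by less than one step.
        pos += max(1, rng.randint(stride - jitter, stride + jitter))
    return starts


fixed = window_starts(20, n_input=4, n_output=2, stride=3)
jittered = window_starts(200, n_input=4, n_output=2, stride=5, jitter=2, seed=0)
print(fixed)
print(jittered[:5])
```

With `jitter=0` this reduces to a fixed stride, so the same function covers both options; redrawing the starts each epoch would expose the model to different input-output alignments.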

TODOS
opened by pratham16cse 1
Train base model in teacher-forcing mode
The model also makes sequential predictions during training.

This might be the reason for the poor performance of DILATE and MSE on the Taxi dataset, as the forecast horizon at the bottom level is quite large.

Add a teacher-forcing mode of training to better learn the parameters of the base model at the bottom level.
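
The difference between the two modes can be shown with a toy one-step predictor. This is only a sketch of the general technique: `rollout` and `step_fn` are hypothetical, and the repository's base model is a neural network rather than this hand-written function.

```python
def rollout(step_fn, history, targets, teacher_forcing):
    """Predict len(targets) steps, one at a time.

    With teacher forcing, each step is conditioned on the ground-truth
    previous value; otherwise it is conditioned on the model's own
    previous prediction.
    """
    preds, prev = [], history[-1]
    for target in targets:
        pred = step_fn(prev)
        preds.append(pred)
        prev = target if teacher_forcing else pred
    return preds


def step_fn(x):
    # Deliberately imperfect toy predictor.
    return 0.5 * x + 1.0


history, targets = [2.0], [3.0, 4.0, 5.0]
free_running = rollout(step_fn, history, targets, teacher_forcing=False)
teacher_forced = rollout(step_fn, history, targets, teacher_forcing=True)
print(free_running)    # conditioned on own predictions
print(teacher_forced)  # conditioned on ground truth
```

In free-running mode the errors of early steps feed into later steps, which is why long horizons can make training harder.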

TODOS
opened by pratham16cse 1
readability updates
updates:

commented out transformer_manual_attn, transformer_dual_attn

added log formatting for readability

refactored arg parsing to functions

added missing headers to csv

It's a bit hard to follow the flow of the repo. main.py is very long, and some refactoring could go a long way.
opened by gdevos010 0
Parser for Datasets of Gaussian Copula Paper
In both train.json and test.json, each line is a time-series.

In test.json, each series can be treated as an independent sequence.

[ ] Context length needs to be limited for both train and test.

[ ] For wiki, find out why initial ~9000 train and test sequences overlap.

[ ] For selecting only 2000 sequences for wiki, how to ensure same 2000 sequences get selected from both train and test?

[ ] Create this pre-processing for Solar, Wiki, and Exchange-rate.

enhancement
opened by pratham16cse 0
Use K as a hyper-parameter
[x] Understand how to perform hyper-parameter tuning in PyTorch.

[x] Get multiple solutions using multiple values of K.

[x] Select the K that performs best on validation data.
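
A minimal sketch of these steps, under two assumptions that are not confirmed by the repository: aggregates are taken as means over non-overlapping windows of size K, and the validation score is an error where lower is better. The helper names and the scores are hypothetical.

```python
import numpy as np


def aggregate_mean(series, K):
    """Mean over non-overlapping windows of size K (trailing remainder dropped)."""
    series = np.asarray(series, dtype=float)
    usable = (len(series) // K) * K
    return series[:usable].reshape(-1, K).mean(axis=1)


def select_best_k(val_scores):
    """Pick the K with the lowest validation error."""
    return min(val_scores, key=val_scores.get)


print(aggregate_mean([1, 2, 3, 4, 5, 6, 7], K=3))

# Made-up validation errors, one per candidate K in K_list.
val_scores = {1: 0.42, 6: 0.37, 12: 0.40}
best_k = select_best_k(val_scores)
print(best_k)
```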
opened by pratham16cse 0
Adjust Model Complexity according to value of K
[ ] Explore the heuristics to reduce model complexity as value of K increases.

[ ] If the aggregate itself tracks multiple values, increase the complexity accordingly?

TODOS
opened by pratham16cse 0
Normalization added
Added flag "normalize"

Normalization occurs at each level

Modified inference models to handle normalization

Modified training and evaluation to handle normalization
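
For illustration, a per-series z-score of the kind the `zscore_per_series` option name suggests could look like the sketch below. The function names are hypothetical, and computing the statistics from each series' input window is an assumption. The same fit, apply, and invert steps would be repeated at each aggregation level.

```python
import numpy as np


def zscore_fit(inputs, eps=1e-8):
    """Per-series mean and std, computed along the time axis."""
    mean = inputs.mean(axis=1, keepdims=True)
    std = inputs.std(axis=1, keepdims=True) + eps  # eps guards constant series
    return mean, std


def zscore_apply(x, mean, std):
    return (x - mean) / std


def zscore_invert(z, mean, std):
    """Map normalized forecasts back to the original scale."""
    return z * std + mean


x = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
mean, std = zscore_fit(x)
z = zscore_apply(x, mean, std)
restored = zscore_invert(z, mean, std)
print(z)
```

Forecast standard deviations would need to be multiplied by `std` only, without adding the mean, when mapping them back.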
opened by pratham16cse 0
Taxi data possibilities
Consider all zones irrespective of number of pickups -- If a zone does not have a pickup in an hour, the value will be zero

Consider only zones that have at least one pickup per hour

Only consider top k zones according to number of pickups over the duration

[ ] If required, get data for more months as well
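
The second and third options can be sketched with NumPy, assuming the pickups are arranged as a `zones x hours` count matrix. The function names and the small matrix are hypothetical.

```python
import numpy as np


def top_k_zones(pickups, k):
    """Indices of the k zones with the most pickups over the whole duration."""
    totals = pickups.sum(axis=1)
    return np.argsort(totals)[::-1][:k]


def always_active_zones(pickups):
    """Indices of zones with at least one pickup in every hour."""
    return np.flatnonzero((pickups > 0).all(axis=1))


# Rows are zones, columns are hours; zeros mark hours without pickups.
pickups = np.array([
    [0, 1, 0],
    [5, 4, 6],
    [1, 1, 1],
    [2, 0, 3],
])
print(top_k_zones(pickups, k=2))
print(always_active_zones(pickups))
```

The first option corresponds to using the full matrix as-is, zeros included.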
opened by pratham16cse 0

Owner

GitHub https://arxiv.org/pdf/2111.03394v1.pdf
