Code for "Long Range Probabilistic Forecasting in Time-Series using High Order Statistics"

Overview

This is the code produced as part of the paper:

Prathamesh Deshpande and Sunita Sarawagi. Long Range Probabilistic Forecasting in Time-Series using High Order Statistics. arXiv:2111.03394v1.

How to work with Command Line Arguments?

  • If an optional argument is not passed, its value is taken from the configuration specified in main.py (selected by dataset_name and model_name).
  • If a valid value is passed on the command line, the code uses it and ignores the value assigned in the configuration.

Command Line Arguments Information

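| Argument name      | Type           | Valid assignments                                   | Default             |
|--------------------|----------------|-----------------------------------------------------|---------------------|
| dataset_name       | str            | azure, ett, etthourly, Solar, taxi30min, Traffic911 | positional argument |
| saved_models_dir   | str            | -                                                   | None                |
| output_dir         | str            | -                                                   | None                |
| N_input            | int            | >0                                                  | -1                  |
| N_output           | int            | >0                                                  | -1                  |
| epochs             | int            | >0                                                  | -1                  |
| normalize          | str            | same, zscore_per_series, gaussian_copula, log       | None                |
| learning_rate      | float          | >0                                                  | -1.0                |
| hidden_size        | int            | >0                                                  | -1                  |
| num_grulstm_layers | int            | >0                                                  | -1                  |
| batch_size         | int            | >0                                                  | -1                  |
| v_dim              | int            | >0                                                  | -1                  |
| t2v_type           | str            | local, idx, mdh_lincomb, mdh_parti                  | None                |
| K_list             | [int, ..., int] | [>0, ..., >0]                                      | []                  |
| device             | str            | -                                                   | None                |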
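
The override rule described above can be sketched as follows. This is only an illustration, not the parsing code in main.py: the `--name` flag syntax, the `CONFIG` dictionary, its values, and the `resolve` helper are all assumptions made for the example. The only details taken from this README are the argument names and the sentinel defaults (-1 for numbers, None for strings).

```python
import argparse

# Hypothetical per-dataset configuration; the real values live in main.py.
CONFIG = {
    "ett": {"N_input": 192, "N_output": 192, "normalize": "zscore_per_series"},
}


def build_parser():
    """Parser whose optional arguments default to the sentinels in the table."""
    parser = argparse.ArgumentParser()
    parser.add_argument("dataset_name", type=str)
    parser.add_argument("--N_input", type=int, default=-1)
    parser.add_argument("--N_output", type=int, default=-1)
    parser.add_argument("--normalize", type=str, default=None)
    return parser


def resolve(args, config):
    """Use the command-line value when given, else the configured value."""
    defaults = config[args.dataset_name]
    resolved = {}
    for name, fallback in defaults.items():
        value = getattr(args, name)
        is_unset = value is None or value == -1
        resolved[name] = fallback if is_unset else value
    return resolved


# Only N_output is passed, so the other two settings come from CONFIG.
args = build_parser().parse_args(["ett", "--N_output", "96"])
settings = resolve(args, CONFIG)
```

Using sentinels rather than real defaults lets the parser distinguish "not passed" from "passed", which is what makes the per-dataset configuration the fallback.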

Datasets

All the datasets can be found here.

Place the dataset files/directories in the data directory before running the code.

Output files

Targets and Forecasts

The following output files are stored in the <output_dir>/<dataset_name>/ directory.

| File name                 | Description                                                                                   |
|---------------------------|-----------------------------------------------------------------------------------------------|
| inputs.npy                | Test input values. Size: number of time-series x N_input                                      |
| targets.npy               | Test target/ground-truth values. Size: number of time-series x N_output                       |
| <model_name>_pred_mu.npy  | Mean forecast values. Size: number of time-series x number of time-steps                      |
| <model_name>_pred_std.npy | Standard deviation of forecast values. Size: number of time-series x number of time-steps     |
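
As a rough sketch of how these files might be consumed, the snippet below loads the targets and one model's forecasts, then computes the mean absolute error and the fraction of targets that fall inside a mean ± 1.96·std band. The helper names (`load_forecasts`, `interval_coverage`) and the model name `demo` are hypothetical. It also assumes that the forecast arrays have the same shape as targets.npy, which this README does not state explicitly. The demo writes synthetic arrays to a temporary directory, so it runs without the real outputs.

```python
import os
import tempfile

import numpy as np


def load_forecasts(result_dir, model_name):
    """Load targets and one model's forecast mean/std from a results directory."""
    targets = np.load(os.path.join(result_dir, "targets.npy"))
    mu = np.load(os.path.join(result_dir, f"{model_name}_pred_mu.npy"))
    std = np.load(os.path.join(result_dir, f"{model_name}_pred_std.npy"))
    return targets, mu, std


def interval_coverage(targets, mu, std, z=1.96):
    """Fraction of targets lying within mu +/- z * std."""
    inside = np.abs(targets - mu) <= z * std
    return float(inside.mean())


# Demo on synthetic data standing in for <output_dir>/<dataset_name>/.
with tempfile.TemporaryDirectory() as demo_dir:
    rng = np.random.default_rng(0)
    np.save(os.path.join(demo_dir, "targets.npy"), rng.normal(size=(4, 6)))
    np.save(os.path.join(demo_dir, "demo_pred_mu.npy"), np.zeros((4, 6)))
    np.save(os.path.join(demo_dir, "demo_pred_std.npy"), np.ones((4, 6)))

    targets, mu, std = load_forecasts(demo_dir, "demo")
    mae = float(np.abs(targets - mu).mean())
    coverage = interval_coverage(targets, mu, std)
```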

Metrics

All the evaluation metrics on test data are stored in <output_dir>/results_<dataset_name>.json in the following format:

{
  "<model_name1>": {
    "crps": <crps>,
    "mae": <mae>,
    "mse": <mse>,
    "smape": <smape>,
    "dtw": <dtw>,
    "tdi": <tdi>
  },
  "<model_name2>": {
    "crps": <crps>,
    "mae": <mae>,
    "mse": <mse>,
    "smape": <smape>,
    "dtw": <dtw>,
    "tdi": <tdi>
  },
  ...
}

Here <model_name1>, <model_name2>, ... are different models under consideration.
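
A small sketch of how such a results file could be compared across models is given below. The `rank_models` helper, the model names, and the numbers are made up for illustration. It assumes every metric listed is an error or distortion measure, so lower is better. For a real run, the dictionary would come from `json.load` on <output_dir>/results_<dataset_name>.json instead of the inline string.

```python
import json


def rank_models(results, metric="crps"):
    """Return model names sorted from best (lowest value) to worst."""
    return sorted(results, key=lambda name: results[name][metric])


# Made-up numbers in the documented layout.
example = json.loads("""
{
  "model_a": {"crps": 0.42, "mae": 0.61, "mse": 0.80, "smape": 0.35, "dtw": 3.1, "tdi": 1.2},
  "model_b": {"crps": 0.38, "mae": 0.64, "mse": 0.77, "smape": 0.33, "dtw": 2.9, "tdi": 1.5}
}
""")

ranking_by_crps = rank_models(example, "crps")
ranking_by_mae = rank_models(example, "mae")
```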

Comments
  • Forecast method

    As per my understanding of your paper ('Long Range Probabilistic Forecasting in Time-Series using High Order Statistics'), you take multiple (2) averages of the original series, and the Informer model is then trained independently on these averages as well as on the original series. I have a couple of questions.

    1. How many Informer models are trained? That is, if we have 3 time series (2 aggregates and 1 original), is a single Informer model trained on these 3 series independently, or does each series get its own Informer model?

    2. Once the model is trained, a new multivariate Gaussian is defined with a mean and covariance. How exactly is this mean calculated?

    opened by HrudayR 3
  • Errors running script.sh

    This project looks exciting, but I'm having trouble running it. I don't see a requirements.txt or environment.yml, so I did my best to create one.

    It looks like there might be some missing files from the project too:

    in base_model.py

    from models import transformer_manual_attn, transformer_dual_attn
    ImportError: cannot import name 'transformer_manual_attn' from 'models' (unknown location)
    

    Thanks for the help

    opened by gdevos010 1
  • During training, use dilate_loss for early stopping

    • When the base model is DILATE, compute dilate_loss(..) on validation data during training.
    • Use a separate early-stopping metric for each model, chosen based on suitability.
    TODOS 
    opened by pratham16cse 1
  • Need better way to set stride while creating batches for training

    Possible options:

    • Manually set the stride for each dataset so that the model sees as many distinct input-output sequence patterns in the training data as possible.
    • Add stochasticity to the stride. For example, if the stride is s, shift the window by a value drawn from U(s-r, s+r), where r is the width of the stochasticity.
    TODOS 
    opened by pratham16cse 1
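
The second option in the stride issue above could look roughly like the sketch below. It is not code from this repository: `window_starts` and its parameters are hypothetical, and it draws integer shifts uniformly from [s-r, s+r] as a discrete stand-in for U(s-r, s+r).

```python
import random


def window_starts(series_len, n_input, n_output, stride, jitter, seed=0):
    """Start indices of (input, output) windows with a randomised stride.

    After each window the start moves forward by an integer drawn uniformly
    from [stride - jitter, stride + jitter].
    """
    if not 0 <= jitter < stride:
        raise ValueError("need 0 <= jitter < stride so every shift is positive")
    rng = random.Random(seed)
    window_len = n_input + n_output
    starts, start = [], 0
    while start + window_len <= series_len:
        starts.append(start)
        start += rng.randint(stride - jitter, stride + jitter)
    return starts


starts = window_starts(series_len=200, n_input=24, n_output=12, stride=10, jitter=3)
shifts = [b - a for a, b in zip(starts, starts[1:])]
```

Setting `jitter=0` recovers the usual fixed stride, and changing `seed` per epoch would expose the model to different window alignments.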
  • Train base model in teacher-forcing mode

    • The model also makes sequential predictions during training.
    • This might be the reason for the poor performance of DILATE and MSE on the Taxi dataset, as the forecast horizon at the bottom level is sufficiently large.
    • Add a teacher-forcing training mode to better learn the parameters of the base model at the bottom level.
    TODOS 
    opened by pratham16cse 1
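
To make the difference between the two training modes in the teacher-forcing issue above concrete, here is a toy sketch. The `decode` function and the damping "model" are placeholders invented for the example, not the repository's base model. In free-running mode each prediction is fed back as the next input. With teacher forcing the ground-truth value is fed back instead, so early errors do not compound over a long horizon.

```python
def decode(step_fn, last_input, targets, teacher_forcing):
    """Roll a one-step predictor over the forecast horizon.

    With teacher_forcing=True the ground-truth value is fed back at each step;
    otherwise the model's own previous prediction is fed back.
    """
    predictions, prev = [], last_input
    for target in targets:
        pred = step_fn(prev)
        predictions.append(pred)
        prev = target if teacher_forcing else pred
    return predictions


def toy_step(x):
    """Toy one-step model standing in for the base model: halve the input."""
    return 0.5 * x


targets = [6.0, 2.0, 8.0]
free_running = decode(toy_step, 8.0, targets, teacher_forcing=False)
teacher_forced = decode(toy_step, 8.0, targets, teacher_forcing=True)
```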
  • readability updates

    updates:

    • commented out transformer_manual_attn, transformer_dual_attn
    • added log formatting for readability
    • refactored arg parsing to functions
    • added missing headers to csv

    It's a bit hard to follow the flow of the repo. main.py is very long, and some refactoring could go a long way.

    opened by gdevos010 0
  • Parser for Datasets of Gaussian Copula Paper

    • In both train.json and test.json, each line is a time-series.
    • In test.json, each series can be treated as an independent sequence.
    • [ ] Context length needs to be limited for both train and test.
    • [ ] For wiki, find out why the initial ~9000 train and test sequences overlap.
    • [ ] When selecting only 2000 sequences for wiki, how do we ensure the same 2000 sequences are selected from both train and test?
    • [ ] Create this pre-processing for Solar, Wiki, and Exchange-rate.
    enhancement 
    opened by pratham16cse 0
  • Use K as a hyper-parameter

    • [x] Understand how to perform hyper-parameter tuning in PyTorch.
    • [x] Get multiple solutions using multiple values of K.
    • [x] Select the K that performs best on validation data.
    opened by pratham16cse 0
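
The selection step in the issue above amounts to a small search over candidate K values. The sketch below is an assumption about its shape, not the repository's tuning code: `select_best_k` is a hypothetical helper, and `fake_validation_score` is a stand-in for training with a given K and scoring on validation data (lower is better).

```python
def select_best_k(k_list, validation_score):
    """Return the K with the lowest validation score, plus all scores."""
    scores = {k: validation_score(k) for k in k_list}
    best_k = min(scores, key=scores.get)
    return best_k, scores


def fake_validation_score(k):
    """Placeholder for 'train with this K, then evaluate on validation data'."""
    return abs(k - 6) + 1.0


best_k, scores = select_best_k([2, 4, 6, 12], fake_validation_score)
```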
  • Adjust Model Complexity according to value of K

    • [ ] Explore heuristics to reduce model complexity as the value of K increases.
    • [ ] If the aggregate itself tracks multiple values, increase the complexity accordingly?
    TODOS 
    opened by pratham16cse 0
  • Normalization added

    • Added flag "normalize"
    • Normalization occurs at each level
    • Modified inference models to handle normalization
    • Modified training and evaluation to handle normalization
    opened by pratham16cse 0
  • Taxi data possibilities

    1. Consider all zones irrespective of number of pickups -- If a zone does not have a pickup in an hour, the value will be zero
    2. Consider only zones that have at least one pickup per hour
    3. Only consider top k zones according to number of pickups over the duration
    • [ ] If required, get data for more months as well
    opened by pratham16cse 0