This repository contains the implementations of the experiments conducted on a set of publicly available datasets used in time series forecasting research.

Overview

TSForecasting

The benchmark datasets are available at: https://zenodo.org/communities/forecasting. For more details, please refer to our website: https://forecastingdata.org/ and paper: https://arxiv.org/abs/2105.06643.

All datasets contain univariate time series and are available in a new format that we call .tsf, which is based on the sktime .ts format. The data can be loaded into the R environment in tsibble format [1] by following the example in "utils/data_loader.R". It uses a similar approach to the arff file loading method in the R foreign package [2]. The data can be loaded into the Python environment as a Pandas dataframe by following the example in "utils/data_loader.py". Download the .tsf files as required from our Zenodo dataset repository and put them into the "tsf_data" folder.
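
To make the loading step more concrete, here is a minimal Python sketch of reading a .tsf-like file. It is not the repository's "utils/data_loader.py". The file layout it assumes ("#" comment lines, "@attribute" lines, other "@key value" metadata lines, an "@data" marker, then one colon-separated series per line with "?" marking a missing value), the function name `read_tsf_sketch`, and the sample content are all illustrative assumptions.

```python
import io


def read_tsf_sketch(stream):
    """Parse a .tsf-like text stream into (rows, metadata).

    Assumed layout (an illustration, not the official specification):
      - lines starting with "#" are comments
      - "@attribute <name> <type>" declares one per-series attribute
      - other "@key value" lines carry metadata such as the frequency
      - after "@data", each line is "attr1:attr2:...:v1,v2,..."
      - "?" stands for a missing observation
    """
    attribute_names = []
    metadata = {}
    rows = []
    in_data = False

    for raw_line in stream:
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if not in_data:
            if line.lower() == "@data":
                in_data = True
            elif line.startswith("@attribute"):
                # "@attribute series_name string" -> "series_name"
                attribute_names.append(line.split()[1])
            elif line.startswith("@"):
                # "@frequency daily" -> {"frequency": "daily"}
                key, _, value = line[1:].partition(" ")
                metadata[key] = value.strip()
            continue

        # Data line: the last colon-separated field holds the observations.
        parts = line.split(":")
        values = [float("nan") if v == "?" else float(v)
                  for v in parts[-1].split(",")]
        row = dict(zip(attribute_names, parts[:-1]))
        row["series_value"] = values
        rows.append(row)

    return rows, metadata


# Made-up sample content, used only to exercise the sketch.
sample = io.StringIO(
    "# made-up example\n"
    "@relation demo\n"
    "@attribute series_name string\n"
    "@attribute start_timestamp date\n"
    "@frequency daily\n"
    "@horizon 2\n"
    "@data\n"
    "T1:2020-01-01 00-00-00:1.0,2.5,?\n"
    "T2:2020-01-03 00-00-00:4.0,5.0\n"
)
rows, meta = read_tsf_sketch(sample)
print(meta)
for row in rows:
    print(row["series_name"], row["start_timestamp"], row["series_value"])
```

The sketch returns plain Python dictionaries so that it runs without extra packages; passing `rows` to `pandas.DataFrame` would give one row per series. For real work, the loader shipped in "utils/data_loader.py" is the one to use.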

The fixed horizon, rolling origin, and feature calculation experiments are in the "experiments" folder. Please see the examples in the corresponding R scripts in that folder for more details. Make sure to create a folder named "results" at the parent level, with sub-folders as necessary, before running the experiments. The outputs of the experiments are stored in the following sub-folders of the "results" folder:

| Sub-folder Name | Stored Output |
| --- | --- |
| rolling_origin_forecasts | rolling origin forecasts |
| rolling_origin_errors | rolling origin errors |
| rolling_origin_execution_times | rolling origin execution times |
| fixed_horizon_forecasts | fixed horizon forecasts |
| fixed_horizon_errors | fixed horizon errors |
| fixed_horizon_execution_times | fixed horizon execution times |
| tsfeatures | tsfeatures |
| catch22_features | catch22 features |
| lambdas | boxcox lambdas |
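
The "results" folder and its sub-folders can be created by hand or with a short script. The following Python sketch is one possible way to do it; the function name `create_result_folders` and the `base_dir` argument are assumptions made for illustration, and the correct base location depends on where the experiment scripts expect "results" to be.

```python
import tempfile
from pathlib import Path

# Sub-folder names taken from the list above.
RESULT_SUBFOLDERS = [
    "rolling_origin_forecasts",
    "rolling_origin_errors",
    "rolling_origin_execution_times",
    "fixed_horizon_forecasts",
    "fixed_horizon_errors",
    "fixed_horizon_execution_times",
    "tsfeatures",
    "catch22_features",
    "lambdas",
]


def create_result_folders(base_dir):
    """Create <base_dir>/results and every sub-folder; safe to call repeatedly."""
    results_dir = Path(base_dir) / "results"
    created = []
    for name in RESULT_SUBFOLDERS:
        folder = results_dir / name
        # parents=True also creates "results"; exist_ok=True tolerates reruns.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Demonstration in a throwaway directory; pass the real project path instead.
with tempfile.TemporaryDirectory() as demo_dir:
    for folder in create_result_folders(demo_dir):
        print(folder.relative_to(demo_dir))
```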

Citing Our Work

When using this repository, please cite:

@misc{godahewa2021monash,
    author="Godahewa, Rakshitha and Bergmeir, Christoph and Webb, Geoffrey I. and Hyndman, Rob J. and Montero-Manso, Pablo",
    title="Monash Time Series Forecasting Archive",
    howpublished ="\url{https://arxiv.org/abs/2105.06643}",
    year="2021"
}

References

[1] Wang, E., Cook, D., Hyndman, R. J. (2020). A new tidy data structure to support exploration and modeling of temporal data. Journal of Computational and Graphical Statistics. doi:10.1080/10618600.2019.1695624.

[2] R Core Team (2018). foreign: Read Data Stored by 'Minitab', 'S', 'SAS', 'SPSS', 'Stata', 'Systat', 'Weka', 'dBase', .... R package version 0.8-71. https://CRAN.R-project.org/package=foreign

Comments
  • sktime integration of `data_loader.convert_tsf_to_dataframe`

    Really nice collection of forecasting benchmark datasets you have here!

    We were wondering (at sktime) whether you would be open for us to integrate (a possibly modified) data_loader.convert_tsf_to_dataframe into the data_io module of sktime, using the Monash forecasting repository as a data endpoint. Of course with proper attribution and crediting of the source.

    Since tsf is based on the ts format and the loaders are similar, it would fit nicely with a current refactoring effort in the space.

    If you'd be up for a chat and/or collaboration, feel free to visit us on the sktime slack, in the forecasting or forecasting-global channel (go to https://github.com/alan-turing-institute/sktime, README -> slack badge at the top). It might also be nice to collaborate on "nice" benchmark functionality, which can be loaded as a package and which directly interfaces with existing base class templates (with no need to write extra glue code that's special to the benchmark).

    opened by fkiraly 5
  • covariate and forecasting target

    Hello! I had a question regarding the forecasting target vs. additional covariates in the dataset. For example, in the temperature_rain dataset there are some fields that I assume, from the column names, are the forecast target, while others are covariates. How does the .tsf format distinguish between those?

    opened by kashif 4
  • Unable to reproduce the results from paper

    Hi,

    The Monash Forecasting Repository and your work are greatly appreciated. Thanks a lot for making the work reproducible.

    However, I tried experimenting with a few datasets and found that I was unable to reproduce the same results for local models like ARIMA, ETS, SES, etc. Here are the results that I found for the COVID dataset. I used the same script and the same lag and horizon values. Could you please let me know if I am doing something wrong? Should I change some parameters for these local models in order to attain the results reported in the paper?

    These are the results I got for the COVID dataset.

    |  | ETS | TBATS | SES | ARIMA |
    | --- | --- | --- | --- | --- |
    | MyExperiment | 8.98 | 8.98 | 8.977 | 6.104 |
    | Published Results | 5.33 | 5.72 | 7.776 | 6.117 |

    Additionally, I found that TBATS, ETS, and SES almost always show the same error. Do you have an idea why this could be the case for the COVID dataset?

    opened by 18kiran12 3
  • corrected import of sktime

    The package structure of sktime changed a bit in the current version, so the import wasn't working for me and this PR corrects it. Here's a related issue: https://github.com/sktime/sktime-dl/issues/79

    opened by timoschowski 3
  • Data leakage for global univariate setting?

    Hi, thanks for your amazing work! One issue I'm wondering about is, what constitutes a data leak in the global univariate setting?

    Please correct me if I'm wrong, but I noticed that in the global univariate setting, when the time series are unaligned (specifically, when they have different end dates), the test set of a shorter time series X can fall in the same period as the training set of a longer time series Y. (Examples include the Vehicle Trips and Tourism datasets.)

    There could possibly be some leakage of global information (e.g. in some financial time series there could be some systematic movement of all assets), leading to unfair advantage for global models.

    opened by gorold 2
  • Default trainer from Gluon package in the deep_learning_experiments script

    Hello, thank you for making this benchmark available. I noticed that the deep learning implementation relies on the Gluon package and that, in the train_model function, the trainer is always used with its default parameters. I wonder whether it would be better to adjust some parameters to the dataset under consideration (mainly the num_batches_per_epoch parameter).

    opened by BoudabousSafa 2
  • Why is arima giving an error when I use a subset of NN5 dataset without missing values?

    Here is the procedure I followed:

    1. Download nn5_daily_dataset_without_missing_values.tsf from Zenodo.
    2. Remove all the time series except the first and second one from TSF file and save as nn5_small.tsf. In the resulting TSF file there are two time series T1 and T2.
    3. Run "do_fixed_horizon_local_forecasting("nn5_small", "arima", "nn5_small.tsf","series_name", "start_timestamp")"

    When I follow the above steps, I get the following error:

    > do_fixed_horizon_local_forecasting("nn5_small", "arima", "nn5_small.tsf","series_name", "start_timestamp")
    [1] "Started loading nn5_small"
    [1] "started Forecasting"
    [1] 1
    [1] 2
    [1] "Finished Forecasting"
    Time difference of 1.184725 mins
    The length of the provided data differs.
    Length of holdout: 56
    Length of forecast: 0
    Error: Cannot proceed.
    In addition: Warning messages:
    1: The chosen seasonal unit root test encountered an error when testing for the first difference.
    From stl(): NA/NaN/Inf in foreign function call (arg 1)
    0 seasonal differences will be used. Consider using a different unit root test. 
    2: The chosen seasonal unit root test encountered an error when testing for the first difference.
    From stl(): NA/NaN/Inf in foreign function call (arg 1)
    0 seasonal differences will be used. Consider using a different unit root test. 
    3: In array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 'data' must be of a vector type, was 'NULL'
    

    When I checked the result file, I realized that there is no forecast for T1.

    $ cat results/fixed_horizon_forecasts/nn5_small_arima.txt 
    T1
    T2,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771,12.9393424036281,15.3486394557823,18.75,29.6768707482993,30.7256235827664,16.7375283446712,12.046485260771
    

    The other five methods (ets, ses, theta, tbats, and dhr_arima) work fine.

    I wonder if this is expected behaviour?

    opened by hkayabilisim 2
  • Missing documentation in the notebook

    In the forecastingdata_python notebook, it would be nice if you could add something like this:

    After a fresh R installation, you need to install devtools by running "install.packages("devtools")" in the R command line.

    By the way, thank you for this great forecasting repository and the tools you created in the GitHub repo.

    opened by hkayabilisim 2
  • Notebook for features and fixed horizon

    Some highlights:

    • refactored the experiments into functions and experiment files
    • added recursive path creation
    • added removal of existing files (always appending makes the test fail when run after the first time)
    • added tryCatch for checking if catboost is installed, because it is difficult to install
    opened by pmontman 1
  • License

    Hi,

    Thanks for making this data repository available! Could you please add a license for the code (e.g., the data loader)?

    From GitHub docs:

    without a license, the default copyright laws apply, meaning that you retain all rights to your source code and no one may reproduce, distribute, or create derivative works from your work.

    Thank you, Alex

    opened by aldro61 1
  • UnicodeDecodeError in data_loader.py when reading m4_monthly_dataset.tsf on Windows

    I downloaded and extracted the M4 monthly dataset and started data_loader.py. I get the following error message:

    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-3-78dfe61da181> in <module>
          1 filename = 'm4_monthly_dataset.tsf'
          2 loaded_data, frequency, forecast_horizon, contain_missing_values, contain_equal_length = \
    ----> 3     data_loader.convert_tsf_to_dataframe("tsf_data/"+filename)
          4 
          5 print('loaded_data',loaded_data)
    
    ~\Documents\PythonScripts\Timeseries2020\MonashTSForecastingArchiv\data_loader.py in convert_tsf_to_dataframe(full_file_path_and_name, replace_missing_vals_with, value_column_name)
         28 
         29     with open(full_file_path_and_name, 'r', encoding='utf-8') as file:
    ---> 30         for line in file:
         31             # Strip white space from start/end of line
         32             line = line.strip()
    
    ~\miniconda3\envs\sktime\lib\codecs.py in decode(self, input, final)
        320         # decode input (taking the buffer into account)
        321         data = self.buffer + input
    --> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
        323         # keep undecoded input until the next call
        324         self.buffer = data[consumed:]
    
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 437: invalid start byte
    

    If I change the encoding from utf-8 to ansi, it works:

     #with open(full_file_path_and_name, 'r', encoding='utf-8') as file:
     with open(full_file_path_and_name, 'r', encoding='ansi') as file:
    

    I work on Windows 10 with Python 3.9.4

    opened by JBOE22175 1
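
For the UnicodeDecodeError reported in the last comment above, a more portable alternative to `encoding='ansi'` (an alias that Python only provides on Windows) might be to try UTF-8 first and fall back to a named code page. The sketch below assumes the offending file is Windows-1252 encoded, where byte 0x96 is an en dash; this has not been verified against the actual dataset file, and the helper name `read_text_with_fallback` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def read_text_with_fallback(path, encodings=("utf-8", "cp1252")):
    """Return (text, encoding_used), trying each encoding in order.

    The default fallback to cp1252 is an assumption about how the file
    was written; adjust the tuple if the file uses another code page.
    """
    data = Path(path).read_bytes()
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError(f"could not decode {path} with any of {encodings}")


# Demonstration with a made-up file that contains the byte 0x96.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "demo.tsf")
    with open(demo_path, "wb") as handle:
        handle.write(b"# 2010 \x96 2015\n")
    text, used = read_text_with_fallback(demo_path)
    print(used, repr(text))
```

The whole file is read into memory before decoding, which keeps the sketch short; a loader that streams line by line would need to settle on the encoding before opening the file.
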
Owner
Rakshitha Godahewa
PhD Student