NBEATSx: Neural basis expansion analysis with exogenous variables

Overview

We extend the NBEATS model to incorporate exogenous factors. The resulting method, called NBEATSx, improves on a well-performing deep learning model, extending its capabilities by including exogenous variables and allowing it to integrate multiple sources of useful information. To showcase the utility of the NBEATSx model, we conduct a comprehensive study of its application to electricity price forecasting (EPF) tasks across a broad range of years and markets. We observe state-of-the-art performance, significantly improving the forecast accuracy by nearly 20% over the original NBEATS model, and by up to 5% over other well-established statistical and machine learning methods specialized for these tasks. Additionally, the proposed neural network has an interpretable configuration that can structurally decompose time series, visualizing the relative impact of trend and seasonal components and revealing the modeled processes' interactions with exogenous factors.

This repository provides an implementation of the NBEATSx algorithm introduced in the paper (https://arxiv.org/pdf/2104.05522.pdf).

Electricity Price Forecasting Results

The table reports the forecasting accuracy over the two-year test period for the Nord Pool market, using the ensembled models. Results for the Pennsylvania-New Jersey-Maryland (PJM), Belgium, France, and Germany markets are available in the paper.

| METRIC | AR   | ESRNN | NBEATS | ARX  | LEAR | DNN  | NBEATSx-G | NBEATSx-I |
|--------|------|-------|--------|------|------|------|-----------|-----------|
| MAE    | 2.26 | 2.09  | 2.08   | 2.01 | 1.74 | 1.68 | 1.58      | 1.62      |
| rMAE   | 0.71 | 0.66  | 0.66   | 0.63 | 0.55 | 0.53 | 0.50      | 0.51      |
| sMAPE  | 6.47 | 6.04  | 5.96   | 5.84 | 5.01 | 4.88 | 4.63      | 4.70      |
| RMSE   | 4.08 | 3.89  | 3.94   | 3.71 | 3.36 | 3.32 | 3.16      | 3.27      |
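
For reference, a minimal sketch of the accuracy metrics above, assuming the definitions commonly used in the electricity price forecasting literature (rMAE is the model's MAE relative to the MAE of a naive benchmark forecast); this is illustrative, not the repository's evaluation code:

import numpy as np

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def rmae(y, y_hat, y_naive):
    # relative MAE: model MAE divided by the MAE of a naive benchmark
    return mae(y, y_hat) / mae(y, y_naive)

def smape(y, y_hat):
    # symmetric MAPE, in percent
    return 100 * np.mean(2 * np.abs(y - y_hat) / (np.abs(y) + np.abs(y_hat)))

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))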

NBEATSx usage

Our implementation of NBEATSx is designed to work with any time-series data. We provide a full pipeline with auxiliary objects, namely Dataset and DataLoader classes, to facilitate the forecasting task. An example notebook is provided in nbeatsx_example.ipynb.
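
As a rough illustration of the kind of input the pipeline consumes, here is a synthetic long-format pandas frame with a timestamp, a target, and exogenous columns. The variable names ('y', 'Exogenous1', 'Exogenous2', 'week_day') follow the example notebook; the timestamp column name and everything else are assumptions, not the repository's actual API:

import numpy as np
import pandas as pd

# Synthetic hourly series with two exogenous regressors.
ds = pd.date_range('2018-01-01', periods=24 * 7 * 8, freq='H')
df = pd.DataFrame({
    'ds': ds,                                      # timestamp (column name is an assumption)
    'y': 30 + 10 * np.random.randn(len(ds)),       # target, e.g. hourly price
    'Exogenous1': np.random.rand(len(ds)) * 100,   # e.g. day-ahead load forecast
    'Exogenous2': np.random.rand(len(ds)) * 10,    # e.g. generation forecast
    'week_day': ds.dayofweek,                      # calendar feature
})
# The Dataset and DataLoader objects are then constructed from a frame
# like this one; see nbeatsx_example.ipynb for the exact arguments.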

Run NBEATSx experiment from console

To replicate the results of the paper, in particular to produce the NBEATSx forecasts, run the following command:

python src/hyperopt_nbeatsx.py --dataset 'NP' --space "nbeats_x" --data_augmentation 0 --random_validation 0 --n_val_weeks 52 --hyperopt_iters 1500 --experiment_id "nbeatsx_0_0"

The forecasts for all markets and models are included in the results folder. The notebook main_results.ipynb replicates the main results table and the GW test plots.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use NBEATSx, please cite the following paper:

@article{olivares2021nbeatsx,
  title   = {Neural basis expansion analysis with exogenous variables: Forecasting electricity prices with NBEATSx},
  author  = {Olivares, Kin G. and Challu, Cristian and Marcjasz, Grzegorz and Weron, Rafa{\l} and Dubrawski, Artur},
  journal = {International Journal of Forecasting, submitted},
  volume  = {Working Paper version available at arXiv:2104.05522},
  year    = {2021}
}
Comments
  • Hyperparameter files

    Hello Authors,

    I'd be grateful if you could also upload the hyperparameter files for each dataset / case. I want to check your validation MAE values as well as the chosen hyperparameter values.

    Thanks!

    opened by mcvageesh 5
  • Number of stacks confusion

    Hi. Thanks for the amazing paper and architecture.

    I was trying to find out how many stacks you used in the generic model, and I noticed in your paper:

    "The original NBEATS configuration includes only one generic stack with dozens of blocks, while our proposed model includes both the generic and exogenous stacks, with the order determined via data-driven hyperparameter tuning. We refer to this configuration as the NBEATSx-G model."

    That confused me, because in the original NBEATS paper, in the table listing N-BEATS-G hyperparameters (Table 18), the authors state they used 30 stacks with 1 block each. Did you assume they had swapped the numbers? To me, 30 blocks in 1 stack also seems more reasonable.

    Can you confirm that you used just 2 stacks (one generic, one exogenous)? Additionally, do you think it would make sense to add more stacks (I don't know what configuration would be best)?

    Regards

    opened by MBelniak 4
  • FCNN realization doesn't take exogenous variables as arguments

    Hi,

    In the paper, eq. (1) states that the FCNN in each block is a function of the output of the previous block/stack (y^{back}) plus the exogenous variables. However, I see at https://github.com/cchallu/nbeatsx/blob/main/src/nbeats/nbeats_model.py#L122 that only y^{back} is passed to the FCNN. My question is: why is that? My reasoning for also passing the insample exogenous variables is that they could provide additional information on how to scale the C_{s,l} context vector later on.
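
    For illustration, a minimal sketch of what passing the exogenous variables into the block FCNN could look like; the class, argument names, and shapes below are hypothetical and not the repository's actual implementation:

    import torch
    import torch.nn as nn

    class BlockFCNN(nn.Module):
        # Hypothetical block MLP that receives both the backcast and the
        # insample exogenous variables (flattened and concatenated).
        def __init__(self, backcast_size, n_exog, hidden_size, theta_size):
            super().__init__()
            in_features = backcast_size * (1 + n_exog)
            self.layers = nn.Sequential(
                nn.Linear(in_features, hidden_size),
                nn.ReLU(),
                nn.Linear(hidden_size, theta_size),
            )

        def forward(self, y_back, x_insample):
            # y_back:     [batch, backcast_size]
            # x_insample: [batch, n_exog, backcast_size]
            h = torch.cat([y_back, x_insample.flatten(start_dim=1)], dim=1)
            return self.layers(h)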

    opened by MBelniak 2
  • Forecasting with long output dimension

    Hello,

    First of all thank you for your paper, I have tried N-beatx and it's performing super well.

    The problem is that I need to forecast not only day-ahead but seven days ahead (168 hours). I tried aggregating my data to daily resolution with a 7-step forecast horizon, but the results are not good. Do you have any advice for this problem? I would be very thankful.

    (By the way, my data is similar to the data in your NBEATSx paper.)

    Thank you for your time

    opened by linotto 1
  • nothing

    I am using nbeatsx_example.ipynb and added some code as follows:

    Resulting plot: https://github.com/943fansi/GuideOfP/blob/main/1.jpg

    print(y_true.shape, y_hat.shape, block.shape)
    plt.plot(range(168, 336), y_true.flatten(), label='Price')
    plt.plot(range(168, 336), y_hat.flatten(), linestyle='dashed', label='Forecast')
    plt.axvline(168, color='black')
    plt.legend()
    plt.grid()
    plt.xlabel('Hour')
    plt.ylabel('Price')
    
    plt.figure()
    print(block.shape, len(block))
    plt.plot(range(168, 336), block[:, 0, :].flatten(), linestyle='dashed', label='block 1 forecast')
    plt.plot(range(168, 336), block[:, 1, :].flatten(), linestyle='dashed', label='block 2 forecast')
    plt.legend()
    plt.show()
    
    
    opened by 943fansi 0
  • n_val_weeks in the paper and code do not match

    Hi,

    Could you please clarify the value of n_val_weeks used to replicate your results? Is it 52 or 42? In the paper it is given as 42 in several places, but in the example on the main GitHub page (to replicate the NP dataset results) it is 52. Was 42 used for some datasets and 52 for others, perhaps by mistake?

    Thank you.

    opened by mcvageesh 3
  • Any advice, please.

    Hi. I read your paper thoroughly. I want to do two experiments.

    1. Multivariate regression: You say in your paper that "The NBEATSx model offers a solution to the multivariate regression problem." I will try it with multivariate datasets.

    2. Probabilistic forecasting instead of point forecasting: I am going to apply quantile regression. I wonder if it can be implemented without difficulty in the current architecture, for example with a simple loss-function modification (see the sketch below).

    Please let me know if you have any advice. Thank you.
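
    For reference, a minimal sketch of the quantile (pinball) loss that such a modification would involve; this is plain PyTorch written under the assumption that only the training loss needs to change, not the repository's actual code:

    import torch

    def pinball_loss(y, y_hat, q=0.5):
        # Quantile (pinball) loss: penalizes over- and under-prediction
        # asymmetrically according to the target quantile q.
        diff = y - y_hat
        return torch.mean(torch.maximum(q * diff, (q - 1) * diff))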

    opened by TaeniKim 1
  • Non-EPF Example

    Thank you for this paper and the code. I am excited to try out the NBEATSx.

    Would it be possible for you to share a minimal notebook that applies the method not to the price forecasting task but to a more general time-series problem? I tried doing this but don't understand the specific hyperparameter settings in your nbeatsx_example.ipynb, in particular what idx_to_sample_freq specifies and how the lag dictionary works:

    include_var_dict = {'y': [-8,-4,-3,-2],
                        'Exogenous1': [-8,-2,-1],
                        'Exogenous2': [-8,-2,-1],
                        'week_day': [-1]}
    

    I prepared some code that downloads a dataset frequently used in time-series examples: the Jena Climate dataset recorded by the Max Planck Institute for Biogeochemistry. The dataset consists of 14 features, such as temperature, pressure, and humidity, recorded once every 10 minutes, and data is available for 8 years. So it has a more unusual time structure but also a very simple task: predict the future temperature (with exogenous variables). Example settings could be

    window_sampling_limit=365*4*24*6, input_size=7*24*6, output_size=24*6

    Please let me know if this would be possible for you; it would help me a lot in understanding NBEATSx.


    # Code that downloads the Jena Climate dataset
    from zipfile import ZipFile
    from urllib.request import urlopen
    from io import BytesIO
    import pandas as pd
    
    r = urlopen("https://storage.googleapis.com/tensorflow/tf-keras-datasets/jena_climate_2009_2016.csv.zip").read()
    zip_data = ZipFile(BytesIO(r))
    csv_path  = zip_data.open("jena_climate_2009_2016.csv")
    df = pd.read_csv(csv_path, parse_dates = ["Date Time"])
    
    opened by georgeblck 4
  • [Question] SELU weights and dropout

    Hi,

    My name is Pablo Navarro. Your team and I have already exchanged a few emails about the wonderful paper you've written. Thanks again for the contribution.

    Now that the code is released, I have a couple of questions about the implementation of the SELU activation function.

    Weight init

    For SELU, you force lecun_normal, which in turn is just a pass in the init_weights() function:

    def init_weights(module, initialization):
        if type(module) == t.nn.Linear:
            if initialization == 'orthogonal':
                t.nn.init.orthogonal_(module.weight)
            elif initialization == 'he_uniform':
                t.nn.init.kaiming_uniform_(module.weight)
            elif initialization == 'he_normal':
                t.nn.init.kaiming_normal_(module.weight)
            elif initialization == 'glorot_uniform':
                t.nn.init.xavier_uniform_(module.weight)
            elif initialization == 'glorot_normal':
                t.nn.init.xavier_normal_(module.weight)
            elif initialization == 'lecun_normal':
                pass
            else:
                assert 1<0, f'Initialization {initialization} not found'
    

    How come the weights are initialized as lecun_normal simply by passing? On my machine, default PyTorch initializes weights uniformly, not normally.

    DropOut on SELU

    I believe that in order to make SELU useful, you need to use AlphaDropout() instead of regular Dropout() layers (see the PyTorch docs).

    I can't find anything wrapping AlphaDropout() in your code. Can you point me in the right direction or explain the rationale behind it?
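
    For illustration, a minimal sketch of the SELU/AlphaDropout pairing referred to above, in plain PyTorch; this is not the repository's code, and the layer sizes are arbitrary:

    import torch.nn as nn

    # SELU-based MLP: AlphaDropout preserves the self-normalizing property
    # (zero mean / unit variance) that a regular Dropout layer would break.
    selu_mlp = nn.Sequential(
        nn.Linear(128, 256),
        nn.SELU(),
        nn.AlphaDropout(p=0.1),
        nn.Linear(256, 24),
    )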

    Cheers and keep up the good work!

    opened by pnmartinez 1