LONG-TERM SERIES FORECASTING WITH QUERY SELECTOR – EFFICIENT MODEL OF SPARSE ATTENTION

MORAI

Last update: Dec 17, 2022

Related tags

Deep Learning machine-learning deep-learning time-series pytorch transformer forecasting self-attention

Overview

Query Selector

This repository contains the code and data loaders for the paper https://arxiv.org/pdf/2107.08687v1.pdf. Query Selector is a novel sparse-attention Transformer algorithm that is especially well suited to long-term time series forecasting.

Dependencies

Python            3.7.9
deepspeed         0.4.0
numpy             1.20.3
pandas            1.2.4
scipy             1.6.3
tensorboardX      1.8
torch             1.7.1
torchaudio        0.7.2
torchvision       0.8.2
tqdm              4.61.0

Results on ETT dataset

Univariate

Data	Prediction len	Informer MSE	Informer MAE	Transformer MSE	Transformer MAE	Query Selector MSE	Query Selector MAE	MSE ratio
ETTh1	24	0.0980	0.2470	0.0548	0.1830	0.0436	0.1616	0.445
ETTh1	48	0.1580	0.3190	0.0740	0.2144	0.0721	0.2118	0.456
ETTh1	168	0.1830	0.3460	0.1049	0.2539	0.0935	0.2371	0.511
ETTh1	336	0.2220	0.3870	0.1541	0.3201	0.1267	0.2844	0.571
ETTh1	720	0.2690	0.4350	0.2501	0.4213	0.2136	0.3730	0.794
ETTh2	24	0.0930	0.2400	0.0999	0.2479	0.0843	0.2239	0.906
ETTh2	48	0.1550	0.3140	0.1218	0.2763	0.1117	0.2622	0.721
ETTh2	168	0.2320	0.3890	0.1974	0.3547	0.1753	0.3322	0.756
ETTh2	336	0.2630	0.4170	0.2191	0.3805	0.2088	0.3710	0.794
ETTh2	720	0.2770	0.4310	0.2853	0.4340	0.2585	0.4130	0.933
ETTm1	24	0.0300	0.1370	0.0143	0.0894	0.0139	0.0870	0.463
ETTm1	48	0.0690	0.2030	0.0328	0.1388	0.0342	0.1408	0.475
ETTm1	96	0.1940	0.2030	0.0695	0.2085	0.0702	0.2100	0.358
ETTm1	288	0.4010	0.5540	0.1316	0.2948	0.1548	0.3240	0.328
ETTm1	672	0.5120	0.6440	0.1728	0.3437	0.1735	0.3427	0.338

Multivariate

Data	Prediction len	Informer MSE	Informer MAE	Transformer MSE	Transformer MAE	Query Selector MSE	Query Selector MAE	MSE ratio
ETTh1	24	0.5770	0.5490	0.4496	0.4788	0.4226	0.4627	0.732
ETTh1	48	0.6850	0.6250	0.4668	0.4968	0.4581	0.4878	0.669
ETTh1	168	0.9310	0.7520	0.7146	0.6325	0.6835	0.6088	0.734
ETTh1	336	1.1280	0.8730	0.8321	0.7041	0.8503	0.7039	0.738
ETTh1	720	1.2150	0.8960	1.1080	0.8399	1.1150	0.8428	0.912
ETTh2	24	0.7200	0.6650	0.4237	0.5013	0.4124	0.4864	0.573
ETTh2	48	1.4570	1.0010	1.5220	0.9488	1.4074	0.9317	0.966
ETTh2	168	3.4890	1.5150	1.6225	0.9726	1.7385	1.0125	0.465
ETTh2	336	2.7230	1.3400	2.6617	1.2189	2.3168	1.1859	0.851
ETTh2	720	3.4670	1.4730	3.1805	1.3668	3.0664	1.3084	0.884
ETTm1	24	0.3230	0.3690	0.3150	0.3886	0.3351	0.3875	0.975
ETTm1	48	0.4940	0.5030	0.4454	0.4620	0.4726	0.4702	0.902
ETTm1	96	0.6780	0.6140	0.4641	0.4823	0.4543	0.4831	0.670
ETTm1	288	1.0560	0.7860	0.6814	0.6312	0.6185	0.5991	0.586
ETTm1	672	1.1920	0.9260	1.1365	0.8572	1.1273	0.8412	0.946
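
The MSE ratio column is not defined in the text. The listed values are consistent with dividing the lower of the Transformer and Query Selector MSE by the Informer MSE. A minimal sketch under that assumption (the function name mse_ratio is illustrative and not taken from the repository), applied to three rows of the tables above:

```python
def mse_ratio(informer_mse, transformer_mse, query_selector_mse):
    """Lowest MSE of the two compared models divided by the Informer MSE.

    Assumption: this definition is inferred from the table values;
    it is not stated in the README.
    """
    best = min(transformer_mse, query_selector_mse)
    return round(best / informer_mse, 3)


# Rows copied from the tables above: (Informer, Transformer, Query Selector) MSE
print(mse_ratio(0.0980, 0.0548, 0.0436))  # univariate ETTh1, prediction len 24
print(mse_ratio(0.4010, 0.1316, 0.1548))  # univariate ETTm1, prediction len 288
print(mse_ratio(3.4890, 1.6225, 1.7385))  # multivariate ETTh2, prediction len 168
```

Values below 1 mean the better of the two models has a lower MSE than Informer on that row.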

State Of Art

Citation

@misc{klimek2021longterm,
      title={Long-term series forecasting with Query Selector -- efficient model of sparse attention}, 
      author={Jacek Klimek and Jakub Klimek and Witold Kraskiewicz and Mateusz Topolewski},
      year={2021},
      eprint={2107.08687},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Contact

If you have any questions, please contact us by email: jacek.klimek@morai.eu

Comments

Dimension Mismatch

Hello,

I am trying to reproduce the results for ETTh1 dataset.

When running the following command in Colab:

!python -u train.py --data ETTh1 --input_len 240 --output_len 24 --seq_len 48 --dec_seq_len 48 --pred_len 24 --features S --iterations 10 --exps 5 --hidden_size 96 --n_heads 3 --n_encoder_layers 3 --encoder_attention full --n_decoder_layers 3 --decoder_attention full --batch_size 32 --embedding_size 24 --dropout 0.1

I run into the following error:

  File "train.py", line 189, in <module>
    main()
  File "train.py", line 185, in main
    preform_experiment(args)
  File "train.py", line 155, in preform_experiment
    preds, trues = run_iteration(deepspeed_engine if args.deepspeed else model , train_loader, args, training=True, message=' Run {:>3}, iteration: {:>3}: '.format(args.run_num, iter))
  File "train.py", line 109, in run_iteration
    result = model(batch)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/Query/model.py", line 287, in forward
    e = self.encs0
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/Query/model.py", line 239, in forward
    return super(Linear, self).forward(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 1692, in linear
    output = input.matmul(weight.t())
RuntimeError: mat1 dim 1 must match mat2 dim 0

Can you please help me resolve this issue? Thank you.

Best wishes, Janamejaya

opened by bcjanmay 3
InverseTransform only on prediction, to avoid nan loss
Hi again, and thanks for the implementation of the 'ms' feature.

Two questions:

1. Is it possible to apply the inverse transform only to the predicted values? Running the inverse transform, or disabling scaling, over the entire training run makes the gradients explode almost immediately. In short: is it possible to inverse-transform only the pred_len outputs?

2. With 'ms' enabled, how would one extend it to predict three outputs instead of a single one? Is it as simple as passing a list as the target, which is currently defined as 'OT'?
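
One way to approach the first question, sketched here without any of the repository's classes, is to keep training and the loss in scaled space and map only the forecast window back to original units for reporting. The StandardScaler class, the toy series, and the stand-in "model output" below are all illustrative assumptions, not code from this project:

```python
from statistics import mean, pstdev


class StandardScaler:
    """Minimal z-score scaler (illustrative; not the repository's class)."""

    def fit(self, values):
        self.mean = mean(values)
        self.std = pstdev(values) or 1.0  # guard against zero variance
        return self

    def transform(self, values):
        return [(v - self.mean) / self.std for v in values]

    def inverse_transform(self, values):
        return [v * self.std + self.mean for v in values]


series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
pred_len = 2

scaler = StandardScaler().fit(series)
scaled = scaler.transform(series)

# Stand-in for the model output: training and the loss stay in scaled space.
scaled_pred = scaled[-pred_len:]

# Only the forecast window is mapped back to original units for reporting.
pred = scaler.inverse_transform(scaled_pred)
print(pred)
```

Because the gradients never see unscaled values, this keeps the numerical behaviour of training unchanged while still producing readable predictions.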

I hope you can help out, best regards, and again thanks for your contribution!
opened by simoneliasen 2
Enabling 'MS'
Hi, I'm trying to run the code with the 'MS' feature, which I can see traces of in your code, inspired by Informer.

I tried adding a 'multiuni' prediction type, alongside the 'multi' and 'uni' one and have changed my setting accordingly to use 'multiuni':

elif self.prediction_type == 'multiuni':
    sys.argv.extend(["--features", 'MS'])
    sys.argv.extend(["--input_len", '7', "--output_len", "1"])

However, I keep running into issues. Most recently, the code triggers this assertion:

  File "train.py", line 178, in preform_experiment
    len(train_data.data_y[0]), args.output_len)
AssertionError: Dataset contains output vectors of length 7 while output_len is set to 1

Is there any way you could guide me towards an implementation of MS?

Best regards
opened by simoneliasen 1
A quick question about the linear layer

Hi, just a quick question.

I haven't had a chance to run the code, so I assume the input data has a shape like B,L,C or B,C,L. Line 279 reads: self.enc_input_fc = Linear(input_size, dim_val). I think "input_size" means the sequence length, while "dim_val" is related to the embedding dimension. The tensor would then end up with a square shape, which doesn't make sense to me. Is that a typo, or am I misunderstanding something? Many thanks!
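
For context, a fully connected layer such as torch.nn.Linear acts only on the last dimension of its input, so a batch shaped (B, L, C) passed through Linear(C, dim_val) comes out as (B, L, dim_val) and the sequence length L is untouched. The dependency-free sketch below illustrates that shape rule. The sizes are made up, and reading input_size as the feature count C is an assumption based on the training commands quoted in the other comments (--input_len 7 with --features M), not a statement from the authors:

```python
def linear(x, weight, bias):
    """Apply y = x @ weight^T + bias to the last dimension of a (B, L, C) nested list.

    weight has shape (dim_val, C) and bias has shape (dim_val,).
    """
    in_features = len(weight[0])
    out = []
    for sequence in x:                      # batch dimension B
        out_seq = []
        for step in sequence:               # time dimension L
            if len(step) != in_features:
                raise ValueError(
                    f"last dim {len(step)} does not match in_features {in_features}"
                )
            out_seq.append(
                [sum(w * v for w, v in zip(row, step)) + b
                 for row, b in zip(weight, bias)]
            )
        out.append(out_seq)
    return out


B, L, C, dim_val = 2, 5, 7, 3               # illustrative sizes
x = [[[1.0] * C for _ in range(L)] for _ in range(B)]
weight = [[0.5] * C for _ in range(dim_val)]
bias = [0.0] * dim_val

y = linear(x, weight, bias)
print(len(y), len(y[0]), len(y[0][0]))      # B, L, dim_val
```

Passing an input whose last dimension differs from the layer's in_features raises an error in this sketch, which is the same kind of mismatch that a matmul shape error reports.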

opened by kpmokpmo 1
Connection refused error

Hello, thank you for releasing the code. I have a question: I am trying to run the code in Google Colab.

This is my training command:

!python train.py --data='ETTm1' --seq_len=720 --pred_len=24 --dec_seq_len=24 --hidden_size=128 --batch_size=100 --embedding_size=32 --n_encoder_layers=3 --encoder_attention='full' --n_decoder_layers=3 --decoder_attention='full' --dropout=0.1 --iterations=10 --exps=5 --features='M' --n_heads=3 --input_len=7 --output_len=7

After 1 iteration I get the following error:

Traceback (most recent call last):
  File "train.py", line 190, in <module>
    main()
  File "train.py", line 186, in main
    preform_experiment(args)
  File "train.py", line 159, in preform_experiment
    ipc.sendPartials(iter, mse, mae)
  File "/content/drive/MyDrive/query-selector-master/ipc.py", line 59, in sendPartials
    c.connect(('localhost', PORT))
ConnectionRefusedError: [Errno 111] Connection refused
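
The traceback shows ipc.py opening a TCP connection to localhost, which fails when no process is listening on that port. One possible workaround, sketched here with a hypothetical helper send_partials_safely and a made-up message format (neither is taken from the repository's ipc.py), is to treat a refused connection as non-fatal so that training can continue:

```python
import socket


def send_partials_safely(port, payload, host="127.0.0.1", timeout=1.0):
    """Try to send payload bytes; return False instead of raising if nobody listens.

    Hypothetical helper: the repository's ipc.sendPartials may work differently.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(payload)
        return True
    except OSError:  # ConnectionRefusedError is a subclass of OSError
        return False


# Find a port that is currently free, so that nothing is listening on it.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
    probe.bind(("127.0.0.1", 0))
    free_port = probe.getsockname()[1]

print(send_partials_safely(free_port, b"iter=1 mse=0.1 mae=0.2"))
```

Whether skipping the send is acceptable depends on what the receiving process does with the partial results, which the traceback alone does not reveal.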

Please do let me know, Thanks.

Best Regards Niharika

opened by Niharikajo 0
Test train split
Hello,

I have a question about the data processing step.

For the ETTm2 dataset, data_loader uses the following code to compute the split borders, from which the number of train and test points follows. Can you explain these lines? How are the values calculated?

border1s = [0, 12*30*24*4 - self.seq_len, 12*30*24*4+4*30*24*4 - self.seq_len]
border2s = [12*30*24*4, 12*30*24*4+4*30*24*4, 12*30*24*4+8*30*24*4]

seq_len is set to 24 * 4 * 4. How is this calculated, and what does it signify?
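
The constants read naturally as durations counted in 15-minute samples, since the ETTm datasets are recorded every 15 minutes: 12*30*24*4 is 12 months of 30 days, with 24 hours per day and 4 samples per hour. Under that reading (an interpretation that follows the 12/4/4-month train/validation/test split used by the Informer data loader, not something stated in this repository), the borders work out as in this sketch, where the named constants are introduced only for illustration:

```python
SAMPLES_PER_HOUR = 4                          # ETTm data is sampled every 15 minutes
SAMPLES_PER_MONTH = 30 * 24 * SAMPLES_PER_HOUR

seq_len = 24 * 4 * 4                          # 4 days of 15-minute samples = 384

train_end = 12 * SAMPLES_PER_MONTH            # first 12 months for training
val_end = train_end + 4 * SAMPLES_PER_MONTH   # next 4 months for validation
test_end = val_end + 4 * SAMPLES_PER_MONTH    # last 4 months for testing

# Validation and test windows start seq_len samples early so that the first
# forecast in each split still has a full input window of history.
border1s = [0, train_end - seq_len, val_end - seq_len]
border2s = [train_end, val_end, test_end]

print(border1s)  # [0, 34176, 45696]
print(border2s)  # [34560, 46080, 57600]
```

Each split i then covers rows border1s[i] up to border2s[i] of the data file.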

Please let me know

Thank you Niharika
opened by Niharikajo 0
save predictions
Hello Jacek,

I was going through the Query Selector paper and am trying to reproduce your results with the code. I have a few questions:

Where is the dataset split into train, validation, and test sets?

In train.py the model is trained and then validated. Where does testing happen?

If the prediction length is 24, we get 24 predictions after the last data point. How can we save the values predicted by the model, and in which part of the code does this happen?

Please do let me know

Thank you Regards Niharika Joshi
opened by Niharikajo 0
Memory explosion (not GPU memory)

Hello, your work is excellent. However, when I used the electricity dataset (https://github.com/laiguokun/multivariate-time-series-data), memory usage exploded. I don't know if there is a solution. With the ETTh dataset everything runs normally. Thank you.

opened by TruthK 1


