Multivariate Time Series Transformer, public version

Overview

Multivariate Time Series Transformer Framework

This code corresponds to the paper: George Zerveas et al. A Transformer-based Framework for Multivariate Time Series Representation Learning, in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '21), August 14-18, 2021. arXiv version: https://arxiv.org/abs/2010.02803

If you find this code or any of the ideas in the paper useful, please consider citing:

@inproceedings{10.1145/3447548.3467401,
author = {Zerveas, George and Jayaraman, Srideepika and Patel, Dhaval and Bhamidipaty, Anuradha and Eickhoff, Carsten},
title = {A Transformer-Based Framework for Multivariate Time Series Representation Learning},
year = {2021},
isbn = {9781450383325},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3447548.3467401},
doi = {10.1145/3447548.3467401},
booktitle = {Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining},
pages = {2114–2124},
numpages = {11},
keywords = {regression, framework, multivariate time series, classification, transformer, deep learning, self-supervised learning, unsupervised learning, imputation},
location = {Virtual Event, Singapore},
series = {KDD '21}
}

Setup

Instructions refer to Unix-based systems (e.g. Linux, macOS).

cd mvts_transformer/

Inside an already existing root directory, each experiment will create a time-stamped output directory, which contains model checkpoints, performance metrics per epoch, predictions per sample, the experiment configuration, log files etc. The following commands assume that you have created a new root directory inside the project directory like this: mkdir experiments.
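
Because every run gets its own time-stamped directory, locating the checkpoint to pass to --load_model can be tedious. The following is a minimal sketch of a lookup helper, not part of the repository. The name latest_best_checkpoint is hypothetical, and the directory naming it relies on (experiment name, then a timestamp such as 2022-07-19_10-27-28, then a short suffix, with checkpoints/model_best.pth inside) is an assumption inferred from paths that appear elsewhere in this document.

```python
import os
import re


def latest_best_checkpoint(root_dir, experiment_name):
    """Return the model_best.pth path of the newest run of `experiment_name`, or None.

    Assumption: run directories are named
    '<experiment_name>_<YYYY-MM-DD>_<HH-MM-SS>_<suffix>', so that sorting the
    names as text also sorts the runs chronologically.
    """
    run_pattern = re.compile(
        re.escape(experiment_name) + r"_\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}_\w+$"
    )
    runs = sorted(
        d for d in os.listdir(root_dir)
        if run_pattern.match(d) and os.path.isdir(os.path.join(root_dir, d))
    )
    # Walk from newest to oldest and return the first run that has a checkpoint.
    for run in reversed(runs):
        candidate = os.path.join(root_dir, run, "checkpoints", "model_best.pth")
        if os.path.isfile(candidate):
            return candidate
    return None
```

For example, latest_best_checkpoint("experiments", "BeijingPM25Quality_pretrained") would return the newest matching checkpoint path, or None if no such run has saved one.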

[We recommend creating and activating a conda or other Python virtual environment (e.g. virtualenv) to install packages and avoid conflicting package requirements; otherwise, running pip will require the --user flag or sudo privileges.]

pip install -r requirements.txt

[Note: Because newer versions of packages sometimes break backward compatibility with previous versions or with other packages, you can use failsafe_requirements.txt instead of requirements.txt to install the versions that have been tested to work with this codebase.]

Download dataset files and place them in separate directories, one for regression and one for classification.

Classification: http://www.timeseriesclassification.com/Downloads/Archives/Multivariate2018_ts.zip

Regression: https://zenodo.org/record/3902651#.YB5P0OpOm3s
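
After unpacking the archives, it can help to confirm that the directory you intend to pass as --data_dir really contains files for each split. The sketch below is not part of the repository: find_split_files is a hypothetical helper, and it assumes that the split name (e.g. TRAIN or TEST) appears in the file names and that the data files use the .ts extension. The repository's own loader may select files differently.

```python
import os
import re


def find_split_files(data_dir, pattern, extension=".ts"):
    """Return sorted paths in `data_dir` whose file name matches `pattern`.

    Hypothetical sanity check for a dataset directory. `pattern` is a regular
    expression searched for anywhere in the file name.
    """
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(f"Not a directory: {data_dir}")
    matches = []
    for name in os.listdir(data_dir):
        path = os.path.join(data_dir, name)
        # Keep regular files with the expected extension whose name matches.
        if os.path.isfile(path) and name.endswith(extension) and re.search(pattern, name):
            matches.append(path)
    return sorted(matches)
```

An empty result for "TRAIN" suggests that the path is wrong or that the archive was unpacked one directory level higher or lower than expected.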

Example commands

To see all command options with explanations, run: python src/main.py --help

You should replace $1 below with the name of the desired dataset. The commands shown here specify configurations intended for BeijingPM25Quality for regression and SpokenArabicDigits for classification.

[To obtain the best performance for other datasets, use the hyperparameters given in the Supplementary Material of the paper. Appropriate downsampling with the option --subsample_factor can often be used on datasets with longer time series to speed up training without significant performance degradation.]
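
As an illustration of what downsampling by an integer factor means, here is a minimal sketch. It assumes plain strided selection (keep every k-th time step, no averaging or filtering); the function name subsample is hypothetical and the repository's actual handling of --subsample_factor may differ.

```python
def subsample(series, factor):
    """Keep every `factor`-th time step of a sequence of time steps.

    Assumption: downsampling is plain strided selection with no smoothing
    or averaging; this is an illustration, not the repository's code.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return series[::factor]


# Each inner list is one time step holding two variables.
series = [[t, 10 * t] for t in range(10)]
shorter = subsample(series, 3)
print(len(series), "->", len(shorter))  # 10 -> 4
print(shorter)  # [[0, 0], [3, 30], [6, 60], [9, 90]]
```

Since the cost of standard self-attention grows quadratically with sequence length, even a small factor can shorten training noticeably on long series.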

The configurations as shown below will evaluate the model on the TEST set periodically during training, and at the end of training.

Besides the console output and the logfile output.log, you can monitor the evolution of performance (after installing tensorboard: pip install tensorboard) with:

tensorboard dev upload --name my_exp --logdir path/to/output_dir

Train models from scratch

Regression

(Note: the loss reported for regression is the Mean Square Error, i.e. without the Root)
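
To compare the reported loss with RMSE figures, take the square root of the MSE. A small worked example with made-up numbers (the values and the helper name mse are illustrative only):

```python
import math


def mse(targets, predictions):
    """Mean squared error over paired target/prediction values."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


targets = [3.0, 5.0, 7.0]
predictions = [2.0, 5.0, 9.0]

loss = mse(targets, predictions)  # (1 + 0 + 4) / 3
rmse = math.sqrt(loss)            # same units as the target variable
print(f"MSE = {loss:.4f}, RMSE = {rmse:.4f}")  # MSE = 1.6667, RMSE = 1.2910
```

The square root should be taken of the MSE over the whole evaluation set; averaging per-batch square roots generally gives a different number.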

python src/main.py --output_dir path/to/experiments --comment "regression from Scratch" --name $1_fromScratch_Regression --records_file Regression_records.xls --data_dir path/to/Datasets/Regression/$1/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam  --pos_encoding learnable --task regression

Classification

python src/main.py --output_dir experiments --comment "classification from Scratch" --name $1_fromScratch --records_file Classification_records.xls --data_dir path/to/Datasets/Classification/$1/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 400 --lr 0.001 --optimizer RAdam  --pos_encoding learnable  --task classification  --key_metric accuracy

Pre-train models (unsupervised learning through input masking)

Can be used for any downstream task, e.g. regression, classification, imputation.

Make sure that the network architecture parameters of the pretrained model match the parameters of the desired fine-tuned model (e.g. use --d_model 64 for SpokenArabicDigits).

python src/main.py --output_dir experiments --comment "pretraining through imputation" --name $1_pretrained --records_file Imputation_records.xls --data_dir /path/to/$1/ --data_class tsra --pattern TRAIN --val_ratio 0.2 --epochs 700 --lr 0.001 --optimizer RAdam --batch_size 32 --pos_encoding learnable --d_model 128
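
The input-masking idea behind this pre-training step can be sketched in a few lines. The code below is an illustration, not the repository's implementation: the names geometric_keep_mask and mask_series are hypothetical, and the defaults (mean masked-segment length 3, masking ratio 0.15) are assumptions chosen to follow the values described in the paper. Each variable is masked independently with runs of consecutive masked positions, masked values are set to 0, and the model would then be trained to reconstruct the values at those positions.

```python
import random


def geometric_keep_mask(length, mean_masked_len=3, masking_ratio=0.15, rng=None):
    """Return a list of bools for one variable: True = keep, False = masked.

    A two-state Markov chain alternates between masked and unmasked runs.
    Masked runs have geometric length with mean `mean_masked_len`; the
    probability of leaving the unmasked state is chosen so that, in the
    long run, about `masking_ratio` of the positions are masked.
    """
    if rng is None:
        rng = random.Random()
    p_leave_masked = 1.0 / mean_masked_len
    p_leave_unmasked = p_leave_masked * masking_ratio / (1.0 - masking_ratio)
    p_leave = [p_leave_masked, p_leave_unmasked]  # index: 0 = masked, 1 = kept

    state = int(rng.random() > masking_ratio)  # start masked with prob. masking_ratio
    mask = []
    for _ in range(length):
        mask.append(bool(state))
        if rng.random() < p_leave[state]:
            state = 1 - state
    return mask


def mask_series(series, rng=None, **mask_kwargs):
    """Mask each variable (column) of `series` independently.

    `series` is a list of time steps, each a list of variable values.
    Returns (masked_series, keep_mask); masked values are replaced by 0.0.
    """
    if rng is None:
        rng = random.Random()
    n_steps, n_vars = len(series), len(series[0])
    columns = [geometric_keep_mask(n_steps, rng=rng, **mask_kwargs) for _ in range(n_vars)]
    keep = [[columns[v][t] for v in range(n_vars)] for t in range(n_steps)]
    masked = [
        [x if k else 0.0 for x, k in zip(row, keep_row)]
        for row, keep_row in zip(series, keep)
    ]
    return masked, keep


rng = random.Random(0)  # fixed seed so the example is repeatable
series = [[float(t), float(-t)] for t in range(1, 13)]
masked, keep = mask_series(series, rng=rng)
for row, keep_row in zip(masked, keep):
    print(row, keep_row)
```

In such a scheme the reconstruction loss would be computed only over positions where keep is False, so the model is not rewarded for copying visible inputs.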

Fine-tune pretrained models

Make sure that network architecture parameters (e.g. d_model) used to fine-tune a model match the pretrained model.
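
A quick way to catch a mismatch before launching a fine-tuning run is to compare the two sets of options. This is only a sketch: architecture_mismatches is a hypothetical helper, the dictionaries stand in for however you record the options of each run, and the key list is an assumption based on the command-line flags used in this document.

```python
# Option names taken from the command-line flags shown in this document.
ARCHITECTURE_KEYS = ("d_model", "num_heads", "num_layers", "dim_feedforward", "pos_encoding")


def architecture_mismatches(pretrain_cfg, finetune_cfg, keys=ARCHITECTURE_KEYS):
    """Return {key: (pretrain_value, finetune_value)} for every key that differs."""
    return {
        key: (pretrain_cfg.get(key), finetune_cfg.get(key))
        for key in keys
        if pretrain_cfg.get(key) != finetune_cfg.get(key)
    }


# Illustrative values only.
pretrain_cfg = {"d_model": 128, "num_heads": 8, "num_layers": 3,
                "dim_feedforward": 256, "pos_encoding": "learnable"}
finetune_cfg = {"d_model": 64, "num_heads": 8, "num_layers": 3,
                "dim_feedforward": 256, "pos_encoding": "learnable"}

print(architecture_mismatches(pretrain_cfg, finetune_cfg))  # {'d_model': (128, 64)}
```

An empty dictionary means the listed options agree; any entry points to a flag that should be changed in one of the two commands.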

Regression

python src/main.py --output_dir experiments --comment "finetune for regression" --name BeijingPM25Quality_finetuned --records_file Regression_records.xls --data_dir /path/to/Datasets/Regression/BeijingPM25Quality/ --data_class tsra --pattern TRAIN --val_pattern TEST  --epochs 200 --lr 0.001 --optimizer RAdam --pos_encoding learnable --d_model 128 --load_model path/to/BeijingPM25Quality_pretrained/checkpoints/model_best.pth --task regression --change_output --batch_size 128

Classification

python src/main.py --output_dir experiments --comment "finetune for classification" --name SpokenArabicDigits_finetuned --records_file Classification_records.xls --data_dir /path/to/Datasets/Classification/SpokenArabicDigits/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam --batch_size 128 --pos_encoding learnable --d_model 64 --load_model path/to/SpokenArabicDigits_pretrained/checkpoints/model_best.pth --task classification --change_output --key_metric accuracy
Comments
  • How to perform mask as Figure.1?

    Hello, George

    Thank you very much for your code. Could you tell me how to perform the masking shown in Figure 1? I found many functions in dataset.py, but it is difficult for me to use them without any guidance. Could you provide some more details?

    Thanks a lot.

    opened by xiqxin1 9
  • repository directory

    Hi George, thanks for your open-source code. It is very clear and organized.

    But I am new to shell scripts. Could you please give a directory tree of the entire repository? That would be very helpful for understanding the architecture. I am confused about where I should put the downloaded data and where I should create the experiments folder. Currently, I am trying the following tree:

    • experiments
    • src
      - datasets
      - models
      - regression
      - utils
      - main.py
      - optimizers.py
      - options.py
      - running.py

    After cd mvts_transformer, I run python src/main.py --output_dir experiments --comment "regression from Scratch" --name FloodModeling1_fromScratch_Regression --records_file Regression_records.xls --data_dir Datasets/Regression/FloodModeling1/ --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam --pos_encoding learnable --task regression, but it shows No files found using: Datasets/Regression/FloodModeling1/*.

    opened by JimengShi 7
  • Need your suggestions, Thanks

    Hi, I have read the paper (A Transformer-based Framework for Multivariate Time Series Representation Learning); it is very meaningful work. I want to use this project for a prediction task. My task is a regression problem, and my dataset can be described as follows:

    X = [ [[time series sequence 1]], [[time series sequence 2]], ..., [[time series sequence s]] ]
    y = [ [[label 1]], [[label 2]], ..., [[label s]] ]

    where the sequences are not all the same length. So I want to use just your model definition (in ts_transformer.py: https://github.com/gzerveas/mvts_transformer/blob/master/src/models/ts_transformer.py) and pad all sequences in my dataset to the same length before feeding them to the TST model.

    Is there anything I need to pay attention to in doing this? Or do you have other suggestions?

    Thanks.

    opened by chuzheng88 7
  • Something wrong when training from scratch

    Hello

    I want to pre-train the model from the very beginning using your data. I've tried your classification command, but many bugs occur.

    Does anyone successfully run the code from scratch?

    BTW, how to perform mask in the code?

    opened by xiqxin1 2
  • forecaster.predict results

    I've launched one epoch of training on the toy2 dataset and modified the train.py code to call forecaster.predict twice on the first test data sample:

        xc, yc, xt, _ = test_samples
        yt_pred1 = forecaster.predict(xc, yc, xt)
        print("yt_pred 1[0][0]:")
        print(yt_pred1[0][0])
        yt_pred2 = forecaster.predict(xc, yc, xt)
        print("yt_pred 2[0][0]:")
        print(yt_pred2[0][0])

    But I'm getting two different prediction results with the same input (same xc, yc and xt):

        yt_pred 1[0][0]:
        tensor([ 0.2833, 0.2584, 0.3955, 0.1239, 0.1491, -0.2220, 0.3673, 0.1451, 0.0191, 0.0947, 0.4993, -0.2045, 0.2724, 0.0498, 0.0839, 0.2188, 0.0291, -0.0505, 0.2537, 0.2825])
        yt_pred 2[0][0]:
        tensor([-0.0851, 0.1524, 0.1037, -0.0464, -0.1989, 0.0934, 0.0636, 0.0913, 0.2973, 0.0513, 0.3559, 0.1850, 0.1016, 0.1844, 0.5109, 0.0665, 0.2945, 0.3052, 0.3375, 0.1235])

    Why two different predictions? What am I missing?

    opened by luisfdz-gmt 2
  • One pre-trained model for all datasets or one for each?

    Hi @gzerveas , when I read your paper, I thought you pre-trained ONE model based on all datasets and fine-tuned on each dataset for the specific classification/regression task. However, based on the README in this repository, it seems that each dataset has a pre-trained model. Which method did you use in your experiments?

    opened by Roxyi 2
  • Question on multiplying linear projection by sqrt(d_model)

    Hello! Thank you for the great paper and for sharing your implementation! I have a quick question. I'm wondering why the linear projection is multiplied by the constant, square root of self.d_model, while this is not mentioned in the paper and not shown in other implementations (I don't think).

    This line:

    https://github.com/gzerveas/mvts_transformer/blob/fe3b539ccc2162f55cf7196c8edc7b46b41e7267/src/models/ts_transformer.py#L299

    Just curious, thank you!

    opened by evanatyourservice 2
  • what's the right way to calculate predicted value and extract the representation?

    Hi George, great project. This really opens a new way for MTS analysis. I have run your code; everything looks good and runs very fast. However, when I tried to use the model to compute a prediction, the value did not match the results in the "predictions" folder. What I did was load the checkpoint of the best model, feed the first row of targets.npy from the "predictions" folder into the model, and compare the output with the first row of predictions.npy, using padding_masks = torch.BoolTensor([[1]]). My result does not match your prediction output. Could you share some ideas on how to do this on a record-by-record basis?

    Another question is how to extract the encoder output. For example, I have PM25 data as a 24 by 9 dataframe; how do I get the corresponding 24 by 128 representation dataframe? Once this step is done, I could use the new dataframe as input for other models and save time by not computing these values on the fly. Thank you in advance.

    opened by superfan123 1
  • Test result in multivariate dataset without pretrain

    Hi,

    I'm trying to study your code on a multivariate classification dataset without pre-training, and I chose Handwriting as an example.

    In order to reproduce the paper's performance, I used the hyperparameters shown in your paper.

    So I trained the model with the command below.

    python src/main.py --output_dir experiments --comment "classification from Scratch" --name HW --records_file Classification_records.xls --data_dir data/Multivariate_ts/Handwriting --data_class tsra --pattern TRAIN --epochs 400 --lr 0.001 --optimizer RAdam --pos_encoding learnable --task classification --key_metric accuracy --val_ratio 0.2 --num_layers 3 --num_heads 16 --d_model 128 --dim_feedforward 256 --batch_size 128

    Then I tested the model with the command below.

    python src/main.py --output_dir experiments --comment "classification from Scratch" --name HW --records_file Classification_records.xls --data_dir data/Multivariate_ts/Handwriting --data_class tsra --pattern TRAIN --epochs 400 --lr 0.001 --optimizer RAdam --pos_encoding learnable --task classification --key_metric accuracy --val_ratio 0 --num_layers 3 --num_heads 16 --d_model 128 --dim_feedforward 256 --batch_size 128 --test_pattern TEST --test_only testset --load_model experiments/HW_2022-07-27_20-01-05_axV/checkpoints/model_best.pth

    I thought I used the same data split, the same model and the same hyperparameters, but in the end the accuracy is 0.25882352941176473, which differs from the 0.3 in the paper. Is there any step I missed?

    opened by Mingzhe-Han 1
  • very long time series, performance drop

    implied in the text, who combine the triplet loss with a deep causal CNN with dilation, in order to make the method effective for very long time series. Is there any relevant code to refer to? This method does not seem to work well on a time series dataset with a sequence length of 5000 and 2 features.

    opened by aappaappoo 1
  • failsafe_requirements.txt missing

    Hello, the file failsafe_requirements.txt mentioned in README does not seem to be present in the repo?

    When installing the package from a new conda environment, I just had to downgrade python to 3.8 to avoid sktime installation issues, and it now seems to work.

    opened by mariomorvan 1
  • EEG Classification

    I am implementing your paper for EEG classification. The EEG data is of dimension 19x120000 where 19 is the number of electrodes and 120000 are the time points. I would like to understand how this dataset can be added to the code for the implementation.

    opened by nehagour 0
  • question about my dataset

    Thank you for sharing your code, there are some parts of your code that I don't understand because of my limited ability, could you help me?

    1. The scenario you are modelling is a direct segmentation of the data, but the scenario I am dealing with uses a sliding window to dynamically traverse the data. Where should I make changes if I want to load my own data?
    2. I still don't quite understand the difference between ImputationDataset and TransductionDataset; it seems that there is no difference between classification and regression tasks in the unsupervised learning phase.

    Looking forward to your reply, thanks for your help.
    opened by SSSUNSHINNING 4
  • Learning loss problem & predict procedure

    Hi George, when I perform "train models from scratch" on my dataset, the RMSE loss turns to NaN. I want to check the normalizer for my dataset, but I am not sure how. If normalization is performed, is it the normalized value that is predicted, even in the masked sections?

    Also, I want to inspect the structure of the fine-tuning model, but I cannot find it. Can you tell me how? I also wonder whether the mask is applied to the test set during fine-tuning.

    Thank you

    opened by techzzt 6
  • Problem running Test_only mode

    Hi George, really like the project! I have been trying it out for a couple of weeks now, training multiple models, including some on my own datasets. However, while training works without any problems, I have not been able to get the test_only mode running. I keep getting this error:

        per_batch['predictions'].append(predictions.cpu().numpy())
        RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

    I have used the following commands: Training: python src/main.py --output_dir .\experiments --comment "regression from Scratch" --name custom_regression --records_file Regression_records.xls --data_dir ..\Datasets\CUSTOM --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam --pos_encoding learnable --task regression

    Testing (not working): python src/main.py --output_dir .\experiments --comment "regression from Scratch" --name Custom_regression --records_file Regression_records.xls --data_dir ..\Datasets\CUSTOM --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam --pos_encoding learnable --task regression --test_pattern TEST --test_only testset --load_model ./experiments/custom_regression_2022-10-20_17-05-04_MjH/checkpoints/model_best.pth

    I have also tried the exact commands mentioned in this issue, which seem to work for the user who opened that issue, yet I still get the same error.

    I have tested with both Python 3.7 and 3.8, with the normal requirements.txt as well as the failsafe_requirements.txt (using Anaconda).

    At this point I am unsure what I am doing wrong and what else to try to get the test_only mode working.

    opened by stasj145 2
  • Very high loss when finetuning

    Dear Author,

    I am running your commands and find that the pretraining process seems good while the finetuning is weird. The pretraining loss is just 0.140160306.

    The commands I run are

    CUDA_VISIBLE_DEVICES=4 python src/main.py --output_dir experiments --comment "pretraining through imputation" --name BeijingPM25Quality_pretrained --records_file Imputation_records.xls --data_dir BeijingPM25Quality --data_class tsra --pattern TRAIN --val_ratio 0.2 --epochs 700 --lr 0.001 --optimizer RAdam --batch_size 32 --pos_encoding learnable --d_model 128

    CUDA_VISIBLE_DEVICES=1 python src/main.py --output_dir experiments --comment "finetune for regression" --name BeijingPM25Quality_finetuned --records_file Regression_records.xls --data_dir BeijingPM25Quality --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 200 --lr 0.001 --optimizer RAdam --pos_encoding learnable --d_model 128 --load_model /home/xzhoubi/paperreading/mvts_transformer/experiments/BeijingPM25Quality_pretrained_2022-07-19_10-27-28_tlB/checkpoints/model_best.pth --task regression --change_output --batch_size 128

    Can you please help check it?

    2022-07-19 17:42:53,244 | INFO : Epoch 85 Training Summary: epoch: 85.000000 | loss: 1024.587302 | 
    2022-07-19 17:42:53,244 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 4.77277946472168 seconds
    
    2022-07-19 17:42:53,244 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 4.6006609103258915 seconds
    2022-07-19 17:42:53,245 | INFO : Avg batch train. time: 0.048943201173679694 seconds
    2022-07-19 17:42:53,245 | INFO : Avg sample train. time: 0.00038602625527151295 seconds
    Training Epoch:  42%|████████████████████▊                            | 85/200 [07:40<10:05,  5.26s/it]
    Training Epoch 86   0.0% | batch:         0 of        94 |       loss: 566.886
    Training Epoch 86   1.1% | batch:         1 of        94        |       loss: 686.58
    Training Epoch 86   2.1% | batch:         2 of        94        |       loss: 1297.63
    Training Epoch 86   3.2% | batch:         3 of        94        |       loss: 976.956
    Training Epoch 86   4.3% | batch:         4 of        94        |       loss: 565.19
    Training Epoch 86   5.3% | batch:         5 of        94        |       loss: 809.262
    Training Epoch 86   6.4% | batch:         6 of        94        |       loss: 1095.96
    Training Epoch 86   7.4% | batch:         7 of        94        |       loss: 1047.49
    Training Epoch 86   8.5% | batch:         8 of        94        |       loss: 782.682
    Training Epoch 86   9.6% | batch:         9 of        94        |       loss: 697.767
    Training Epoch 86  10.6% | batch:        10 of        94        |       loss: 900.141
    Training Epoch 86  11.7% | batch:        11 of        94        |       loss: 919.351
    Training Epoch 86  12.8% | batch:        12 of        94        |       loss: 782.872
    Training Epoch 86  13.8% | batch:        13 of        94        |       loss: 1082.41
    Training Epoch 86  14.9% | batch:        14 of        94        |       loss: 1004.29
    Training Epoch 86  16.0% | batch:        15 of        94        |       loss: 960.513
    Training Epoch 86  17.0% | batch:        16 of        94        |       loss: 776.499
    Training Epoch 86  18.1% | batch:        17 of        94        |       loss: 995.985
    Training Epoch 86  19.1% | batch:        18 of        94        |       loss: 655.607
    Training Epoch 86  20.2% | batch:        19 of        94        |       loss: 733.846
    Training Epoch 86  21.3% | batch:        20 of        94        |       loss: 1190.87
    Training Epoch 86  22.3% | batch:        21 of        94        |       loss: 698.143
    Training Epoch 86  23.4% | batch:        22 of        94        |       loss: 992.943
    Training Epoch 86  24.5% | batch:        23 of        94        |       loss: 1017.47
    Training Epoch 86  25.5% | batch:        24 of        94        |       loss: 696.403
    Training Epoch 86  26.6% | batch:        25 of        94        |       loss: 822.942
    Training Epoch 86  27.7% | batch:        26 of        94        |       loss: 935.869
    Training Epoch 86  28.7% | batch:        27 of        94        |       loss: 1040.06
    Training Epoch 86  29.8% | batch:        28 of        94        |       loss: 904.523
    Training Epoch 86  30.9% | batch:        29 of        94        |       loss: 882.923
    Training Epoch 86  31.9% | batch:        30 of        94        |       loss: 805.928
    Training Epoch 86  33.0% | batch:        31 of        94        |       loss: 803.492
    Training Epoch 86  34.0% | batch:        32 of        94        |       loss: 1720.69
    Training Epoch 86  35.1% | batch:        33 of        94        |       loss: 778.216
    Training Epoch 86  36.2% | batch:        34 of        94        |       loss: 729.644
    Training Epoch 86  37.2% | batch:        35 of        94        |       loss: 1233.58
    Training Epoch 86  38.3% | batch:        36 of        94        |       loss: 960.826
    Training Epoch 86  39.4% | batch:        37 of        94        |       loss: 986.129
    Training Epoch 86  40.4% | batch:        38 of        94        |       loss: 1316.68
    Training Epoch 86  41.5% | batch:        39 of        94        |       loss: 1351.79
    Training Epoch 86  42.6% | batch:        40 of        94        |       loss: 1661.48
    Training Epoch 86  43.6% | batch:        41 of        94        |       loss: 956.305
    Training Epoch 86  44.7% | batch:        42 of        94        |       loss: 1017.96
    Training Epoch 86  45.7% | batch:        43 of        94        |       loss: 851.958
    Training Epoch 86  46.8% | batch:        44 of        94        |       loss: 816.494
    Training Epoch 86  47.9% | batch:        45 of        94        |       loss: 603.491
    Training Epoch 86  48.9% | batch:        46 of        94        |       loss: 710.572
    Training Epoch 86  50.0% | batch:        47 of        94        |       loss: 1318.47
    Training Epoch 86  51.1% | batch:        48 of        94        |       loss: 905.094
    Training Epoch 86  52.1% | batch:        49 of        94        |       loss: 662.117
    Training Epoch 86  53.2% | batch:        50 of        94        |       loss: 850.853
    Training Epoch 86  54.3% | batch:        51 of        94        |       loss: 1007.81
    Training Epoch 86  55.3% | batch:        52 of        94        |       loss: 1236.99
    Training Epoch 86  56.4% | batch:        53 of        94        |       loss: 809.194
    Training Epoch 86  57.4% | batch:        54 of        94        |       loss: 1075.82
    Training Epoch 86  58.5% | batch:        55 of        94        |       loss: 859.909
    Training Epoch 86  59.6% | batch:        56 of        94        |       loss: 739.112
    Training Epoch 86  60.6% | batch:        57 of        94        |       loss: 992.518
    Training Epoch 86  61.7% | batch:        58 of        94        |       loss: 953.861
    Training Epoch 86  62.8% | batch:        59 of        94        |       loss: 881.18
    Training Epoch 86  63.8% | batch:        60 of        94        |       loss: 878.613
    Training Epoch 86  64.9% | batch:        61 of        94        |       loss: 1006.92
    Training Epoch 86  66.0% | batch:        62 of        94        |       loss: 728.144
    Training Epoch 86  67.0% | batch:        63 of        94        |       loss: 865.157
    Training Epoch 86  68.1% | batch:        64 of        94        |       loss: 895.809
    Training Epoch 86  69.1% | batch:        65 of        94        |       loss: 616.984
    Training Epoch 86  70.2% | batch:        66 of        94        |       loss: 893.007
    Training Epoch 86  71.3% | batch:        67 of        94        |       loss: 859.431
    Training Epoch 86  72.3% | batch:        68 of        94        |       loss: 1648.19
    Training Epoch 86  73.4% | batch:        69 of        94        |       loss: 657.725
    Training Epoch 86  74.5% | batch:        70 of        94        |       loss: 960.164
    Training Epoch 86  75.5% | batch:        71 of        94        |       loss: 666.139
    Training Epoch 86  76.6% | batch:        72 of        94        |       loss: 3079.8
    Training Epoch 86  77.7% | batch:        73 of        94        |       loss: 802.407
    Training Epoch 86  78.7% | batch:        74 of        94        |       loss: 1103.64
    Training Epoch 86  79.8% | batch:        75 of        94        |       loss: 1029.07
    Training Epoch 86  80.9% | batch:        76 of        94        |       loss: 1488.64
    Training Epoch 86  81.9% | batch:        77 of        94        |       loss: 924.513
    Training Epoch 86  83.0% | batch:        78 of        94        |       loss: 909.587
    Training Epoch 86  84.0% | batch:        79 of        94        |       loss: 862.864
    Training Epoch 86  85.1% | batch:        80 of        94        |       loss: 607.052
    Training Epoch 86  86.2% | batch:        81 of        94        |       loss: 967.5
    Training Epoch 86  87.2% | batch:        82 of        94        |       loss: 942.684
    Training Epoch 86  88.3% | batch:        83 of        94        |       loss: 1217.01
    Training Epoch 86  89.4% | batch:        84 of        94        |       loss: 685.092
    Training Epoch 86  90.4% | batch:        85 of        94        |       loss: 949.638
    Training Epoch 86  91.5% | batch:        86 of        94        |       loss: 737.985
    Training Epoch 86  92.6% | batch:        87 of        94        |       loss: 1085.89
    Training Epoch 86  93.6% | batch:        88 of        94        |       loss: 936.676
    Training Epoch 86  94.7% | batch:        89 of        94        |       loss: 1203.51
    Training Epoch 86  95.7% | batch:        90 of        94        |       loss: 677.801
    Training Epoch 86  96.8% | batch:        91 of        94        |       loss: 2214.77
    Training Epoch 86  97.9% | batch:        92 of        94        |       loss: 1357.56
    Training Epoch 86  98.9% | batch:        93 of        94        |       loss: 1019.23
    
    2022-07-19 17:42:57,306 | INFO : Epoch 86 Training Summary: epoch: 86.000000 | loss: 974.012262 | 
    2022-07-19 17:42:57,307 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 3.9919965267181396 seconds
    
    2022-07-19 17:42:57,307 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 4.593583417493243 seconds
    2022-07-19 17:42:57,307 | INFO : Avg batch train. time: 0.04886790869673663 seconds
    2022-07-19 17:42:57,307 | INFO : Avg sample train. time: 0.00038543240623370055 seconds
    2022-07-19 17:42:57,307 | INFO : Evaluating on validation set ...
    Evaluating Epoch 86   0.0% | batch:         0 of        40      |       loss: 7538.28
    Evaluating Epoch 86   2.5% | batch:         1 of        40      |       loss: 1100.53
    Evaluating Epoch 86   5.0% | batch:         2 of        40      |       loss: 2441.92
    Evaluating Epoch 86   7.5% | batch:         3 of        40      |       loss: 7944.98
    Evaluating Epoch 86  10.0% | batch:         4 of        40      |       loss: 2934.04
    Evaluating Epoch 86  12.5% | batch:         5 of        40      |       loss: 2394.65
    Evaluating Epoch 86  15.0% | batch:         6 of        40      |       loss: 8225.28
    Evaluating Epoch 86  17.5% | batch:         7 of        40      |       loss: 3071.4
    Evaluating Epoch 86  20.0% | batch:         8 of        40      |       loss: 3004.23
    Evaluating Epoch 86  22.5% | batch:         9 of        40      |       loss: 2549.05
    Evaluating Epoch 86  25.0% | batch:        10 of        40      |       loss: 5039.37
    Evaluating Epoch 86  27.5% | batch:        11 of        40      |       loss: 1271.33
    Evaluating Epoch 86  30.0% | batch:        12 of        40      |       loss: 7026.6
    Evaluating Epoch 86  32.5% | batch:        13 of        40      |       loss: 4039.62
    Evaluating Epoch 86  35.0% | batch:        14 of        40      |       loss: 1919.55
    Evaluating Epoch 86  37.5% | batch:        15 of        40      |       loss: 3505.34
    Evaluating Epoch 86  40.0% | batch:        16 of        40      |       loss: 5214.82
    Evaluating Epoch 86  42.5% | batch:        17 of        40      |       loss: 2959.36
    Evaluating Epoch 86  45.0% | batch:        18 of        40      |       loss: 2551.97
    Evaluating Epoch 86  47.5% | batch:        19 of        40      |       loss: 6823
    Evaluating Epoch 86  50.0% | batch:        20 of        40      |       loss: 4544.8
    Evaluating Epoch 86  52.5% | batch:        21 of        40      |       loss: 1190.93
    Evaluating Epoch 86  55.0% | batch:        22 of        40      |       loss: 3702.28
    Evaluating Epoch 86  57.5% | batch:        23 of        40      |       loss: 3874.76
    Evaluating Epoch 86  60.0% | batch:        24 of        40      |       loss: 1572.05
    Evaluating Epoch 86  62.5% | batch:        25 of        40      |       loss: 3755.92
    Evaluating Epoch 86  65.0% | batch:        26 of        40      |       loss: 10556.1
    Evaluating Epoch 86  67.5% | batch:        27 of        40      |       loss: 3082.73
    Evaluating Epoch 86  70.0% | batch:        28 of        40      |       loss: 1867.05
    Evaluating Epoch 86  72.5% | batch:        29 of        40      |       loss: 10148.6
    Evaluating Epoch 86  75.0% | batch:        30 of        40      |       loss: 1724.54
    Evaluating Epoch 86  77.5% | batch:        31 of        40      |       loss: 1341.73
    Evaluating Epoch 86  80.0% | batch:        32 of        40      |       loss: 7704.38
    Evaluating Epoch 86  82.5% | batch:        33 of        40      |       loss: 7095.86
    Evaluating Epoch 86  85.0% | batch:        34 of        40      |       loss: 1109.71
    Evaluating Epoch 86  87.5% | batch:        35 of        40      |       loss: 5296.75
    Evaluating Epoch 86  90.0% | batch:        36 of        40      |       loss: 6882.2
    Evaluating Epoch 86  92.5% | batch:        37 of        40      |       loss: 2588.44
    Evaluating Epoch 86  95.0% | batch:        38 of        40      |       loss: 3639.52
    Evaluating Epoch 86  97.5% | batch:        39 of        40      |       loss: 11084.6
    2022-07-19 17:42:58,800 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.4921939373016357 seconds
    
    2022-07-19 17:42:58,800 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.4729443497127956 seconds
    2022-07-19 17:42:58,800 | INFO : Avg batch val. time: 0.03682360874281989 seconds
    2022-07-19 17:42:58,800 | INFO : Avg sample val. time: 0.0002917877079462749 seconds
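The averages reported in the log above are consistent with one another, which a little arithmetic confirms. The sketch below copies the values from the log. The number of validation samples is not printed in this excerpt, so the value derived here is an inference, not a logged fact.

```python
import math

# Values copied from the log lines above (all in seconds).
avg_val_time = 1.4729443497127956         # "Avg val. time"
avg_batch_time = 0.03682360874281989      # "Avg batch val. time"
avg_sample_time = 0.0002917877079462749   # "Avg sample val. time"
num_batches = 40                          # from "batch: N of 40" in the progress lines

# Per-batch average should equal the total average divided by the batch count.
batch_check = avg_val_time / num_batches

# Assumption: per-sample average = total average / number of samples,
# so the sample count can be recovered by division.
implied_samples = round(avg_val_time / avg_sample_time)

print(math.isclose(batch_check, avg_batch_time, rel_tol=1e-9))
print(implied_samples)
```

Under that assumption, the figures imply a validation set of about 5048 samples evaluated in 40 batches.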
    
    opened by x-zho14 1
  • Sparse, Binary Data: Interpolate Missing

    I am running into an issue related to the type of data I am using. I built a new data class that preprocesses the data into the same dataframe format and indexing as the provided examples: the sequences of all samples are appended into a single dataframe indexed by sample number, with each row a time step and each column a feature. However, my data is extremely sparse and binary, with many NaNs and few 1s. I noticed that your data.py has a function called interpolate_missing, which I am running on my sparse dataframe. It replaces the NaNs with ones, creating a univariate DF. I'm happy to write my own function that simply replaces the NaNs with 0s, but I am worried that binary data might not work well with this type of model. Could you please provide some intuition or guidance here?
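The behavior described above can be reproduced on a toy frame. This is a minimal sketch that assumes the interpolation step is a linear interpolation applied in both directions. That is an assumption about what interpolate_missing does, not a copy of the repository's code, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy sparse binary data: NaN stands for "no event", 1.0 for "event".
df = pd.DataFrame({
    "feat_a": [np.nan, 1.0, np.nan, np.nan, 1.0, np.nan],
    "feat_b": [np.nan, np.nan, np.nan, 1.0, np.nan, np.nan],
})

# Assumed interpolation behavior: linear, filling leading and trailing gaps too.
# Because the only observed value is 1.0, every gap is filled with 1.0.
interpolated = df.interpolate(method="linear", limit_direction="both")

# Alternative: treat a missing value as the absence of an event.
zero_filled = df.fillna(0.0)

print(interpolated)
print(zero_filled)
```

With only 1s observed, interpolation has nothing else to interpolate toward, so each column becomes constant and loses all information. Filling with 0 keeps the positions of the events.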

    Also, I am running this as a supervised regression task to predict the (discretized) end-time of the sequence. My current strategy is to simply provide a label_df with the numerical discretized end-time for each sample, but I know there are other ways to label for this task. Do you have any intuition on whether this strategy will be effective, or should I try something else?
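The labeling strategy described above, one numerical target per sample, could look like the following sketch. All names here (feature_df, end_times, the end_time column) are hypothetical and chosen for illustration. The only layout assumed is the one described in the post: features indexed by sample number with one row per time step.

```python
import pandas as pd

# Hypothetical features: index = sample ID, one row per time step.
feature_df = pd.DataFrame(
    {"feat_a": [0.0, 1.0, 0.0, 1.0, 0.0], "feat_b": [0.0, 0.0, 1.0, 0.0, 0.0]},
    index=[0, 0, 0, 1, 1],
)

# Hypothetical discretized end-times, one value per sample ID.
end_times = {0: 3, 1: 2}

# One row per sample, stored as float for use as a regression target.
labels_df = pd.DataFrame({"end_time": pd.Series(end_times, dtype="float32")})

# Align the label rows with the order of sample IDs in the feature frame.
sample_ids = feature_df.index.unique()
labels_df = labels_df.loc[sample_ids]

print(labels_df)
```

Keeping the label frame indexed by the same sample IDs as the features makes it easy to verify that every sample has exactly one target.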

    Thanks,

    Ian

    opened by ianhill60 4