A universal framework for learning timestamp-level representations of time series

Overview

TS2Vec

This repository contains the official implementation for the paper Learning Timestamp-Level Representations for Time Series with Hierarchical Contrastive Loss.

Requirements

The recommended requirements for TS2Vec are specified as follows:

  • Python 3.8
  • scipy==1.6.1
  • torch==1.8.1
  • numpy==1.19.2
  • pandas==1.0.1
  • scikit_learn==0.24.1

The dependencies can be installed by:

pip install -r requirements.txt

Data

The datasets can be obtained and put into the datasets/ folder as follows:

  • 128 UCR datasets should be put into datasets/UCR/ so that each data file can be located by datasets/UCR/<dataset_name>/<dataset_name>_*.csv.
  • 30 UEA datasets should be put into datasets/UEA/ so that each data file can be located by datasets/UEA/<dataset_name>/<dataset_name>_*.arff.
  • 3 ETT datasets should be placed at datasets/ETTh1.csv, datasets/ETTh2.csv and datasets/ETTm1.csv.
  • Electricity dataset should be resampled into hourly data of 321 clients over the last 3 years and placed at datasets/electricity.csv.
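
A rough preprocessing sketch for the electricity data is given below (a minimal sketch only, assuming the raw LD2011_2014.txt file from the UCI archive, which is ';'-separated with ',' as the decimal mark; the exact client selection and resampling rule used in the paper may differ):

import pandas as pd

# Assumed raw format: 15-minute readings per client, ';'-separated, ',' as decimal mark
raw = pd.read_csv('LD2011_2014.txt', sep=';', decimal=',', index_col=0, parse_dates=True)
hourly = raw.resample('1H').sum()    # aggregate 15-minute readings into hourly values
hourly = hourly.loc['2012-01-01':]   # keep roughly the last 3 years
# Selecting the 321 clients mentioned above is not shown here
hourly.to_csv('datasets/electricity.csv')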

Usage

To train and evaluate TS2Vec on a dataset, run the following command:

python train.py <dataset_name> <run_name> --archive <archive> --batch-size <batch_size> --repr-dims <repr_dims> --gpu <gpu> --eval

The detailed descriptions of the arguments are as follows:

  • dataset_name: The dataset name.
  • run_name: The folder name used to save the model, outputs and evaluation metrics; it can be set to any word.
  • archive: The archive that the dataset belongs to; one of UCR, UEA, forecast_csv or forecast_csv_univar.
  • batch_size: The batch size (defaults to 8).
  • repr_dims: The representation dimensions (defaults to 320).
  • gpu: The GPU number used for training and inference (defaults to 0).
  • eval: Whether to perform evaluation after training.

(For descriptions of more arguments, run python train.py -h.)
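
For example, to train and evaluate on the UCR dataset ECG200 (the dataset and run names below are only illustrative), one could run:

python train.py ECG200 my_run --archive UCR --batch-size 8 --repr-dims 320 --gpu 0 --eval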

After training and evaluation, the trained encoder, output and evaluation metrics can be found in training/DatasetName__RunName_Date_Time/.

Scripts: The scripts for reproduction are provided in the scripts/ folder.

Code Example

from ts2vec import TS2Vec
import datautils

# Load the ECG200 dataset from UCR archive
train_data, train_labels, test_data, test_labels = datautils.load_UCR('ECG200')
# (Both train_data and test_data have a shape of n_instances x n_timestamps x n_features)

# Train a TS2Vec model
model = TS2Vec(
    input_dims=1,
    device=0,
    output_dims=320
)
loss_log = model.fit(
    train_data,
    verbose=True
)

# Compute timestamp-level representations for test set
test_repr = model.encode(test_data)  # n_instances x n_timestamps x output_dims

# Compute instance-level representations for test set
test_repr = model.encode(test_data, encoding_window='full_series')  # n_instances x output_dims

# Sliding inference for test set
test_repr = model.encode(
    test_data,
    casual=True,
    sliding_length=1,
    sliding_padding=50
)  # n_instances x n_timestamps x output_dims
# (The timestamp t's representation vector is computed using the observations located in [t-50+1, t])
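
The instance-level representations can be fed to any off-the-shelf classifier. Below is a minimal sketch continuing the example above, using a scikit-learn SVM purely for illustration (the repository's own evaluation code lives in the tasks/ folder):

from sklearn.svm import SVC

# Instance-level representations for train and test sets
train_repr = model.encode(train_data, encoding_window='full_series')  # n_instances x output_dims
test_repr = model.encode(test_data, encoding_window='full_series')

# Fit a simple classifier on top of the frozen representations
clf = SVC(C=1.0, kernel='rbf').fit(train_repr, train_labels)
print('Test accuracy:', clf.score(test_repr, test_labels))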
Comments
  • Questions about the encoder

    Questions about the encoder

    (attached screenshot: 微信图片_20220825151336) Hi, I'd like to ask about the dilated convolutions in the encoder.

    1. My understanding from the code is that the a1-b1 subsequence and the a2-b2 subsequence are fed into the encoder separately to obtain two subsequence representations, and contrastive learning is then applied to their overlapping segment a2~b1. Is that correct?
    2. Are the dilated convolutions in the encoder causal, i.e., does the representation at each point only use data at and before that timestamp?
    3. If they are causal, can I assume that when the a2-b2 subsequence is fed into the encoder, the representations learned for the a2-b1 segment do not use any information from the b1-b2 segment?

    Any clarification would be much appreciated, thanks a lot!

    opened by guobing21 11
  • How is the number of training epochs controlled?

    How is the number of training epochs controlled?

    Hi, congratulations on the SOTA results. Two small questions:

    1. I'm not very familiar with unsupervised learning. Under supervised learning I mainly detect over- and underfitting via early stopping on a validation set. Here the task is decoupled into feature extraction plus a discriminative network, so it seems I can only roughly judge from the shape of the loss curve of the feature-extraction stage, and that objective is not the objective of the overall task. How should I decide how many epochs to train for? Is there a risk of over- or underfitting?
    2. My task is time-series classification and all samples are labeled. Compared with maximizing the difference between a sequence and another randomly sampled sequence, would it be better to deliberately sample the negative from a different class?

    Thanks very much!

    opened by kpmokpmo 4
  • Simple sin wave results

    Simple sin wave results

    sinwave.csv I am using simple sine waves to test the algorithm: s1 is a long wave, s2 a medium wave, s3 a short wave, and s4 = s1 + s2 + s3. The model can predict s1/s2/s3 successfully, but for s4 it performs poorly, even compared to an LSTM. Could you share some insights on this? I've tried the default hyper-parameters and also tried tuning them, with no significant improvement. Thanks.

    opened by huangtarn 3
  • data shape, loading custom data, possible lookahead

    data shape, loading custom data, possible lookahead

    Hi, I am trying to test on my own datasets, which are multivariate time series. I load the data into a DataFrame and then create the slices for train, validation and test, just mimicking the existing code.

    There is a point where my n x m data, where m is the number of features (covariate time series), is expanded to 1 x n x m. The comments in your code say "number of instances x timestamps x features". What does "instances" mean in this context?

    I am worried that my results are perhaps too good to be true, and I am trying to make sure I understand where lookahead might occur.

    opened by gminorcoles 2
  • clarification on the sliding length and padding

    clarification on the sliding length and padding

    Thank you for your great contribution. I was unable to understand the difference in usage between the sliding length and the sliding padding. For example, if I wanted to utilize X days for a forecasting problem, what would the proper usage of the parameters be?

    Thank you in advance.

    sliding_length sliding_padding

    Note: I noticed on my dataset that using 24 >= sliding_length > 1 yields better results; however, for sliding_length > 24 a size-mismatch error occurs at evaluation. Increasing the padding had less impact than the length, so it would be great if you could clarify the proper usage.

    opened by m13ammed 2
  • where can I download 'electricity.csv'

    where can I download 'electricity.csv'

    Hi Zhihan,

    Thanks for the very useful repository. Could you point me to where I can download 'electricity.csv'? From the link in the README (Electricity dataset), I can only get a file named LD2011_2014.txt.zip, and I'm not sure how to convert it to 'electricity.csv'.

    opened by hehaodele 2
  • Question about padding in downstream tasks

    Question about padding in downstream tasks

    Hi, I have two questions about the code and hope you can clarify them, thanks!

    1. In the downstream forecasting task, padding is set to 200. Does this padding have a counterpart in pre-training, i.e., should padding also be applied during pre-training? If so, must the pre-training padding and the downstream padding be the same? How was the value 200 chosen? The electricity dataset you use is very long and padding is set to 200; if the data is not that long, say only 100 points, what would be a reasonable padding? padding = 200; t = time.time(); all_repr = model.encode(data, casual=True, sliding_length=1, sliding_padding=padding, batch_size=256)
    2. When generating the training data, why are the first padding samples dropped? train_features, train_labels = generate_pred_samples(train_repr, train_data, pred_len, drop=padding)

    Any clarification would be much appreciated, thanks!

    opened by guobing21 1
  • drop=padding in forecasting

    drop=padding in forecasting

    Hi,

    Is there any reason you set drop equal to the padding length for training in forecasting, but not for valid and test? This would train the forecasting function with complete history only.

    https://github.com/yuezhihan/ts2vec/blob/12a737e6561878452fffb68c81c98d24628f274a/tasks/forecasting.py#L46

    opened by opsuisppn 1
  • what is difference between n_instance and n_features?

    what is difference between n_instance and n_features?

    In ts2vec.py, fit() requires train_data of shape (n_instances, n_timestamps, n_features). What is the difference between n_instances and n_features? I think n_features means the number of variables (e.g., a univariate time series has n_features=1). Is n_instances the same as the window size, or something else? Thank you.

    opened by Haebuk 1
  • CUDA out of memory

    CUDA out of memory

    I run python3 train.py ETTm1 mytest --loader forecast_csv

    and got the following error. Could you please help me? Thanks.

    ##############################
    Dataset: ETTm1
    Arguments: Namespace(batch_size=8, dataset='ETTm1', epochs=None, eval=False, gpu=0, irregular=0, iters=None, loader='forecast_csv', lr=0.001, max_threads=None, max_train_length=3000, repr_dims=320, run_name='binh', save_every=None, seed=None)
    Loading data... done
    Traceback (most recent call last):
      File "train.py", line 120, in <module>
        loss_log = model.fit(
      File "/home/binh/experiments/ts2vec/ts2vec.py", line 137, in fit
        loss.backward()
      File "/home/binh/.local/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/binh/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: CUDA out of memory. Tried to allocate 380.00 MiB (GPU 0; 3.82 GiB total capacity; 1.80 GiB already allocated; 254.62 MiB free; 2.25 GiB reserved in total by PyTorch)

    opened by Thanh-Binh 1
  • Format of Yahoo dataset for pre-processing

    Format of Yahoo dataset for pre-processing

    Thank you so much for continuously open-sourcing your findings! I noticed that the downloaded data from Yahoo seems to be in a different format than the one required for preprocess_yahoo.py. Will it be possible for you to look into this? Thank you very much!

    Yahoo follows the format of A1/real_1.csv ... A2/synthetic_1.csv ...

    While the required format is path/1 ... path/367, which seems to contain dictionaries.

    opened by gorold 1
  • The dataset about ETT

    The dataset about ETT

    Hi, it is very nice work, but I have a question about the multivariate time series forecasting results. The GitHub repo https://github.com/zhouhaoyi/ETDataset only offers the ETT-small dataset, which is a univariate time series. I don't know how to use a multivariate ETT dataset to run this code. Thank you very much. Best wishes

    opened by fuyuyuputao 0
  • Questions about downstream tasks

    Questions about downstream tasks

    Hi, I'd like to ask about the downstream tasks.

    Compute timestamp-level representations for test set

    test_repr = model.encode(test_data) # n_instances x n_timestamps x output_dims

    Compute instance-level representations for test set

    test_repr = model.encode(test_data, encoding_window='full_series') # n_instances x output_dims

    1. Can n_instances x n_timestamps x output_dims be understood as batch_size * t * channels?

    2. Is there any performance difference between the timestamp-level and the instance-level representations on classification tasks? Also, I see the code raises the input dimension to 320, which seems different from traditional feature extraction as I understand it.

    opened by sunuo1997 0
  • Rounding error concerning max_train_length

    Rounding error concerning max_train_length

    Hi, I think there is a rounding error concerning the max_train_length

    https://github.com/yuezhihan/ts2vec/blob/631bd533aab3547d1310f4e02a20f3eb53de26be/ts2vec.py#L77-L80

    To crop the data into sequences whose length is at most <max_train_length>, the number of sections should be rounded up.

    For example, in the ETTh1 dataset, cropping the train slice of length 8640 with max_train_length = 201 results in 42 sections of length 206 instead of 43 sections of length 201 (see the numeric sketch after this comment).

    opened by RichardAffolter 0
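
A quick numeric check of the rounding described above (an illustration only, not the repository's code):

import math

n_timestamps, max_train_length = 8640, 201                  # values taken from the comment above
floor_sections = n_timestamps // max_train_length           # 42 -> sections of length ~206
ceil_sections = math.ceil(n_timestamps / max_train_length)  # 43 -> sections of length <= 201
print(floor_sections, ceil_sections)  # 42 43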
  • Training iterations

    Training iterations

    Thanks for putting out this paper; it sounds very promising. Would you mind clarifying the number of iterations you use for self-supervised training? You mention in the paper that you use 600 for datasets larger than 100,000 (with batch size 8). That seems incredibly low; are you sure you don't mean 600 epochs?

    Any clarification would be great, thank you!

    opened by TKassis 0
  • How to use your approach for downstream forecasting tasks

    How to use your approach for downstream forecasting tasks

    Summary

    Thanks for making the code available. I really like the idea of first learning the embeddings in a self-supervised manner and then using a simpler model for forecasting. However, I am struggling with how to use the learned embeddings for the forecasting part.

    Problem Description

    Say you are tasked with forecasting a monthly univariate time series Y = (y1, ..., yT), which is historically available from January 2010 until December 2020. The task is to forecast 2021, with a forecasting horizon of h=12 months. Based on your framework, we use the TCN encoder to learn the embeddings for January 2010 until December 2020. For training the downstream forecasting model, say a ridge regression model, we use the final timestamp of the learned representations. So far so good.

    @yuezhihan & @linytsysu My question is: given the representations and the trained ridge model, how do we forecast 2021, since the data, and hence the representations, are only available until the end of 2020? More specifically, what are the features for the ridge model used to forecast 2021? (See the sketch after this comment.)

    In your Paper, Section C.2 you state that

    For each task, we only use the training set to train the representation model, and apply the model to the testing set to get representations

    Does this mean you show the actual test-data to the model, create the representations/embeddings based on the test-data and then use these to fit the same test-data? Isn't this a simple interpolation of the test-data, using the representations instead of the actuals, rather than forecasting?

    I highly appreciate your comments on this. Many thanks.

    opened by StatMixedML 18
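
A hypothetical sketch of the protocol the question above refers to, reusing model and train_data from the code example earlier on this page (an illustration only; the repository's actual evaluation code lives in the tasks/ folder, and the horizon, padding and alpha values here are assumptions):

import numpy as np
from sklearn.linear_model import Ridge

h = 12  # forecasting horizon from the question

# Causal, per-timestamp representations of the training series (shape: 1 x T x dims)
repr_t = model.encode(train_data, casual=True, sliding_length=1, sliding_padding=200)[0]

# Pair the representation at timestamp t with the next h observations y_{t+1..t+h}
# (assumes a univariate series: feature index 0)
T = train_data.shape[1]
feats = np.stack([repr_t[t] for t in range(T - h)])                      # (T-h) x dims
targets = np.stack([train_data[0, t+1:t+1+h, 0] for t in range(T - h)])  # (T-h) x h

ridge = Ridge(alpha=0.1).fit(feats, targets)

# To forecast beyond the last observation, the features are simply the
# representation at the final observed timestamp
forecast = ridge.predict(repr_t[-1:])  # shape: 1 x h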
  • A part about random cropping in the code that I don't understand

    A part about random cropping in the code that I don't understand

    For the part below, why subtract crop_eleft from crop_right instead of subtracting crop_left?

    out1 = self._net(take_per_row(x, crop_offset + crop_eleft, crop_right - crop_eleft))
    out1 = out1[:, -crop_l:]
    out2 = self._net(take_per_row(x, crop_offset + crop_left, crop_eright - crop_left))
    out2 = out2[:, :crop_l]
    opened by meihuameii 2
Owner
Zhihan Yue