Library for implementing reservoir computing models (echo state networks) for multivariate time series classification and clustering.

Overview

(Figure: overview of the framework)

This library makes it possible to quickly implement different architectures based on Reservoir Computing (the family of approaches popularized in machine learning by Echo State Networks) for the classification or clustering of univariate and multivariate time series.

Several options are available to customize the RC model, by selecting different configurations for each module.

  1. The reservoir module specifies the reservoir configuration (e.g., bidirectional, leaky neurons, circle topology);
  2. The dimensionality reduction module (optionally) reduces the dimensionality of the sequence of states produced by the reservoir;
  3. The representation module defines how to represent the input time series from the sequence of reservoir's states;
  4. The readout module specifies the model to use to perform the final classification.

The representations obtained at step 3 can also be used to perform clustering.

This library also implements the novel reservoir model space as a representation of the time series. Details on the methodology can be found in the original paper (arXiv version here).

Required libraries

  • scikit-learn (tested on version 0.22.1)
  • scipy

The code has been tested on Python 3.7, but earlier versions should work as well.

Quick execution

Run the script classification_example.py or clustering_example.py to quickly test the models on a benchmark dataset of multivariate time series.

For the clustering example, check also the notebook here.

Configure the RC-model

The main class RC_model, contained in modules.py, lets you specify, train, and test an RC model. The RC model is configured by passing a set of parameters to the constructor of the class RC_model. To get an idea, you can check classification_example.py and clustering_example.py, where the parameters are specified through a dictionary (config).

The available configuration hyperparameters are listed below and, for the sake of clarity, are grouped according to the module of the architecture they refer to; a complete configuration sketch follows the list.

1. Reservoir:

  • n_drop - number of transient states to drop
  • bidir - use a bidirectional reservoir (True or False)
  • reservoir - precomputed reservoir (object of class Reservoir in reservoir.py); if None, the following hyperparameters must be specified:
    • n_internal_units - number of processing units in the reservoir
    • spectral_radius - largest eigenvalue of the reservoir matrix of connection weights (to guarantee the Echo State Property, set spectral_radius <= leak <= 1)
    • leak - amount of leakage in the reservoir state update (optional, None or 1.0 --> no leakage)
    • circ - if True, generate a deterministic reservoir with circle topology, where each connection has the same weight
    • connectivity - percentage of nonzero connection weights (ignored if circ = True)
    • input_scaling - scaling of the input connection weights (note that weights are randomly drawn from {-1,1})
    • noise_level - standard deviation of the Gaussian noise injected in the state update

2. Dimensionality reduction:

  • dimred_method - procedure for reducing the number of features in the sequence of reservoir states; possible options are: None (no dimensionality reduction), 'pca' (standard PCA) or 'tenpca' (tensorial PCA for multivariate time series data)
  • n_dim - number of resulting dimensions after the dimensionality reduction procedure

3. Representation:

  • mts_rep - type of multivariate time series representation. It can be 'last' (last state), 'mean' (mean of all states), 'output' (output model space), or 'reservoir' (reservoir model space)
  • w_ridge_embedding - regularization parameter of the ridge regression in the output model space and reservoir model space representation; ignored if mts_rep is None

4. Readout:

  • readout_type - type of readout used for classification. It can be 'lin' (ridge regression), 'mlp' (multilayer perceptron), 'svm' (support vector machine), or None. If None, the input representations will be stored in the .input_repr attribute: this is useful for clustering and visualization. Also, if None, the other Readout hyperparameters can be left unspecified.
  • w_ridge - regularization parameter of the ridge regression readout (only when readout_type is 'lin')
  • mlp_layout - list with the sizes of the MLP layers, e.g. [20,20,10] defines an MLP with 3 layers of 20, 20, and 10 units respectively (only when readout_type is 'mlp')
  • batch_size - size of the mini batches used during training (only when readout_type is 'mlp')
  • num_epochs - number of iterations during the optimization (only when readout_type is 'mlp')
  • w_l2 - weight of the L2 regularization (only when readout_type is 'mlp')
  • learning_rate - learning rate in the gradient descent optimization (only when readout_type is 'mlp')
  • nonlinearity - type of activation function; it can be {'relu', 'tanh', 'logistic', 'identity'} (only when readout_type is 'mlp')
  • svm_gamma - bandwidth of the RBF kernel (only when readout_type is 'svm')
  • svm_C - regularization for the SVM hyperplane (only when readout_type is 'svm')
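
As a concrete reference, the sketch below assembles one illustrative configuration touching all four modules. The values are not tuned defaults, and passing the dictionary with **config assumes the constructor accepts the hyperparameter names listed above as keyword arguments; see classification_example.py for the exact invocation used in the repository.

from modules import RC_model

config = {}

# Reservoir
config['n_internal_units'] = 450    # processing units in the reservoir
config['spectral_radius'] = 0.6     # largest eigenvalue of the connection matrix
config['leak'] = 0.6                # leakage in the state update (None or 1.0 --> no leakage)
config['connectivity'] = 0.25       # percentage of nonzero connection weights
config['input_scaling'] = 0.1       # scaling of the input connection weights
config['noise_level'] = 0.01        # std of the Gaussian noise in the state update
config['circ'] = False              # random topology instead of circle topology
config['n_drop'] = 5                # transient states to drop
config['bidir'] = True              # bidirectional reservoir

# Dimensionality reduction
config['dimred_method'] = 'tenpca'  # None, 'pca', or 'tenpca'
config['n_dim'] = 75                # dimensions kept after the reduction

# Representation
config['mts_rep'] = 'reservoir'     # 'last', 'mean', 'output', or 'reservoir'
config['w_ridge_embedding'] = 10.0  # ridge regularization of the embedding

# Readout
config['readout_type'] = 'lin'      # ridge regression readout
config['w_ridge'] = 5.0             # ridge regularization of the readout

rcm = RC_model(reservoir=None, **config)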

Train and test the RC-model for classification

The training and test functions require as input training and test data, which must be provided as multidimensional NumPy arrays of shape [N,T,V], with:

  • N = number of samples
  • T = number of time steps in each sample
  • V = number of variables in each sample

Training and test labels (Y and Yte) must be provided in one-hot encoding format, i.e., as a matrix of shape [N,C], where C is the number of classes.
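
If your labels are stored as integer class indices, the following NumPy sketch converts them to the required format (the one_hot helper is illustrative and not part of the library):

import numpy as np

def one_hot(y, n_classes=None):
    # Convert integer labels of shape [N] or [N,1] into a one-hot matrix [N,C].
    y = np.asarray(y).ravel().astype(int)
    if n_classes is None:
        n_classes = int(y.max()) + 1
    Y = np.zeros((len(y), n_classes))
    Y[np.arange(len(y)), y] = 1.0
    return Y

Y = one_hot([0, 2, 1, 2])   # 4x3 matrix with a single 1 per row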

Training

RC_model.train(X, Y)

Inputs:

  • X, Y: training data and respective labels

Outputs:

  • tr_time: time (in seconds) used to train the classifier

Test

RC_model.test(Xte, Yte)

Inputs:

  • Xte, Yte: test data and respective labels

Outputs:

  • accuracy, F1 score: metrics achieved on the test data
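
Putting the two calls together, here is a minimal end-to-end sketch. The arrays are random and carry no signal, so the reported metrics are meaningless; they only illustrate the expected shapes and the return values documented above (config is the dictionary from the earlier sketch):

import numpy as np
from modules import RC_model

X, Xte = np.random.randn(100, 50, 3), np.random.randn(30, 50, 3)  # [N,T,V]
Y   = np.eye(2)[np.random.randint(0, 2, 100)]   # one-hot labels [N,C]
Yte = np.eye(2)[np.random.randint(0, 2, 30)]

rcm = RC_model(reservoir=None, **config)   # config as in the sketch above
tr_time = rcm.train(X, Y)                  # time (in seconds) used for training
accuracy, f1 = rcm.test(Xte, Yte)          # metrics on the test data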

Train the RC-model for clustering

As in the case of classification, the data must be provided as multidimensional NumPy arrays of shape [N,T,V].

Training

RC_model.train(X)

Inputs:

  • X: time series data

Outputs:

  • tr_time: time (in seconds) used to generate the representations

Additionally, the representations of the input data X are stored in the attribute RC_model.input_repr.
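
For instance, a clustering sketch under the same assumptions as the examples above: with readout_type set to None the representations are stored rather than classified, and any algorithm that consumes fixed-size vectors can be applied to them (k-means is just one possible choice):

import numpy as np
from sklearn.cluster import KMeans
from modules import RC_model

X = np.random.randn(200, 60, 2)            # [N,T,V] series (random, shapes only)

config['readout_type'] = None              # store representations in .input_repr
rcm = RC_model(reservoir=None, **config)   # config as in the earlier sketch
rcm.train(X)                               # no labels needed
repr_X = rcm.input_repr                    # one fixed-size vector per series
labels = KMeans(n_clusters=3, n_init=10).fit_predict(repr_X)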

Time series datasets

A collection of univariate and multivariate time series datasets is available for download here. The datasets are provided in both MATLAB and Python (NumPy) formats. The original raw data come from the UCI, UEA, and UCR public repositories.

Citation

Please consider citing the original paper if you use this library in your research:

@article{bianchi2020reservoir,
  title={Reservoir computing approaches for representation and classification of multivariate time series},
  author={Bianchi, Filippo Maria and Scardapane, Simone and L{\o}kse, Sigurd and Jenssen, Robert},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2020},
  publisher={IEEE}
}

TensorFlow version

The latest version of the repository no longer depends on TensorFlow, reducing the dependencies of this repository to scipy and scikit-learn only. The MLP readout is now based on the scikit-learn implementation which, however, does not support dropout or the two custom activation functions, Maxout and Kafnets. These functionalities are still available in the "Tensorflow" branch; check it out to use the TensorFlow version of this repository.

License

The code is released under the MIT License. See the attached LICENSE file.

Comments
  • Input weights shape

    I'm having trouble understanding the architecture of the reservoir when looking at it as a neural network. In the code, the input to the reservoir, current_input, has shape (N, V), and _input_weights has shape (internal_units, V). Normally, the shape of the input weights should be (size of reservoir, size of the input). May I kindly ask why the shape of _input_weights is (internal_units, V) and not (internal_units, V*N)?

    opened by NorhenAbdennadher 3
  • Please add Code license

    Hi, thanks for providing the code on GitHub. As I don't know another way to reach you, I am raising this as an issue. My question is: can we use your code for testing, and can we use it in competitions? Adding license info would benefit people like me. Thank you.

    opened by aihill 2
  • ESN predicting constant values in classification

    Hi, I'm using the ESNClassification to classify a binary outcome from data which is input in the format (N,T,V). I've played around with the hyperparameters, but no matter the combination I get either only ones or only zeroes.

    Here is the config set-up:

    import numpy as np

    config = {}
    config['seed'] = 1
    np.random.seed(config['seed'])

    # Hyperparameters of the reservoir
    config['n_internal_units'] = 100   # size of the reservoir
    config['spectral_radius'] = 0.4    # largest eigenvalue of the reservoir
    config['leak'] = 0.5               # amount of leakage in the reservoir state update (None or 1.0 --> no leakage)
    config['connectivity'] = 0.1       # percentage of nonzero connections in the reservoir
    config['input_scaling'] = 0.1      # scaling of the input weights
    config['noise_level'] = 0.01       # noise in the reservoir state update
    config['n_drop'] = 0               # transient states to be dropped
    config['bidir'] = False            # if True, use bidirectional reservoir
    config['circ'] = False             # use reservoir with circle topology

    # Dimensionality reduction hyperparameters
    config['dimred_method'] = 'tenpca' # options: {None (no dimensionality reduction), 'pca', 'tenpca'}
    config['n_dim'] = 40               # number of resulting dimensions after the dimensionality reduction

    # Type of MTS representation
    config['mts_rep'] = 'last'         # MTS representation: {'last', 'mean', 'output', 'reservoir'}
    config['w_ridge_embedding'] = 5.0  # regularization parameter of the ridge regression

    # Type of readout
    config['readout_type'] = 'mlp'     # readout used for classification: {'lin', 'mlp', 'svm'}

    # Linear readout hyperparameters
    config['w_ridge'] = 5.0            # regularization of the ridge regression readout

    # SVM readout hyperparameters
    config['svm_gamma'] = 0.005        # bandwidth of the RBF kernel
    config['svm_C'] = 5.0              # regularization for the SVM hyperplane

    # MLP readout hyperparameters
    config['mlp_layout'] = (40, 1)     # neurons in each MLP layer
    config['num_epochs'] = 20          # number of epochs
    config['w_l2'] = 0.05              # weight of the L2 regularization
    config['nonlinearity'] = 'relu'    # type of activation function {'relu', 'tanh', 'logistic', 'identity'}

    And the shapes of the inputs for training:

    print(X_tr.shape)   # (7801, 200, 130)
    print(X_te.shape)   # (1132, 200, 130)
    print(y_tr.shape)   # (7801, 1)
    print(y_te.shape)   # (1132, 1)

    When I train on X_tr and y_tr, the outputs are either all ones or all zeroes. Any guidance on what could be causing this? Cheers

    opened by Laoban-man 1
  • Application to meteorological forecasting variables

    Hi, I want to experiment with your work on meteorological forecasting. I'm not a professional coder and have worked with Python and Keras for two years (never directly with TensorFlow). I took a look at your code and I'm not sure what to modify, and where, to change the classification objective (MAE instead of cross-entropy).

    I see line 166 in MLP.py; my guess is to change that one, and to remove the one-hot encoder in the data preprocessing.

    Another modification is the possibility to set the activation function to, for example, 'relu' and the output to 'linear'; if my understanding is correct, all output-layer units have the same activation function.

    My goal is to use your ESN to preprocess my data and eventually feed a more complex model.

    My last question: can I use only the reservoir processing (with no bidirectionality) and get the processed values to feed another kind of model? If yes, what is the right output variable in the code?

    Sorry for all these questions, I'm just a bit enthusiastic about your work and I have so many ideas to explore with it.

    PS: I added the 'softsign' activation to your code; it is a very effective one!
    PPS: If you can just guide me, I will try to integrate my ideas into your code and will submit the modifications for your approval if you don't have time.

    opened by jnthnroy 0
  • TensorPCA yields a complex data type array, which causes an error in the Ridge module

    Hi @FilippoMB,

    I noticed that for the dataset I'm using, tensorPCA yields a NumPy array with a complex data type. This in turn causes an error in the ridge module, which says that it does not support complex data. Specifically, the error ValueError: Complex data not supported is generated at https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing/blob/master/code/modules.py#L205

    I don't face this issue when I use PCA with the same dataset.

    I tried to print out the eigenvalue and eigenvector data types at https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing/blob/master/code/tensorPCA.py#L33-L38 and, for the dataset I am using, both vectors have the data type complex128.

    I tried Googling a bit and found some resources such as https://stackoverflow.com/questions/10420648/complex-eigen-values-in-pca-calculation and https://stackoverflow.com/questions/48695430/how-to-make-the-eigenvalues-and-eigenvectors-stay-real-instead-of-complex. From what I understood, due to numerical error the eigenvalues and eigenvectors can have a small imaginary part when linalg.eig is used. I'm not sure whether my understanding is correct.

    Any thoughts on this?

    opened by jagandecapri 0