Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

Jan Sosulski

Last update: Nov 7, 2022

Related tags

toeplitzlda

Overview

ToeplitzLDA

Code for the ToeplitzLDA classifier proposed in here. The classifier conforms sklearn and can be used as a drop-in replacement for other LDA classifiers. For in-depth usage refer to the learning from label proportions (LLP) example or the example script.

Note we used Ubuntu 20.04 with python 3.8.10 to generate our results.

Getting Started / User Setup

If you only want to use this library, you can use the following setup. Note that this setup is based on a fresh Ubuntu 20.04 installation.

Getting fresh ubuntu ready

apt install python3-pip python3-venv

Python package installation

In this setup, we assume you want to run the examples that actually make use of real EEG data or the actual unsupervised speller replay. If you only want to employ ToeplitzLDA in your own spatiotemporal data / without mne and moabb then you can remove the package extra neuro, i.e. pip install toeplitzlda or pip install toeplitzlda[solver]

(Optional) Install fortran Compiler. On ubuntu: apt install gfortran
Create virtual environment: python3 -m venv toeplitzlda_venv
Activate virtual environment: source toeplitzlda_venv/bin/activate
Install toeplitzlda: pip install toeplitzlda[neuro,solver], if you dont have a fortran compiler: pip install toeplitzlda[neuro]

Check if everything works

Either clone this repo or just download the scripts/example_toeplitz_lda_bci_data.py file and run it: python example_toeplitz_lda_bci_data.py. Note that this will automatically download EEG data with a size of around 650MB.

Alternatively, you can use the scripts/example_toeplitz_lda_generated_data.py where artificial data is generated. Note however, that only stationary background noise is modeled and no interfering artifacts as is the case in, e.g., real EEG data. As a result, the overfitting effect of traditional slda on these artifacts is reduced.

Using ToeplitzLDA in place of traditional shrinkage LDA from sklearn

If you have already your own pipeline, you can simply add toeplitzlda as a dependency in your project and then replace sklearns LDA, i.e., instead of:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")  # or eigen solver

use

from toeplitzlda.classification import ToeplitzLDA
clf = ToeplitzLDA(n_channels=your_n_channels)

where your_n_channels is the number of channels of your signal and needs to be provided for this method to work.

If you prefer using sklearn, you can only replace the covariance estimation part, note however, that this in practice (on our data) yields worse performance, as sklearn estimates the class-wise covariance matrices and averages them afterwards, whereas we remove the class-wise means and the estimate one covariance matrix from the pooled data.

So instead of:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")  # or eigen solver

you would use

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from toeplitzlda.classification.covariance import ToepTapLW
toep_cov = ToepTapLW(n_channels=your_n_channels)
clf = LinearDiscriminantAnalysis(solver="lsqr", covariance_estimator=toep_cov)  # or eigen solver

Development Setup

We use a fortran compiler to provide speedups for solving block-Toeplitz linear equation systems. If you are on ubuntu you can install gfortran.

We use poetry for dependency management. If you have it installed you can simply use poetry install to set up the virtual environment with all dependencies. All extra features can be installed with poetry install -E solver,neuro.

If setup does not work for you, please open an issue. We cannot guarantee support for many different platforms, but could provide a singularity image.

Learning from label proportions

Use the run_llp.py script to apply ToeplitzLDA in the LLP scenario and create the results file for the different preprocessing parameters. These can then be visualized using visualize_llp.py to create the plots shown in our publication. Note that running LLP takes a while and the two datasets will be downloaded automatically and are approximately 16GB in size. Alternatively, you can use the results provided by us that are stored in scripts/usup_replay/provided_results by moving/copying them to the location that visualize_llp.py looks for.

ERP benchmark

This is not yet available.

Note this benchmark will take quite a long time if you do not have access to a computing cluster. The public datasets (including the LLP datasets) total a size of approximately 120GB.

BLOCKING TODO: How should we handle the private datasets?

FAQ

Why is my classification performance for my stationary spatiotemporal data really bad?

Check if your data is in channel-prime order, i.e., in the flattened feature vector, you first enumerate over all channels (or some other spatially distributed sensors) for the first time point and then for the second time point and so on. If this is not the case, tell the classifier: e.g. ToeplitzLDA(n_channels=16, data_is_channel_prime=False)

You might also like...

Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting.

Non-AR Spatial-Temporal Transformer Introduction Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series For

66 Nov 28, 2022

This project is a loose implementation of paper "Algorithmic Financial Trading with Deep Convolutional Neural Networks: Time Series to Image Conversion Approach"

Stock Market Buy/Sell/Hold prediction Using convolutional Neural Network This repo is an attempt to implement the research paper titled "Algorithmic F

136 Dec 28, 2022

PyTorch implementation of Soft-DTW: a Differentiable Loss Function for Time-Series in CUDA

Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

Related tags

Overview

ToeplitzLDA

Getting Started / User Setup

Getting fresh ubuntu ready

Python package installation

Check if everything works

Using ToeplitzLDA in place of traditional shrinkage LDA from sklearn

Development Setup

Learning from label proportions

ERP benchmark

FAQ

Why is my classification performance for my stationary spatiotemporal data really bad?

You might also like...

Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting.

This project is a loose implementation of paper "Algorithmic Financial Trading with Deep Convolutional Neural Networks: Time Series to Image Conversion Approach"

PyTorch implementation of Soft-DTW: a Differentiable Loss Function for Time-Series in CUDA

An implementation of the [Hierarchical (Sig-Wasserstein) GAN] algorithm for large dimensional Time Series Generation

Implementation of ETSformer, state of the art time-series Transformer, in Pytorch

A unified framework for machine learning with time series

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

AntroPy: entropy and complexity of (EEG) time-series in Python

Owner

Jan Sosulski

Time-series-deep-learning - Developing Deep learning LSTM, BiLSTM models, and NeuralProphet for multi-step time-series forecasting of stock price.

Official implementation for NIPS'17 paper: PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs.

This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order Pooling.

PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

A multi-entity Transformer for multi-agent spatiotemporal modeling.

PyTorch Code of "Memory In Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity from Spatiotemporal Dynamics"

Physics-informed convolutional-recurrent neural networks for solving spatiotemporal PDEs

The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting".

TAug :: Time Series Data Augmentation using Deep Generative Models

A real world application of a Recurrent Neural Network on a binary classification of time series data