Neural Granger Causality

The Neural-GC repository contains code for a deep learning-based approach to discovering Granger causality networks in multivariate time series. The methods implemented here are described in the paper "Neural Granger Causality" (Tank et al., 2021); see References below.

Installation

To install the code, clone the repository. The only dependencies are Python 3, PyTorch (>= 0.4.0), numpy, and scipy.

Usage

See examples of how to apply our approach in the notebooks cmlp_lagged_var_demo.ipynb, clstm_lorenz_demo.ipynb, and crnn_lorenz_demo.ipynb.
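
For orientation, the sketch below shows the rough workflow the notebooks follow. The constructor and training-function signatures are assumptions pieced together from the hyperparameters quoted in the comments further down (context, lam, lam_ridge, lr, max_iter), so defer to the notebooks for the real API:

```python
import numpy as np
import torch

from crnn import cRNN, train_model_ista  # module path assumed

# Toy data: T observations of p series, shaped (batch, T, p).
T, p = 1000, 5
X = torch.tensor(np.random.randn(1, T, p), dtype=torch.float32)

# One sub-network per output series (constructor signature assumed).
crnn = cRNN(num_series=p, hidden=100)

# ISTA training: lam weights the group sparsity penalty on the input layer,
# lam_ridge the ridge penalty on the remaining layers.
train_model_ista(crnn, X, context=10, lam=10.0, lam_ridge=1e-2,
                 lr=1e-3, max_iter=1000)

# Estimated Granger causality matrix (accessor name assumed).
GC_est = crnn.GC()
print(GC_est)  # GC_est[i, j] != 0 means series j Granger-causes series i
```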

How it works

The models implemented in this repository, called the cMLP, cLSTM and cRNN, are neural networks that model multivariate time series by forecasting each time series separately. During training, sparse penalties on the input layer's weight matrix set groups of parameters to zero, which can be interpreted as discovering Granger non-causality.
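
Concretely, series j is estimated to Granger-cause series i exactly when the input-layer weights connecting series j to the network that forecasts series i are not all zero. A minimal sketch of that readout, with tensor shapes assumed for illustration:

```python
import torch

def estimated_gc(input_weights, tol=0.0):
    """Read a Granger causality matrix off per-series input-layer weights.

    input_weights: list of p tensors, one per output series; entry i is
    assumed to have shape (hidden, lag, p), where slice [:, :, j] holds the
    weights connecting input series j to the network forecasting series i.
    """
    p = len(input_weights)
    GC = torch.zeros(p, p)
    for i, W in enumerate(input_weights):
        # Norm of each input series' weight group; an all-zero group means
        # series j is estimated to be Granger non-causal for series i.
        group_norms = torch.norm(W.reshape(-1, p), dim=0)
        GC[i] = (group_norms > tol).float()
    return GC
```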

The cMLP model can be trained with three different penalties: group lasso, group sparse group lasso, and hierarchical. The cLSTM and cRNN models both use a group lasso penalty, and they differ from one another only in the type of RNN cell they use.
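
For intuition, here is one way the three penalties can be written for a single network's input-layer weight tensor; the grouping follows the paper's description, but the exact shapes and group weighting used in the repository may differ:

```python
import torch

# W: (hidden, lag, p) input-layer weights for one output series (shape assumed).

def group_lasso(W):
    # One group per input series: all lags and hidden units together.
    return sum(torch.norm(W[:, :, j]) for j in range(W.shape[2]))

def group_sparse_group_lasso(W, alpha=0.5):
    # Adds a finer within-group penalty with one sub-group per (lag, series)
    # pair, so individual lags of a selected series can still be zeroed out.
    within = sum(torch.norm(W[:, k, j])
                 for j in range(W.shape[2]) for k in range(W.shape[1]))
    return alpha * group_lasso(W) + (1 - alpha) * within

def hierarchical(W):
    # Nested groups over lags: the group for lag k contains all lags >= k,
    # which drives higher lags to zero before lower ones.
    return sum(torch.norm(W[:, k:, j])
               for j in range(W.shape[2]) for k in range(W.shape[1]))
```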

Training models with non-convex loss functions and non-smooth penalties requires a specialized optimization strategy, and we use a proximal gradient descent approach (ISTA). Our paper finds that ISTA provides comparable performance to two other approaches: proximal gradient descent with a line search (GISTA), which guarantees convergence to a local minimum, and Adam, which converges faster (although it requires an additional thresholding parameter).
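
The key point of ISTA here is that only the smooth part of the objective (MSE plus ridge) is backpropagated; the non-smooth group penalty is applied in closed form through a proximal operator, i.e. group soft-thresholding of the input-layer weights. A minimal sketch of one step under a plain group lasso penalty, with the attribute name and smooth-loss helper assumed:

```python
import torch

def prox_group_lasso(W, threshold):
    # Group soft-thresholding: shrink each input series' weight group toward
    # zero, and zero it out entirely if its norm falls below the threshold.
    # W: (hidden, lag, p), one group per input series j (shape assumed).
    with torch.no_grad():
        for j in range(W.shape[2]):
            norm = torch.norm(W[:, :, j])
            W[:, :, j] *= torch.clamp(1 - threshold / (norm + 1e-12), min=0.0)

def ista_step(model, X, lr, lam, lam_ridge):
    # 1) Gradient step on the smooth loss only; the non-smooth penalty is
    #    handled by the proximal step below, not by backpropagation.
    loss = smooth_loss(model, X, lam_ridge)  # hypothetical MSE + ridge helper
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for param in model.parameters():
            if param.grad is not None:
                param -= lr * param.grad
    # 2) Proximal step on the input-layer weights (attribute name assumed).
    prox_group_lasso(model.input_weights, threshold=lr * lam)
```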

Other information

  • Selecting the right regularization strength can be difficult and time consuming. To get results for many regularization strengths, you may want to run parallel training jobs or use a warm start strategy (see the sketch after this list).
  • Pretraining (training without regularization) followed by ISTA can lead to a different result than training directly with ISTA. Given the non-convex objective, this is unsurprising: the initialization from pretraining is very different from a random initialization. You may need to experiment to find what works best for you.
  • If you want to train a debiased model with the learned sparsity pattern, use the cMLPSparse, cLSTMSparse, and cRNNSparse classes.
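
As referenced above, a warm start sweep fits the model at the strongest regularization first and reuses each solution to initialize the next, weaker level. A sketch that reuses the hypothetical cRNN/train_model_ista interface from the Usage section:

```python
import numpy as np

# Sweep lambda from strongest to weakest; X, p, cRNN and train_model_ista
# are the (assumed) names from the Usage sketch above.
lam_grid = np.logspace(1, -2, num=10)
gc_estimates = {}

model = cRNN(num_series=p, hidden=100)
for lam in lam_grid:
    # The previous solution is the starting point for the next lambda.
    train_model_ista(model, X, context=10, lam=lam, lam_ridge=1e-2,
                     lr=1e-3, max_iter=1000)
    gc_estimates[lam] = model.GC().clone()
```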

Authors

  • Alex Tank
  • Ian Covert
  • Nicholas Foti
  • Ali Shojaie
  • Emily Fox

References

  • Alex Tank, Ian Covert, Nicholas Foti, Ali Shojaie, Emily Fox. "Neural Granger Causality." IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

Comments
  • Working with 2-D series

    Hi Ian, can I modify the RNN to take in 2-D series instead of 1-D? Is that feasible, or is there some other solution you think is possible, like concatenating the values somehow? My current solution is just taking the mean across the 2nd dimension so that it becomes 1-D. Appreciate any help!

    opened by SwapnilDreams100 10
  • NaN loss and all 1 GC

    Hi, I am using this for non-linear GC on some brain time series, which look like this: [image]. This is what training looks like with a cRNN and the params context=10, lam=10.0, lam_ridge=1e-2, lr=1e-3, max_iter=20:

    ----------Iter = 50---------- Loss = 151.557373 Variable usage = 99.95%
    ----------Iter = 100---------- Loss = nan Variable usage = 57.95%
    ----------Iter = 150---------- Loss = nan Variable usage = 50.97%
    ----------Iter = 200---------- Loss = nan Variable usage = 42.81%
    ----------Iter = 250---------- Loss = nan Variable usage = 36.39%
    ----------Iter = 300---------- Loss = nan Variable usage = 38.50%
    Stopping early

    The estimated GC is also all 1. Any intuition would be helpful!

    opened by SwapnilDreams100 7
  • GC-est has all zero values

    I am trying to use Neural-GC for an 8-variable time series with a length of 5840 each. I have tried both cMLP and cLSTM on the dataset. The training goes fine, but the GC-est values are all zero. Can you help me with this? Following are the logs at the start and end.

    Start:

    ----------Iter = 100---------- Loss = 0.713952 Variable usage = 100.00%
    ----------Iter = 200---------- Loss = 0.697342 Variable usage = 100.00%
    ----------Iter = 300---------- Loss = 0.684179 Variable usage = 100.00%

    End:

    Loss = 0.025705 Variable usage = 100.00%
    ----------Iter = 49800---------- Loss = 0.025695 Variable usage = 100.00%
    ----------Iter = 49900---------- Loss = 0.025685 Variable usage = 100.00%
    ----------Iter = 50000---------- Loss = 0.025675 Variable usage = 100.00%

    opened by manmeet3591 4
  • Optimizing lambda

    Hi,

    What is a good way to optimize lambda in cases where the true answer is unknown? I tried using the MSE on a validation set as a guide, but upon testing it leads to many false detections. Is there a better way to optimize lambda?

    Thank you, Suryadi

    opened by suryadi-t 2
  • Why not get GC and prediction in one model?

    Hi, thanks for your awesome work! I have a question about the cMLP code: there are two models, cMLP and cMLPSparse. The former aims to get GC, and the second provides the prediction. Why not use only one model to do both? Is there any difference between these two models? I would appreciate it if you could answer my question! :)

    opened by WEIFZH 2
  • Calculating gradient only on smooth penalty in train_model_ista for cLSTM

    I am wondering why you calculate the gradient only on the smooth loss in the train_model_ista function, even when we are making the proximal gradient step.

    My doubt is: shouldn't we call joint_loss.backward(), where joint_loss = smooth + nonsmooth? And this might be a little trivial, but can you explain the reasoning for taking mean_loss by dividing the 'smooth + nonsmooth' term by the number of features?

    opened by tsnnay 1
  • Why apply ridge regularization on the output layer?

    Hi, it's a great implementation, but I noticed that the loss function in the code is MSE + ridge + nonsmooth, while in the paper the loss function seems to be MSE + nonsmooth. What is the ridge for? I saw the comments "Apply ridge penalty at linear layer" in clstm.py and "Apply ridge penalty at all subsequent layers" in cmlp.py, but I am still not sure why the ridge penalty is needed.

    Thanks, Yunjin

    opened by WuYunjin 1
  • Regularization Tuning

    Hey! Do you have any advice on finding the optimal regularization parameter for the cMLP? For my dataset, I had to increase the lambda and lambda_ridge parameters a lot to get Granger causal coefficients near zero, and I was a bit concerned. Should I evaluate the model on the lowest loss achievable during training, or set up a test set?

    opened by mattdias96 1