Neural Granger Causality

The Neural-GC repository contains code for a deep learning-based approach to discovering Granger causality networks in multivariate time series. The methods implemented here are described in the paper "Neural Granger Causality" (Tank et al., 2021); see References below.

Installation

To install the code, clone the repository. The only dependencies are Python 3, PyTorch (>= 0.4.0), numpy, and scipy.

Usage

See examples of how to apply our approach in the notebooks cmlp_lagged_var_demo.ipynb, clstm_lorenz_demo.ipynb, and crnn_lorenz_demo.ipynb.
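
For orientation, the sketch below shows the rough workflow the notebooks follow. The constructor and training-function signatures are assumptions pieced together from the hyperparameters quoted in the comments further down (context, lam, lam_ridge, lr, max_iter), so defer to the notebooks for the real API:

```python
import numpy as np
import torch

from crnn import cRNN, train_model_ista  # module path assumed

# Toy data: T observations of p series, shaped (batch, T, p).
T, p = 1000, 5
X = torch.tensor(np.random.randn(1, T, p), dtype=torch.float32)

# One sub-network per output series (constructor signature assumed).
crnn = cRNN(num_series=p, hidden=100)

# ISTA training: lam weights the group sparsity penalty on the input layer,
# lam_ridge the ridge penalty on the remaining layers.
train_model_ista(crnn, X, context=10, lam=10.0, lam_ridge=1e-2,
                 lr=1e-3, max_iter=1000)

# Estimated Granger causality matrix (accessor name assumed).
GC_est = crnn.GC()
print(GC_est)  # GC_est[i, j] != 0 means series j Granger-causes series i
```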

How it works

The models implemented in this repository, called the cMLP, cLSTM and cRNN, are neural networks that model multivariate time series by forecasting each time series separately. During training, sparse penalties on the input layer's weight matrix set groups of parameters to zero, which can be interpreted as discovering Granger non-causality.
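
Concretely, series j is estimated to Granger-cause series i exactly when the input-layer weights connecting series j to the network that forecasts series i are not all zero. A minimal sketch of that readout, with tensor shapes assumed for illustration:

```python
import torch

def estimated_gc(input_weights, tol=0.0):
    """Read a Granger causality matrix off per-series input-layer weights.

    input_weights: list of p tensors, one per output series; entry i is
    assumed to have shape (hidden, lag, p), where slice [:, :, j] holds the
    weights connecting input series j to the network forecasting series i.
    """
    p = len(input_weights)
    GC = torch.zeros(p, p)
    for i, W in enumerate(input_weights):
        # Norm of each input series' weight group; an all-zero group means
        # series j is estimated to be Granger non-causal for series i.
        group_norms = torch.norm(W.reshape(-1, p), dim=0)
        GC[i] = (group_norms > tol).float()
    return GC
```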

The cMLP model can be trained with three different penalties: group lasso, group sparse group lasso, and hierarchical. The cLSTM and cRNN models both use a group lasso penalty, and they differ from one another only in the type of RNN cell they use.
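
For intuition, here is one way the three penalties can be written for a single network's input-layer weight tensor; the grouping follows the paper's description, but the exact shapes and group weighting used in the repository may differ:

```python
import torch

# W: (hidden, lag, p) input-layer weights for one output series (shape assumed).

def group_lasso(W):
    # One group per input series: all lags and hidden units together.
    return sum(torch.norm(W[:, :, j]) for j in range(W.shape[2]))

def group_sparse_group_lasso(W, alpha=0.5):
    # Adds a finer within-group penalty with one sub-group per (lag, series)
    # pair, so individual lags of a selected series can still be zeroed out.
    within = sum(torch.norm(W[:, k, j])
                 for j in range(W.shape[2]) for k in range(W.shape[1]))
    return alpha * group_lasso(W) + (1 - alpha) * within

def hierarchical(W):
    # Nested groups over lags: the group for lag k contains all lags >= k,
    # which drives higher lags to zero before lower ones.
    return sum(torch.norm(W[:, k:, j])
               for j in range(W.shape[2]) for k in range(W.shape[1]))
```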

Training models with non-convex loss functions and non-smooth penalties requires a specialized optimization strategy, and we use a proximal gradient descent approach (ISTA). Our paper finds that ISTA provides comparable performance to two other approaches: proximal gradient descent with a line search (GISTA), which guarantees convergence to a local minimum, and Adam, which converges faster (although it requires an additional thresholding parameter).
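
The key point of ISTA here is that only the smooth part of the objective (MSE plus ridge) is backpropagated; the non-smooth group penalty is applied in closed form through a proximal operator, i.e. group soft-thresholding of the input-layer weights. A minimal sketch of one step under a plain group lasso penalty, with the attribute name and smooth-loss helper assumed:

```python
import torch

def prox_group_lasso(W, threshold):
    # Group soft-thresholding: shrink each input series' weight group toward
    # zero, and zero it out entirely if its norm falls below the threshold.
    # W: (hidden, lag, p), one group per input series j (shape assumed).
    with torch.no_grad():
        for j in range(W.shape[2]):
            norm = torch.norm(W[:, :, j])
            W[:, :, j] *= torch.clamp(1 - threshold / (norm + 1e-12), min=0.0)

def ista_step(model, X, lr, lam, lam_ridge):
    # 1) Gradient step on the smooth loss only; the non-smooth penalty is
    #    handled by the proximal step below, not by backpropagation.
    loss = smooth_loss(model, X, lam_ridge)  # hypothetical MSE + ridge helper
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for param in model.parameters():
            if param.grad is not None:
                param -= lr * param.grad
    # 2) Proximal step on the input-layer weights (attribute name assumed).
    prox_group_lasso(model.input_weights, threshold=lr * lam)
```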

Other information

  • Selecting the right regularization strength can be difficult and time consuming. To get results for many regularization strengths, you may want to run parallel training jobs or use a warm start strategy (see the sketch after this list).
  • Pretraining (training without regularization) followed by ISTA can lead to a different result than training directly with ISTA. Given the non-convex objective, this is unsurprising: the initialization from pretraining is very different from a random initialization. You may need to experiment to find what works best for you.
  • If you want to train a debiased model with the learned sparsity pattern, use the cMLPSparse, cLSTMSparse, and cRNNSparse classes.
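
As referenced above, a warm start sweep fits the model at the strongest regularization first and reuses each solution to initialize the next, weaker level. A sketch that reuses the hypothetical cRNN/train_model_ista interface from the Usage section:

```python
import numpy as np

# Sweep lambda from strongest to weakest; X, p, cRNN and train_model_ista
# are the (assumed) names from the Usage sketch above.
lam_grid = np.logspace(1, -2, num=10)
gc_estimates = {}

model = cRNN(num_series=p, hidden=100)
for lam in lam_grid:
    # The previous solution is the starting point for the next lambda.
    train_model_ista(model, X, context=10, lam=lam, lam_ridge=1e-2,
                     lr=1e-3, max_iter=1000)
    gc_estimates[lam] = model.GC().clone()
```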

Authors

  • Alex Tank
  • Ian Covert
  • Nicholas Foti
  • Ali Shojaie
  • Emily Fox

References

  • Alex Tank, Ian Covert, Nicholas Foti, Ali Shojaie, Emily Fox. "Neural Granger Causality." IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

Comments
  • Working with 2-D series

    Hi Ian, can I modify the RNN to take in 2-D series instead of 1-D? Is that feasible, or is there some other solution you think is possible, like concatenating the values somehow? My current solution is just taking the mean across the 2nd dimension so that it becomes 1-D. Appreciate any help!

    opened by SwapnilDreams100 10
  • NaN loss and all 1 GC

    Hi, I am using this for non-linear GC on some brain time series, which look like this: [image]. This is what training looks like with a cRNN and the params context=10, lam=10.0, lam_ridge=1e-2, lr=1e-3, max_iter=20:

    ----------Iter = 50---------- Loss = 151.557373 Variable usage = 99.95%
    ----------Iter = 100---------- Loss = nan Variable usage = 57.95%
    ----------Iter = 150---------- Loss = nan Variable usage = 50.97%
    ----------Iter = 200---------- Loss = nan Variable usage = 42.81%
    ----------Iter = 250---------- Loss = nan Variable usage = 36.39%
    ----------Iter = 300---------- Loss = nan Variable usage = 38.50%
    Stopping early

    The estimated GC is also all 1. Any intuition would be helpful!

    opened by SwapnilDreams100 7
  • GC-est has all zero values

    I am trying to use Neural-GC for an 8-variable time series with a length of 5840 each. I have tried both cMLP and cLSTM on the dataset. The training goes fine, but the GC-est values are all zero. Can you help me with this? Following are the logs at the start and end.

    Start:

    ----------Iter = 100---------- Loss = 0.713952 Variable usage = 100.00%
    ----------Iter = 200---------- Loss = 0.697342 Variable usage = 100.00%
    ----------Iter = 300---------- Loss = 0.684179 Variable usage = 100.00%

    End:

    Loss = 0.025705 Variable usage = 100.00%
    ----------Iter = 49800---------- Loss = 0.025695 Variable usage = 100.00%
    ----------Iter = 49900---------- Loss = 0.025685 Variable usage = 100.00%
    ----------Iter = 50000---------- Loss = 0.025675 Variable usage = 100.00%

    opened by manmeet3591 4
  • Optimizing lambda

    Hi,

    What is a good way to optimize lambda in cases where the true answer is unknown? I tried using the MSE on a validation set as a guide, but upon testing it leads to many false detections. Is there a better way to optimize lambda?

    Thank you, Suryadi

    opened by suryadi-t 2
  • Why not get GC and prediction in one model?

    Hi, thanks for your awesome work! I have a question about the cMLP code: there are two models, cMLP and cMLPSparse. The former aims to get GC, and the second provides the prediction. Why not use only one model to do both? Is there any difference between these two models? I would appreciate it if you could answer my question! :)

    opened by WEIFZH 2
  • Calculating gradient only on smooth penalty in train_model_ista for cLSTM

    I am wondering why you calculate the gradient only on the smooth loss in the train_model_ista function, even when we are making the proximal gradient step.

    My doubt is: shouldn't we call joint_loss.backward(), where joint_loss = smooth + nonsmooth? And this might be a little trivial, but can you explain the reasoning for taking mean_loss by dividing the 'smooth + nonsmooth' term by the number of features?

    opened by tsnnay 1
  • Why apply ridge regularization on the output layer?

    Hi, it's a great implementation, but I noticed that the loss function in the code is MSE + ridge + nonsmooth, while in the paper the loss function seems to be MSE + nonsmooth. What is the ridge for? I saw the comments "Apply ridge penalty at linear layer" in clstm.py and "Apply ridge penalty at all subsequent layers" in cmlp.py, but I am still not sure why the ridge penalty is needed.

    Thanks, Yunjin

    opened by WuYunjin 1
  • Regularization Tuning

    Hey! Do you have any advice on finding the optimal regularization parameter for the cMLP? For my dataset, I had to increase the lambda and lambda_ridge parameters a lot to get Granger causal coefficients near zero, and I was a bit concerned. Should I evaluate the model on the lowest loss achievable during training, or set up a test set?

    opened by mattdias96 1