Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"

Overview

NeurIPS 2020 SEVIR

Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology

Requirements

Testing pretrained models and training on a single GPU requires:

tensorflow 2.1.0 or higher
pandas
matplotlib
pytorch 1.4.0 or higher to calculate the LPIPS metric. See Perceptual Similarity Metric and Dataset

Distributed (multi-GPU) training of these models requires

Horovod 0.19.0 or higher for distributed training. See Horovod

To visualize results with state lines, as is done in the paper, a geospatial plotting library is required. We recommend either of the following:

basemap
cartopy

To run the rainymotion benchmark, you'll also need to install this module. See https://rainymotion.readthedocs.io/en/latest/
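The minimum versions listed above can be checked before running anything. The following is a minimal sketch, assuming the packages were installed under the distribution names "tensorflow", "torch", and "horovod" (these names are assumptions, not something this repo specifies); the helper functions are illustrative only.

```python
import re
from importlib import metadata

# Minimum versions taken from the requirements list above.
# The distribution names are assumptions about how the packages were installed.
MINIMUM_VERSIONS = {
    "tensorflow": (2, 1, 0),
    "torch": (1, 4, 0),
    "horovod": (0, 19, 0),
}


def parse_version(text):
    """Turn a string such as '2.1.0' or '1.4.0+cu92' into a 3-tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '2.10' to (2, 10, 0)
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return a {package: status} report for each minimum version."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = parse_version(installed) >= minimum
        report[name] = f"{installed} ({'ok' if ok else 'too old'})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Horovod is only needed for multi-GPU training, so a "not installed" status for it is not a problem on a single-GPU setup.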

Downloading pretrained models

To download the models trained in the paper, run the following

cd models/
python download_models.py

See the notebooks directory for how to apply these models to some sample test data.

Downloading SEVIR

Download information and additional resources for SEVIR data are available at https://registry.opendata.aws/sevir/.

To download, install the AWS CLI. To download all of SEVIR (~1 TB) to your current directory, run

aws s3 sync --no-sign-request s3://sevir .
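Because the full dataset is roughly 1 TB, it may be worth confirming that the target filesystem has enough free space before starting the sync. This is a small sketch using only the Python standard library; the 10**12 byte figure is an approximation of the "~1TB" quoted above, and the function name is hypothetical.

```python
import shutil

# Approximate size of the full dataset, based on the "~1TB" figure above.
SEVIR_SIZE_BYTES = 10**12


def has_room(path=".", required_bytes=SEVIR_SIZE_BYTES):
    """Return True if the filesystem holding `path` has enough free space."""
    free = shutil.disk_usage(path).free
    return free >= required_bytes


if __name__ == "__main__":
    if has_room("."):
        print("Enough free space for the full SEVIR download.")
    else:
        print("Not enough free space here; pick another directory.")
```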

Extracting training/testing datasets

The models in the paper were trained on data collected prior to June 1, 2019, and tested on data collected after June 1, 2019. These datasets can be extracted from SEVIR by running the following scripts (one for nowcasting, and one for synrad). Depending on your CPU and the speed of your filesystem, these scripts may take several hours to run.

cd src/data

# Generates nowcast training & testing datasets
python make_nowcast_dataset.py --sevir_data ../../data/sevir --sevir_catalog ../../data/CATALOG.csv --output_location ../../data/interim/

# Generate synrad training & testing datasets
python make_synrad_dataset.py --sevir_data ../../data/sevir --sevir_catalog ../../data/CATALOG.csv --output_location ../../data/interim/
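The scripts above do the actual extraction. The date rule they apply can be illustrated with a short sketch. The field name "time_utc", the timestamp format, the event ids, and the choice to place events dated exactly June 1, 2019 in the test split are all assumptions made for illustration, not details confirmed by this repo.

```python
from datetime import datetime

# Split date described above: training data before, testing data after.
CUTOFF = datetime(2019, 6, 1)


def split_events(events, cutoff=CUTOFF, time_key="time_utc"):
    """Split event records into (train, test) lists by timestamp.

    `time_key` and the '%Y-%m-%d %H:%M:%S' format are hypothetical.
    Events at or after the cutoff go to the test list (an assumption).
    """
    train, test = [], []
    for event in events:
        when = datetime.strptime(event[time_key], "%Y-%m-%d %H:%M:%S")
        (train if when < cutoff else test).append(event)
    return train, test


if __name__ == "__main__":
    # Made-up records, for illustration only.
    sample = [
        {"id": "event_a", "time_utc": "2018-07-04 12:00:00"},
        {"id": "event_b", "time_utc": "2019-05-31 23:55:00"},
        {"id": "event_c", "time_utc": "2019-08-15 06:30:00"},
    ]
    train, test = split_events(sample)
    print(len(train), "training events,", len(test), "testing events")
```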

Testing pretrained models

Pretrained models used in the paper are located under models/. To compute test metrics, run the test_*.py scripts, pointing each one to a pretrained model and its test dataset. We recommend starting with num_test set to a small number and increasing it thereafter (if num_test is not specified, all test data is used). For example:

# Test a trained synrad model
python test_synrad.py  --num_test 1000 --model models/synrad_mse.h5   --test_data data/interim/synrad_testing.h5  -output test_output.csv

Also check out the examples in notebooks/ for how to run pretrained models and visualize results.

Model training

This section describes how to train the nowcast and synthetic weather radar (synrad) models yourself. Models discussed in the paper were trained using distributed training over 8 NVIDIA Volta V100 GPUs with 32GB of memory. However, the code in this repo is set up to train on a single GPU.

The training datasets are large, and running on the full dataset requires a significant amount of RAM. We suggest first testing the model with --num_train set to a low number, then increasing it to the limits of your system. Training with all the data may require writing your own generator that batches the data so that it fits in memory.
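One way such a generator could look is sketched below. It works on any pair of sliceable, equal-length sequences; h5py datasets support this kind of slicing and read only the requested rows from disk. The function name and the dataset keys in the usage note are hypothetical, and this is not the loader used by the training scripts.

```python
def batch_generator(inputs, targets, batch_size, num_samples=None):
    """Yield (inputs, targets) batches one slice at a time.

    Only the current slice is materialized, so the full arrays never need
    to sit in memory when `inputs` and `targets` are backed by a file.
    """
    total = len(inputs) if num_samples is None else min(num_samples, len(inputs))
    for start in range(0, total, batch_size):
        stop = min(start + batch_size, total)  # last batch may be smaller
        yield inputs[start:stop], targets[start:stop]


if __name__ == "__main__":
    # Small in-memory stand-ins for the on-disk arrays.
    xs = list(range(10))
    ys = list(range(10, 20))
    for batch_x, batch_y in batch_generator(xs, ys, batch_size=4):
        print(batch_x, batch_y)
```

With h5py installed, the same function could be pointed at an open file, for example `batch_generator(f["IN"], f["OUT"], 32)`, where "IN" and "OUT" are placeholder key names. The generator makes a single pass over the data, so it would need to be wrapped in a loop for multi-epoch training.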

Training `nowcast`

To train the nowcast model, make sure the nowcast_training.h5 file has been created using the previous steps. Below we set num_train to only 1024, but this should be increased for better results. Results described in the paper were generated with num_train = 44,760. When training with the mse loss, the largest batch size possible is 32; for all other cases, a maximum batch size of 4 must be used. Larger batch sizes will result in out-of-memory errors on the GPU. Four loss functions are configured:

MSE Loss:

python train_nowcast.py --num_train 1024 --nepochs 25 --batch_size 32 --loss_fn mse --logdir logs/mse_`date +%y%m%d%H%M%S`

Style and Content Loss:

python train_nowcast.py --num_train 1024 --nepochs 25 --batch_size 4 --loss_fn vgg --logdir logs/vgg_`date +%y%m%d%H%M%S`

MSE + Style and Content Loss:

python train_nowcast.py --num_train 1024 --nepochs 25 --batch_size 4 --loss_fn mse+vgg --logdir logs/mse_vgg_`date +%y%m%d%H%M%S`

Conditional GAN Loss:

python train_nowcast.py --num_train 1024 --nepochs 25 --batch_size 32 --loss_fn cgan --logdir logs/cgan_`date +%y%m%d%H%M%S`

Each of these will write several files into the date-stamped directory in logs/, including tracking of metrics, and a model saved after each epoch. Run python train_nowcast.py -h for additional input parameters that can be specified.
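The batch-size limits above directly affect how many optimizer steps each epoch takes. The arithmetic is sketched here, assuming the final partial batch counts as a step rather than being dropped (how train_nowcast.py handles it is not stated in this README).

```python
import math


def steps_per_epoch(num_train, batch_size):
    """Number of batches needed to see every training sample once."""
    return math.ceil(num_train / batch_size)


if __name__ == "__main__":
    # num_train values and batch sizes quoted in this section.
    for num_train in (1024, 44760):
        for batch_size in (32, 4):
            steps = steps_per_epoch(num_train, batch_size)
            print(f"num_train={num_train}, batch_size={batch_size}: {steps} steps")
```

Under this assumption, an epoch over the paper's 44,760 samples takes 1,399 steps at batch size 32 and 11,190 steps at batch size 4.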

Training `synrad`

To train synrad, make sure the synrad_training.h5 file has been created using the steps above. Below we set num_train to only 10,000, but this should be increased for better results. Three loss functions are configured:

MSE Loss:

python train_synrad.py --num_train 10000 --nepochs 100 --loss_fn mse --loss_weights 1.0 --logdir logs/mse_`date +%y%m%d%H%M%S`

MSE+Content Loss:

python train_synrad.py --num_train 10000 --nepochs 100 --loss_fn mse+vgg --loss_weights 1.0 1.0 --logdir logs/mse_vgg_`date +%y%m%d%H%M%S`

cGAN + MAE Loss:

python train_synrad.py --num_train 10000 --nepochs 100 --loss_fn gan+mae --loss_weights 1.0 --logdir logs/gan_mae_`date +%y%m%d%H%M%S`

Each of these will write several files into the date-stamped directory in logs/, including tracking of metrics, and a model saved after each epoch.
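For a combined loss such as mse+vgg, a plausible reading of --loss_weights is one coefficient per loss term in a weighted sum. The sketch below illustrates that reading only; it is an assumption about how the weights are used, not a description of train_synrad.py, and the function name and length check are choices made for this sketch.

```python
def combine_losses(loss_values, loss_weights):
    """Weighted sum of individual loss terms: sum(w_i * L_i)."""
    if len(loss_values) != len(loss_weights):
        # Sketch-only sanity check: one weight per loss term.
        raise ValueError("expected one weight per loss term")
    return sum(w * v for w, v in zip(loss_weights, loss_values))


if __name__ == "__main__":
    # Made-up per-batch loss values for two terms (for example MSE and content).
    losses = [0.5, 2.0]
    print(combine_losses(losses, [1.0, 1.0]))  # equal weighting
    print(combine_losses(losses, [1.0, 0.5]))  # second term down-weighted
```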

Analyzing results

The notebooks under notebooks/ contain code for analyzing the results of training, and for visualizing the results on sample test cases.


Owner

USAF - MIT Artificial Intelligence Accelerator

Training code and evaluation benchmarks for the "Self-Supervised Policy Adaptation during Deployment" paper.

"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

Sequence modeling benchmarks and temporal convolutional networks

NeurIPS 2021 Datasets and Benchmarks Track

Benchmarks for semi-supervised domain generalization.

Benchmarks for the Optimal Power Flow Problem

Benchmark spaces - Benchmarks of how well different two dimensional spaces work for clustering algorithms

This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.

Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

Training `nowcast`

Training `synrad`