Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement (NeurIPS 2020)

Overview

MTTS-CAN: Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement

License: MIT | Made with Python

Paper

Xin Liu, Josh Fromm, Shwetak Patel, Daniel McDuff, “Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement”, NeurIPS 2020, Oral Presentation (105 out of 9454 submissions)

Link: https://papers.nips.cc/paper/2020/file/e1228be46de6a0234ac22ded31417bc7-Paper.pdf

Abstract

Telehealth and remote health monitoring have become increasingly important during the SARS-CoV-2 pandemic and it is widely expected that this will have a lasting impact on healthcare practices. These tools can help reduce the risk of exposing patients and medical staff to infection, make healthcare services more accessible, and allow providers to see more patients. However, objective measurement of vital signs is challenging without direct contact with a patient. We present a video-based and on-device optical cardiopulmonary vital sign measurement approach. It leverages a novel multi-task temporal shift convolutional attention network (MTTS-CAN) and enables real-time cardiovascular and respiratory measurements on mobile platforms. We evaluate our system on an ARM CPU and achieve state-of-the-art accuracy while running at over 150 frames per second which enables real-time applications. Systematic experimentation on large benchmark datasets reveals that our approach leads to substantial (20%-50%) reductions in error and generalizes well across datasets.
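For intuition, below is a minimal sketch of the temporal shift operation at the core of TSM-style models such as MTTS-CAN: a fraction of the channels is shifted forward and backward along the time axis so that plain 2D convolutions can mix information across neighboring frames. This is an illustrative, assumption-based sketch, not the authors' exact implementation.

import tensorflow as tf

def temporal_shift(x, fold_div=3):
    # x: (N, T, H, W, C). Shift 1/fold_div of the channels backward in time,
    # another 1/fold_div forward, and leave the remaining channels in place.
    c = x.shape[-1]
    fold = c // fold_div
    back = tf.pad(x[:, 1:, :, :, :fold], [[0, 0], [0, 1], [0, 0], [0, 0], [0, 0]])
    fwd = tf.pad(x[:, :-1, :, :, fold:2 * fold], [[0, 0], [1, 0], [0, 0], [0, 0], [0, 0]])
    keep = x[:, :, :, :, 2 * fold:]
    return tf.concat([back, fwd, keep], axis=-1)

y = temporal_shift(tf.random.normal([1, 10, 36, 36, 32]))  # output shape matches input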

Waveform Samples

Pulse

[pulse waveform sample]

Respiration

[respiration waveform sample]

Citation

@article{liu2020multi,
  title={Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement},
  author={Liu, Xin and Fromm, Josh and Patel, Shwetak and McDuff, Daniel},
  journal={arXiv preprint arXiv:2006.03790},
  year={2020}
}

Demo

Try out our live demo via the link here.

Our demo code: https://github.com/ubicomplab/rppg-web

TVM

If you want to use TVM, please follow this tutorial to set it up. Then, replace the code in incubator-tvm/python/tvm/relay/frontend/keras.py with our code/tvm-ops-mtts-can.py, which implements the tensor operations required for the attention and temporal shift modules used in our models.
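For reference, a minimal sketch of compiling the Keras model with TVM's Relay frontend once the patched keras.py is in place. The checkpoint path, input name, and target triple are assumptions; adjust them to your setup.

import tvm
from tvm import relay
from tensorflow import keras

# Hypothetical checkpoint path; loading may require custom_objects for custom layers.
model = keras.models.load_model("mtts_can.hdf5")
shape_dict = {"input_1": (1, 3, 36, 36)}  # Relay's Keras frontend expects NCHW by default
mod, params = relay.frontend.from_keras(model, shape_dict)

target = "llvm -mtriple=aarch64-linux-gnu"  # example ARM CPU target
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)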

Training

python code/train.py --exp_name [e.g., test] --data_dir [DATASET_PATH] --temporal [e.g., MMTS_CAN]

Inference

python code/predict_vitals.py --video_path [VIDEO_PATH]

The default video sampling rate is 30 Hz.
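If you are unsure of your video's frame rate, a quick check with OpenCV (the file name below is a placeholder):

import cv2

cap = cv2.VideoCapture("subject1_video.avi")  # hypothetical path
print("fps:", cap.get(cv2.CAP_PROP_FPS))  # resample if this differs substantially from 30
cap.release()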

Note

During inference, the program will generate a sample pre-processed frame. Please ensure it is in portrait orientation; if it is not, comment out line 30 (the rotation) in inference_preprocess.py.
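For illustration, the rotation step might look like the following (a sketch; the actual code on line 30 of inference_preprocess.py may differ):

import cv2

def to_portrait(frame):
    # Comment this call out if your sample frame is already in portrait orientation.
    return cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)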

Requirements

TensorFlow 2.0+

conda create -n tf-gpu tensorflow-gpu cudatoolkit=10.1 (this command takes care of both the CUDA and TensorFlow environments)

pip install opencv-python scipy numpy matplotlib

If pip install opencv-python does not work, the following commands have always worked on my Mac:

conda install -c menpo opencv -y
pip install opencv-python
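Once installed, a quick sanity check of the environment (works on TF 2.1+; also relevant to the tensorflow-version issue reported below):

import tensorflow as tf

print("TF version:", tf.__version__)
print("GPUs:", tf.config.list_physical_devices("GPU"))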

Contact

Please post your technical questions regarding this repo via GitHub Issues.

Comments
  • Request for infos regarding environment setup (strange results with tensorflow 2.8.0)


    Hi Xin. It would be very helpful if you could provide some information about the environment in which you ran your experiments (some kind of requirements.txt). It would be good to know the exact versions of the packages you used, especially tensorflow.

    I discovered that when using tensorflow==2.8.0 (the latest version at the time of writing) the network outputs nonsense. I checked this backwards, and the problem first occurs with tensorflow==2.6.x. I have no clue what causes this issue, but I think it is important to know! With tensorflow==2.4.x everything seems to work fine.

    Steps to reproduce:

    • Run inference on the same video (I used UBFC):
      • once with tensorflow==2.8.0
      • once with tensorflow==2.4.1

    You can even check this by feeding a batch of zeros into the network: you will get significantly different results depending on the tensorflow version.

    I tested this on two different machines with the same observation. Maybe someone can check this; I am still not sure whether I am doing anything wrong, as this is very strange behavior.
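    A minimal sketch of the zero-input check described above. The checkpoint path is hypothetical, the input layout is assumed from the repo's 36x36 frames and frame depth of 10, and loading may require custom_objects for the model's custom layers:

    import numpy as np
    from tensorflow import keras

    model = keras.models.load_model("mtts_can.hdf5")  # hypothetical checkpoint
    zeros = np.zeros((10, 36, 36, 3), dtype=np.float32)
    outputs = model.predict([zeros, zeros])  # motion branch, appearance branch
    print([float(np.mean(o)) for o in outputs])  # compare across TF versions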

    opened by BeCuriousS 3
  • Data preprocessing and interface


    Hello, I want to ask what requirements there are for the data preprocessing part of the program, and how to write the interface to the main model framework. I would appreciate it if you could answer.

    opened by ruiiiiiiiiii 0
  • Could you provide more details how you pre-process the dataset?


    Dear Sir/Madam,

    Thanks for your code. I tried to reproduce your code on the VIPL-HR1 dataset, but I am not sure how the dataset is pre-processed. Could you provide more details, such as how the video is preprocessed, what the format of the input data path is, and how the ground truth labels are preprocessed and saved? I checked your pre_process.py file, and it seems that the dataset is saved into an h5py file with a .mat extension. Could you at least provide an example of the h5py file so that I know how to preprocess my dataset into the correct format? I also studied your predict_vitals.py file, but it seems that it only makes a prediction for a single video. Could you also provide code for the evaluation metrics and for how the ground truth labels are pre-processed?
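    For reference, one quick way to inspect such an HDF5-backed .mat file with h5py (the file name is a placeholder and the dataset names depend on pre_process.py):

    import h5py

    with h5py.File("example.mat", "r") as f:  # MATLAB v7.3 .mat files are HDF5
        for name, dset in f.items():
            print(name, getattr(dset, "shape", None), getattr(dset, "dtype", None))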

    Best Wishes, Yue

    opened by tangyuelm 0
  • dataset problem


    Hello,

    I am interested in your work, but I cannot obtain the AFRL dataset. I tried to contact the authors of [33] in your paper, but no one has replied. Do you know how to access this dataset?

    Besides, I have successfully received the VIPL-HR v1 dataset, and I am wondering whether you can provide the benchmark on that dataset.

    Best Wishes, Yue

    opened by tangyuelm 0
  • TS-CAN and Hybrid-Can dataloaders


    From the paper I see that TS-CAN and Hybrid-CAN have the same input (the appearance branch takes an averaged 1x36x36x3 frame and the motion branch takes 10x36x36 normalized frames), yet there are two different loaders for Hybrid-CAN and TS-CAN in the dataloaders.py file. I fail to understand why there is a difference between them. Any clue would help.
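    For context, a sketch of the two-branch input described in the paper (the repo's dataloaders may differ, which is what this issue asks about):

    import numpy as np

    frames = np.random.rand(10, 36, 36, 3).astype(np.float32)  # raw clip c(t)
    appearance = frames.mean(axis=0, keepdims=True)  # 1x36x36x3 averaged frame
    motion = (frames[1:] - frames[:-1]) / (frames[1:] + frames[:-1] + 1e-8)
    motion /= motion.std() + 1e-8  # normalized difference frames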

    opened by B-Acharya 0
Owner
Xin Liu
CS PhD student at the University of Washington, Seattle. Research interests: Mobile Health, Machine Learning and Sensing for Healthcare, and HCI.