EssentialMC2 Video Understanding

Introduction

EssentialMC2 is a complete system for video understanding tasks, including MHRL (representation learning), MECR2 (relation reasoning), and MOSL3 (open-set life-long learning), powered by the DAMO Academy MinD (Machine Intelligence of DAMO) Lab.

Features

  • Simple and easy to use
  • High efficiency
  • Includes SOTA papers presented by DAMO Academy
  • Includes various pretrained models

Installation

Install by pip

Run pip install essmc2.

Install from source

Requirements
  • Python 3.6+
  • PyTorch 1.5+

Run python setup.py install. For each specific task, please refer to the task-specific README.
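
As a quick sanity check after either installation path, you can try importing the package. This is a minimal sketch, assuming the import name matches the package name essmc2; the __version__ attribute may not be present in every release:

    # Verify the installation by importing the package.
    import essmc2

    # Fall back to a plain message if this release does not expose __version__.
    print(getattr(essmc2, "__version__", "essmc2 imported successfully"))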

Model Zoo

Will be released soon!

SOTA Tasks

Task            Paper
ICCV2021-NGC    link
CVPR2021-MoSI   link

License

EssentialMC2 is released under the MIT License.

MIT License

Copyright (c) 2021 Alibaba

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Acknowledgement

EssentialMC2 is an open-source project contributed to by researchers from DAMO Academy. We appreciate users who provide valuable feedback.

Comments
  • The dataset files for open-set noise experiments with Places-365 or TinyImageNet

    Thanks for uploading your source code of NGC.

    While trying to reproduce the results in Table 2, where you report experiments with OOD dataset noise (e.g., from Places-365 or TinyImageNet), I found that Places-365 (https://github.com/alibaba/EssentialMC2/blob/main/papers/ICCV2021-NGC/impls/datasets/cifar_noisy_openset_dataset.py#L43) and TinyImageNet (https://github.com/alibaba/EssentialMC2/blob/main/papers/ICCV2021-NGC/impls/datasets/cifar_noisy_openset_dataset.py#L45) are loaded from .npy files, but I cannot find these files anywhere in the repository.

    Could you upload the missing .npy files so the results can be fully reproduced? Thank you for your attention; I would appreciate any reply.

    opened by Openning07 3
  • Get find_unused_parameters from config

    What do these changes do?

    For models that contain unused parameters (Wav2Vec2 from Hugging Face, etc.), the original code raises an error because find_unused_parameters in DistributedDataParallel defaults to False. This change makes the parameter take the correct value from the config file to avoid the error. A minimal sketch of the idea is shown after the comments list below.

    opened by Logistic1994 1
  • add minio storage system

    opened by jiangzeyinzi 1
  • update readme and submodules

    What do these changes do?

    Update the readme to reflect the latest status of included works, and update submodules to point to the latest git repo.

    Related issue number (Optional)

    No related issues.

    Use cases (Optional)

    No new features.

    opened by huang-ziyuan 1
  • Fix fs bug of unique tmp filename

    opened by jiangzeyinzi 1
  • fix links and update submodule

    What do these changes do?

    Fixed wrong links in the readme and updated the TAdaConv submodule.

    Related issue number (Optional)

    None

    Use cases (Optional)

    None

    opened by huang-ziyuan 0
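
The sketch below illustrates the idea behind the "Get find_unused_parameters from config" change above. It is not EssentialMC2's actual code; the config dictionary, its key names, and the wrap_model_ddp helper are hypothetical, and only the standard PyTorch DistributedDataParallel API is assumed:

    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_model_ddp(model: nn.Module, dist_cfg: dict, device_id: int) -> DDP:
        # dist_cfg stands in for the parsed distributed section of a training
        # config (hypothetical); find_unused_parameters defaults to False when
        # absent, matching DistributedDataParallel's own default.
        find_unused = dist_cfg.get("find_unused_parameters", False)
        return DDP(
            model.to(device_id),
            device_ids=[device_id],
            # Models with parameters that never receive gradients (e.g. Wav2Vec2)
            # need this set to True, otherwise DDP raises an error during backward.
            find_unused_parameters=find_unused,
        )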