Episodic-memory - Ego4D Episodic Memory Benchmark

Last update: Feb 18, 2022

Ego4D is the world's largest egocentric (first-person) video ML dataset and benchmark suite.

For more information on Ego4D or to download the dataset, read: Start Here.

The Episodic Memory Benchmark aims to make past video queryable: each task requires localizing where the answer can be seen within the user's past video. The repository contains the code needed to reproduce the results in the paper Ego4D: Around the World in 3,000 Hours of Egocentric Video.

The benchmark comprises four related tasks. Please see the README within each task for details on setting up the codebase.

VQ2D: Visual Queries with 2D Localization

This task asks: “When did I last see [this]?” Given an egocentric video clip and an image crop depicting the query object, the goal is to return the last occurrence of the object in the input video, in terms of the tracked bounding box (2D + temporal localization). The novelty of this task is to upgrade traditional object instance recognition to deal with video, and particularly ego-video with challenging view transformations.
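To make the expected output concrete, here is a minimal sketch of extracting the "last occurrence" from per-frame detections. It assumes a hypothetical input format (one box or None per frame) and a hypothetical function name; it is not the data format or method used in the Ego4D codebase.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # hypothetical (x, y, width, height) format


def last_occurrence_track(
    per_frame_boxes: List[Optional[Box]],
) -> Optional[Tuple[int, int, List[Box]]]:
    """Return (start_frame, end_frame, boxes) for the last contiguous run of
    frames in which the query object was detected, or None if it never appears."""
    # Walk back from the end of the clip to the last frame with a detection.
    end = len(per_frame_boxes) - 1
    while end >= 0 and per_frame_boxes[end] is None:
        end -= 1
    if end < 0:
        return None

    # Extend the run backwards while the preceding frames also have a detection.
    start = end
    while start > 0 and per_frame_boxes[start - 1] is not None:
        start -= 1

    return start, end, per_frame_boxes[start:end + 1]
```

The returned frame range gives the temporal localization, and the list of boxes gives the 2D localization within that range.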

VQ3D: Visual Queries with 3D Localization

This task asks, “Where did I last see [this]?” Given an egocentric video clip and an image crop depicting the query object, the goal is to localize the last time it was seen in the video and return a 3D displacement vector from the camera center of the query frame to the center of the object in 3D. Hence, this task builds on the 2D localization above, expanding it to require localization in the 3D environment. The task is novel in how it requires both video object instance recognition and 3D reasoning.
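The displacement vector can be illustrated with a short sketch. It assumes a hypothetical 4x4 world-to-camera pose matrix for the query frame and an object center given in world coordinates; the function name and coordinate conventions are illustrative assumptions, not the Ego4D implementation.

```python
import numpy as np


def displacement_in_query_frame(world_to_camera, object_center_world):
    """Return the 3D vector from the query-frame camera center to the object
    center, expressed in the query frame's camera coordinates.

    world_to_camera: assumed 4x4 homogeneous transform for the query frame.
    object_center_world: assumed (x, y, z) object center in world coordinates.
    """
    pose = np.asarray(world_to_camera, dtype=float)
    # Homogeneous coordinates let one matrix product apply rotation and translation.
    point = np.append(np.asarray(object_center_world, dtype=float), 1.0)
    # The camera center is the origin of the camera frame, so the object's
    # position in that frame is already the displacement vector.
    return (pose @ point)[:3]
```

Expressing the vector in the query frame's coordinates means the answer is relative to where the camera wearer is when asking, rather than to a fixed world origin.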

NLQ: Natural Language Queries

This task asks, "What/when/where...?": general natural language questions about the past video. Given a video clip and a query expressed in natural language, the goal is to localize the temporal window within all the video history where the answer to the question is evident. The task is novel because it requires searching through video to answer flexible linguistic queries. For brevity, these example clips illustrate the video surrounding the ground truth (whereas the original input videos are each ~8 min).
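One common way to compare a predicted temporal window against the ground truth is temporal intersection-over-union. The sketch below assumes windows given as (start, end) pairs in seconds; the text above does not state which metric the benchmark uses, so treat this as an illustration rather than the official scoring.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two time windows given as (start, end) in seconds."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the windows do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0
```

For example, a prediction of (10, 20) against a ground truth of (15, 25) overlaps for 5 seconds out of a 15 second union, giving an IoU of one third.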

MQ: Moments Queries

This task asks, "When did I do X?" Given an egocentric video and an activity name (i.e., a "moment"), the goal is to localize all instances of that activity in the past video. The task is activity detection, but specifically for the egocentric activity of the camera wearer, who is largely out of view.
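Unlike the visual query tasks, which return only the last occurrence, this task returns every matching segment. A minimal sketch of that output shape follows, using a hypothetical Moment record and function name that are not taken from the Ego4D codebase.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Moment:
    """Hypothetical record for one detected activity segment."""
    label: str
    start_sec: float
    end_sec: float


def find_moments(detections: Iterable[Moment], activity: str) -> List[Moment]:
    """Return every detected segment of the named activity, ordered by start time."""
    matches = [m for m in detections if m.label == activity]
    return sorted(matches, key=lambda m: m.start_sec)
```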

License

Ego4D is released under the MIT License.
