Activity image-based video retrieval

Overview

Cross-modal-retrieval

Our approach is focus on Activity Image-to-Video Retrieval (AIVR) task. The compared methods are state-of-the-art single modality hashing methods, multiple modalities hashing methods and cross-modal retrieval methods.

Single modality hashing methods

Some hashing baselines for image retrieval can be found in https://github.com/willard-yuan/hashing-baseline-for-image-retrieval.

Multiple modalities hashing methods

More details refer to https://github.com/czxxjtu/Hash-Learning.github.io. Some details about hashing methods are in hashing-baseline-for-image-retrieval-master folder.

Cross-modal retrieval methods

The compared cross-modal retrieval methods are according to the paper:

Datasets

THUMOS'14 Dataset:

https://pan.baidu.com/s/1H6c8nh_Hs7gVkhESpxtvAg 提取码:qp26

ActivityNet Dataset:

https://pan.baidu.com/s/1P0jRecEmplCPaTPwFoOpVQ 提取码:pnw9

Bibtex

When using images from our dataset, please cite our paper using the following BibTeX[PDF]

@article{pba2020,
author    = {Ruicong Xu and Li Niu and Jianfu Zhang and Liqing Zhang},
title     = {A Proposal-based Approach for Activity Image-to-Video Retrieval},
journal   = {AAAI},
year      = {2020}}
You might also like...
Group Activity Recognition with Clustered Spatial Temporal Transformer

GroupFormer Group Activity Recognition with Clustered Spatial-TemporalTransformer Backbone Style Action Acc Activity Acc Config Download Inv3+flow+pos

PyZebrascope - an open-source Python platform for brain-wide neural activity imaging in behaving zebrafish
PyZebrascope - an open-source Python platform for brain-wide neural activity imaging in behaving zebrafish

PyZebrascope - an open-source Python platform for brain-wide neural activity imaging in behaving zebrafish

HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records
HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records

HiPAL Code for KDD'22 Applied Data Science Track submission -- HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electro

Activity tragle - Google is tracking everything, we just look at it
Activity tragle - Google is tracking everything, we just look at it

activity_tragle Google is tracking everything, we just look at it here. You need

Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback
Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback

CoSMo.pytorch Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback, Seungmin Lee*, Dongwan Kim*, Bohyung

Code for 'Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning', ICCV 2021
Code for 'Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning', ICCV 2021

CMIC-Retrieval Code for Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning. ICCV 2021. Introduction In this wo

Instance-level Image Retrieval using Reranking Transformers

Instance-level Image Retrieval using Reranking Transformers Fuwen Tan, Jiangbo Yuan, Vicente Ordonez, ICCV 2021. Abstract Instance-level image retriev

 Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking

Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking We revisit and address issues with Oxford 5k and Paris 6k image retrieval benchm

cisip-FIRe - Fast Image Retrieval

Fast Image Retrieval (FIRe) is an open source image retrieval project release by Center of Image and Signal Processing Lab (CISiP Lab), Universiti Malaya. This project implements most of the major binary hashing methods to date, together with different popular backbone networks and public datasets.

Owner
BCMI
Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University.
BCMI
A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval project page | arXiv | webvid-data Repository containing the code,

null 225 Dec 25, 2022
Source code of our TTH paper: Targeted Trojan-Horse Attacks on Language-based Image Retrieval.

Targeted Trojan-Horse Attacks on Language-based Image Retrieval Source code of our TTH paper: Targeted Trojan-Horse Attacks on Language-based Image Re

fine 7 Aug 23, 2022
[2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval

CONQUER: Contexutal Query-aware Ranking for Video Corpus Moment Retreival PyTorch implementation of CONQUER: Contexutal Query-aware Ranking for Video

Hou zhijian 23 Dec 26, 2022
Fake videos detection by tracing the source using video hashing retrieval.

Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos ??️ ?? Directory Introduction VTL Trace Samples and Acc of Hash

null 56 Dec 22, 2022
Near-Duplicate Video Retrieval with Deep Metric Learning

Near-Duplicate Video Retrieval with Deep Metric Learning This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retr

null 2 Jan 24, 2022
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

Disentangled Representation Learning for Text-Video Retrieval This is a PyTorch implementation of the paper Disentangled Representation Learning for T

Qiang Wang 49 Dec 18, 2022
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Unified Multi-modal Transformers This repository maintains the official implementation of the paper UMT: Unified Multi-modal Transformers for Joint Vi

Applied Research Center (ARC), Tencent PCG 84 Jan 4, 2023
Video-Captioning - A machine Learning project to generate captions for video frames indicating the relationship between the objects in the video

Video-Captioning - A machine Learning project to generate captions for video frames indicating the relationship between the objects in the video

null 1 Jan 23, 2022
This is the research repository for Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition.

Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition This is the research repository for Vid2

Future Interfaces Group (CMU) 26 Dec 24, 2022
Shallow Convolutional Neural Networks for Human Activity Recognition using Wearable Sensors

-IEEE-TIM-2021-1-Shallow-CNN-for-HAR [IEEE TIM 2021-1] Shallow Convolutional Neural Networks for Human Activity Recognition using Wearable Sensors All

Wenbo Huang 1 May 17, 2022