VCM EE1.2 P-layer feature map anchor generation 137th MPEG-VCM

Overview

VCM_EE1.2_P-layer_feature_map_anchor_generation_137th_MPEG-VCM

#######################################################

Author: Minhun Lee, Hansol Choi, Seungjin Park, Minsub Kim, and Donggyu Sim

E-mail: {minhun, hschoi95, promo, minsub, dgsim}@kw.ac.kr

####################################################### [Introduction]

This package contains scripts to generate anchor results of object detection on P-layer features (p2, p3, p4, p5) extracted from OpenImages dataset for MPEG Video Coding for Machines(VCM).

Please note that this test procedure is organized based on Nokia's latest contribution(m57343) for generating VCM anchor on the OpenImages dataset V6.

####################################################### [Software environment]

Ubuntu 20.04.1 LTS

Python 3.8.11

Torch 1.9.0

Detectron2 0.5

Object-detection 0.1

Pandas 1.3.3

Numpy 1.21.2

Opencv-python 4.5.3.56

Pillow 8.3.1

ffmpeg 4.4

VTM 12.0

####################################################### [Faster-RCNN model parameter]

Download the Faster-RCNN model parameters from the following link: https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x/139173657/model_final_68b088.pkl

Place the downloaded model_final_68b088.pkl file in the models/x101fpn/ directory.

####################################################### [OpenImages V6 Dataset]

Download OpenImages V6 validation set according to the instruction from the following webpage: https://storage.googleapis.com/openimages/web/challenge2019_downloads.html

The downloaded validation.tar.gz file has size of 12G bytes and contains 41620 jpg images. Untar this file to directory dataset/validation

For annotations, we have already set files to './dataset/annotations/' and './oi_eval/' directories. For dataset, have to move only 5k images of the OpenImages dataset V6 to './dataset/val_openimage_v6/*.jpg' directory as below.

####################################################### [Dataset directory structure]

./dataset/val_openimage_v6/ 0a1bd356f90aaab6.jpg ... ffddf3805faf3cbf.jpg # only 5k images

####################################################### [Instructions]

Please run 'demo.sh' script to generate P-layer anchor results. The outputs will be stored in './feature/' and './output/' directories which are generated automatically, and the results from our experiments are also included in 'P-layer_anchor_report.xlsm' file. This top procedure consits of three phases as below.

In the first phase, the P-layer features are extracted from the faster_rcnn_X_101_32x8d_FPN_3 network, and the extracted P-layer features are stored as YUV 4:0:0 format using FFmpeg (png to yuv) after tiling and uniform quantization (10-bits). For feature tiling into YUV 4:0:0, we arranged 256 channels of the p2, p3, p4, and p5 feature maps in a raster scanning order, respectively, so that each YUV 4:0:0 data includes 2D feature for each input image. For the uniform quantisation process, we measured the global maximum and minimum values in the P-layer features over the whole dataset, and the the global maximum and minimum values were 20.3891 and -22.3948, respectively.

In the second phase, the YUV format data are encoded and then decoded via VTM 12.0 software with six different QP values, 35, 37, 39, 41, 43 and 45. Here we store the encoded bitstreams ('./feature/{QP}_bit/') and the reconstructed YUV format data ('./feature/{QP}_rec/') and the original feature map data ('./feature/{QP}_ori/') in the designated directory for each QP value. In addition, please note that we actually performed the encoding jobs in a parallel manner using threading, the thread is setting the default value '4', you can change the value at each './settings/{QP}.json' file.

In the thrid phase, we calculate the bit-per-pixel(bpp) and measure the mAP performance for each QP, based on the bitstreams and the reconstructions generated at the phase two. And the result files are stored './output/{QP}_AP.txt' for each qp value.

You might also like...
Code for Crowd counting via unsupervised cross-domain feature adaptation.

CDFA-pytorch Code for Unsupervised crowd counting via cross-domain feature adaptation. Pre-trained models Google Drive Baidu Cloud : t4qc Environment

Arcpy Tool developed for ArcMap 10.x that checks DVOF points against TDS data and creates an output feature class as well as a check database.

DVOF_check_tool Arcpy Tool developed for ArcMap 10.x that checks DVOF points against TDS data and creates an output feature class as well as a check d

Magenta: Music and Art Generation with Machine Intelligence
Magenta: Music and Art Generation with Machine Intelligence

Magenta is a research project exploring the role of machine learning in the process of creating art and music. Primarily this involves developing new

The next generation Canto RSS daemon

Canto Daemon This is the RSS backend for Canto clients. Canto-curses is the default client at: http://github.com/themoken/canto-curses Requirements De

Procedural 3D data generation pipeline for architecture
Procedural 3D data generation pipeline for architecture

Synthetic Dataset Generator Authors: Stanislava Fedorova Alberto Tono Meher Shashwat Nigam Jiayao Zhang Amirhossein Ahmadnia Cecilia bolognesi Dominik

python DroneCAN code generation, interface and utilities

UAVCAN v0 stack in Python Python implementation of the UAVCAN v0 protocol stack. UAVCAN is a lightweight protocol designed for reliable communication

An Agora Python Flask token generation server

A Flask Starter Application with Login and Registration About A token generation Server using the factory pattern and Blueprints. A forked stripped do

Python client SDK designed to simplify integrations by automating key generation and certificate enrollment using Venafi machine identity services.
Python client SDK designed to simplify integrations by automating key generation and certificate enrollment using Venafi machine identity services.

This open source project is community-supported. To report a problem or share an idea, use Issues; and if you have a suggestion for fixing the issue,

Hopefully the the next-generation backend server of bgm.tv

Hopefully the the next-generation backend server of bgm.tv

Owner
IPSL
Welcome to the Image Processing Systems Laboratory. The IPSL was established in the department of Computer Engineering at Kwangwoon University in 2005.
IPSL
Feature engineering library that helps you keep track of feature dependencies, documentation and schema

Feature engineering library that helps you keep track of feature dependencies, documentation and schema

null 28 May 31, 2022
nbsafety adds a layer of protection to computational notebooks by solving the stale dependency problem when executing cells out-of-order

nbsafety adds a layer of protection to computational notebooks by solving the stale dependency problem when executing cells out-of-order

null 150 Jan 7, 2023
TrackGen - The simplest tropical cyclone track map generator

TrackGen - The simplest tropical cyclone track map generator Usage Each line is a point to be plotted on the map Each field gives information about th

TrackGen 6 Jul 20, 2022
Easily map device and application controls to a midi controller

pymidicontroller Introduction Easily map device and application controls to a midi controller

Tane Barriball 24 May 16, 2022
Prop-based map editor for the Apex Legends mod, R5Reloaded

R5R Map Editor A tool to build maps out of props in the Apex Legends mod, R5Reloaded Instuctions Install R5R Download this program Get the prop spawne

null 7 Dec 16, 2022
Cvdl-hw2 - Find Contour, Camera Calibration, Augmented Reality and Stereo Disparity Map

opevcvdl-hw2 This project uses openCV and Qt to achieve the requirements. Version Python 3.7 opencv-contrib-python 3.4.2.17 Matplotlib 3.1.1 pyqt5 5.1

Kenny Cheng 3 Aug 17, 2022
Project aims to map out common user behavior on the computer

User-Behavior-Mapping-Tool Project aims to map out common user behavior on the computer. Most of the code is based on the research by kacos2000 found

trustedsec 136 Dec 23, 2022
A tool for generating skill map/tree like diagram

skillmap A tool for generating skill map/tree like diagram. What is a skill map/tree? Skill tree is a term used in video games, and it can be used for

Yue 98 Jan 7, 2023
Virtual webcam that takes real webcam footage and replaces the background in order to have Virtual Backgrounds in MS Teams for Linux where the feature is unimplemented.

Background Remover The Need It's been good long while since Microsoft first released a Teams version for Linux and yet, one of Teams' coolest features

Dylan Turner 80 Dec 20, 2022
A C-like hardware description language (HDL) adding high level synthesis(HLS)-like automatic pipelining as a language construct/compiler feature.

██████╗ ██╗██████╗ ███████╗██╗ ██╗███╗ ██╗███████╗ ██████╗ ██╔══██╗██║██╔══██╗██╔════╝██║ ██║████╗ ██║██╔════╝██╔════╝ ██████╔╝██║██████╔╝█

Julian Kemmerer 391 Jan 1, 2023