TransNet V2: Shot Boundary Detection Neural Network

Tomáš Souček

Last update: Dec 27, 2022

Overview

This repository contains code for TransNet V2: An effective deep network architecture for fast shot transition detection.

Our reevaluation of other publicly available state-of-the-art shot boundary methods (F1 scores):

Model                                   ClipShots  BBC Planet Earth  RAI
TransNet V2 (this repo)                 77.9       96.2              93.9
TransNet (github)                       73.5       92.9              94.3
Hassanien et al. (github)               75.9       92.6              93.9
Tang et al., ResNet baseline (github)   76.1       89.3              92.8
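
For reference, an F1 score is the harmonic mean of precision and recall. The sketch below computes it from raw counts. How predicted transitions are matched to ground-truth transitions is not specified here, so the counts are taken as given, and the function name is illustrative only. The table reports the value multiplied by 100.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return the F1 score in [0, 1] from raw detection counts.

    F1 = 2 * TP / (2 * TP + FP + FN), which equals the harmonic mean
    of precision (TP / (TP + FP)) and recall (TP / (TP + FN)).
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No transitions predicted and none in the ground truth.
        return 0.0
    return 2 * true_positives / denominator


if __name__ == "__main__":
    # Hypothetical counts, not taken from any of the datasets above.
    print(round(100 * f1_score(8, 2, 2), 1))
```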

🎥 USE IT ON YOUR VIDEO

➡️ See inference folder and its README file. ⬅️

REPLICATE THE WORK

Note that the training datasets are tens of gigabytes in size, and hundreds of gigabytes when exported.

You do not need to train the network; use the code and instructions in the inference folder to detect shots in your videos.

This repository contains everything needed to run any TransNet V2 experiment, including network training and dataset creation. All experiments should be runnable in this NVIDIA DOCKER file.

In general, the following steps are needed to replicate our work (in the training folder):

1. Download RAI and BBC Planet Earth test datasets (link). Download ClipShots train/test dataset (link). Optionally get IACC.3 dataset.
2. Edit and run consolidate_datasets.py in order to transform ground truth from all the datasets into one common format.
3. Take some videos from ClipShotsTrain aside as a validation dataset.
4. Run create_dataset.py to create all train/validation/test datasets.
5. Run training.py ../configs/transnetv2.gin to train a model.
6. Run evaluate.py /path/to/run_log_dir epoch_no /path/to/test_dataset for proper evaluation.
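
The step of setting some ClipShotsTrain videos aside for validation can be sketched as a reproducible random split. This is only an illustration: the function name, the 5% default fraction, and the seed are assumptions, not values taken from this repository.

```python
import random


def split_train_validation(video_names, validation_fraction=0.05, seed=0):
    """Split video names into (train, validation) lists, reproducibly.

    The fraction and seed are illustrative defaults, not the authors' values.
    """
    names = sorted(video_names)          # fixed order before shuffling
    rng = random.Random(seed)            # local RNG, global state untouched
    rng.shuffle(names)
    n_validation = round(len(names) * validation_fraction)
    if names and n_validation == 0:
        n_validation = 1                 # keep at least one validation video
    validation = sorted(names[:n_validation])
    train = sorted(names[n_validation:])
    return train, validation


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    videos = [f"video_{i:03d}.mp4" for i in range(100)]
    train, validation = split_train_validation(videos)
    print(len(train), len(validation))
```

Fixing the seed keeps the validation set stable across runs, so results from different training runs stay comparable.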

CREDITS

If you find this work useful, please cite us ;)

This paper: TransNet V2: An effective deep network architecture for fast shot transition detection

@article{soucek2020transnetv2,
    title={TransNet V2: An effective deep network architecture for fast shot transition detection},
    author={Sou{\v{c}}ek, Tom{\'a}{\v{s}} and Loko{\v{c}}, Jakub},
    year={2020},
    journal={arXiv preprint arXiv:2008.04838},
}

ACM Multimedia paper of the older version: A Framework for Effective Known-item Search in Video
The older version paper: TransNet: A deep network for fast detection of common shot transitions

Comments

DecodeError: Error parsing message

Hello, I'm trying to run the model, but I get the following error:

020-06-22 00:20:45.880699: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
[TransNetV2] Using weights from transnetv2-weights/.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 98, in parse_saved_model
    saved_model.ParseFromString(file_content)
google.protobuf.message.DecodeError: Error parsing message

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "transnetv2.py", line 188, in <module>
    main()
  File "transnetv2.py", line 160, in main
    model = TransNetV2(args.weights)
  File "transnetv2.py", line 17, in __init__
    self._model = tf.saved_model.load(model_dir)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/load.py", line 578, in load
    return load_internal(export_dir, tags)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/load.py", line 588, in load_internal
    loader_impl.parse_saved_model_with_debug_info(export_dir))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 56, in parse_saved_model_with_debug_info
    saved_model = _parse_saved_model(export_dir)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 101, in parse_saved_model
    raise IOError("Cannot parse file %s: %s." % (path_to_pb, str(e)))
OSError: Cannot parse file b'transnetv2-weights/saved_model.pb': Error parsing message.

Any idea what could be wrong? Thanks.

opened by jorgeecr 8
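
One possible cause of this error, assuming the weights were obtained by cloning a repository that stores them with Git LFS: without Git LFS installed, saved_model.pb is only a small text pointer rather than the real protobuf file, so parsing fails. The check below is a minimal sketch. The helper name is hypothetical, and the path in the usage comment is the one from the error message above.

```python
# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Example (path taken from the error message above):
# if looks_like_lfs_pointer("transnetv2-weights/saved_model.pb"):
#     print("Pointer file found; fetch the real weights, e.g. with `git lfs pull`.")
```

If the check returns False and the file still fails to parse, an incomplete or corrupted download is another possibility worth ruling out.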

About weight download

Is there a link where I can download the weights manually? The download always times out when my server pulls them. I look forward to hearing from you soon. Thank you very much.

opened by Scr-w 5

ffmpeg._run.Error

Traceback (most recent call last):
  File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 193, in <module>
    main()
  File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 173, in main
    model.predict_video(file)
  File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 83, in predict_video
    video_stream, err = ffmpeg.input(video_fn).output(
  File "/home/tom/projects/Studium/Studienarbeit/cutting/env/lib/python3.9/site-packages/ffmpeg/_run.py", line 325, in run
    raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)

I get an ffmpeg error when I run the following command:

python transnetv2.py /mnt/e/Studium/Studienarbeit/Videos/2021/reg/17/c597512d-b37c-11eb-ba8a-ecb6fe06b3b0/highlightsVideo/video/video.mp4 [--visualize]

That's my ffmpeg:

ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
  configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared

Is there any problem with my ffmpeg build?

opened by Schubert-Tom 4
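
The exception above only says to look at stderr. With the ffmpeg-python package that appears in the traceback, ffmpeg.Error carries the captured output in its stderr attribute, provided run() was called with capture_stderr=True. The helper below is a hedged sketch (its name is made up here) that extracts the last non-empty lines of that output, which usually contain the actual reason for the failure.

```python
def stderr_tail(error, max_lines=15):
    """Return the last non-empty stderr lines attached to an exception.

    Works with any exception exposing a `stderr` attribute holding bytes
    or text (as ffmpeg.Error does); returns "" when nothing was captured.
    """
    raw = getattr(error, "stderr", None)
    if raw is None:
        return ""
    if isinstance(raw, bytes):
        text = raw.decode("utf-8", errors="replace")
    else:
        text = str(raw)
    lines = [line for line in text.splitlines() if line.strip()]
    return "\n".join(lines[-max_lines:])


# Usage sketch with ffmpeg-python ("video.mp4" is a placeholder path):
# import ffmpeg
# try:
#     ffmpeg.input("video.mp4").output("pipe:", format="rawvideo").run(
#         capture_stdout=True, capture_stderr=True)
# except ffmpeg.Error as error:
#     print(stderr_tail(error))
```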

How to configure the transnetv2.gin and reproduce the F1 of 77.9 on the ClipShots test set?

Hi Tomáš,

Your work helps me a lot in understanding your great paper! Thank you so much!

I downloaded the ClipShots training and training-transitions datasets and processed them according to https://github.com/soCzech/TransNetV2/blob/master/training/consolidate_datasets.py and https://github.com/soCzech/TransNetV2/blob/master/training/create_dataset.py

I downloaded the ClipShots test dataset and processed it accordingly.

I also downloaded the IACC.3 dataset and processed it with the type "train".

I added the ClipShots training, training-transitions and IACC.3 datasets to "options.trn_files" in https://github.com/soCzech/TransNetV2/blob/master/configs/transnetv2.gin, and added the ClipShots test dataset to "options.tst_files". I also changed "options.n_epochs" to 50, as indicated in the paper.

However, I can only obtain an F1 of 0.74. Could you please give more training details and instructions on how to reproduce 77.9 on the test set?

What do the file names in "options.tst_files" mean, and how are these files generated?

I also used the pretrained weights in https://github.com/soCzech/TransNetV2/tree/master/inference/transnetv2-weights to evaluate the ClipShots test dataset by changing "options.restore" and "options.test_only" to True in https://github.com/soCzech/TransNetV2/blob/master/configs/transnetv2.gin. I only get an F1 of 0.2545 and cannot reproduce 77.9.

I appreciate your great help so much!

Wentao

opened by wentaozhu 4

OpError: not an sstable (bad magic number)

Hi, I tried to run the code as follows.

from transnetv2 import TransNetV2
model = TransNetV2()

The first time, it showed a parse error. I re-downloaded the .pb model, and now it shows OpError: not an sstable (bad magic number). I'm not sure what happened.

I also tried TransNet (https://github.com/soCzech/TransNet). It works pretty well, and I like the visualization. Could you please save TransNet and its weights as a .pb model? I want to use it with OpenCV and C++. Do you have any suggestions on how to use it with the OpenCV dnn module? Given a video, how should I prepare the input for the model? Thank you very much.

opened by fantasycz 3

path to scenes gt for -mapping in create_train_dataset

Hi, first of all, thank you very much for this work. I have a question that may be a bit silly, but I couldn't find an answer. I would like to run the code corresponding to create_train_dataset here, but I can't understand how to fill in the mapping_fn parameter: could you tell me what /path/to/scenes/gt is, please? :) I really have no idea which file to use...

opened by jeannedionisi 2

How to get synthesis dataset

Hi, in the paper you mention that synthetic data is used and boosts performance, but I couldn't find the code to render transitions. Could you please provide it?

opened by zhanghaoxin1994 2

OSError: Cannot parse file b'./saved_model.pb': Error parsing message.

After downloading all the files and the trained model/weights manually, I ran transnetv2.py in the inference folder, but it raised an exception: OSError: Cannot parse file b'/path/to/saved_model.pb': Error parsing message. The TensorFlow version is v2.1.0.

I also ran the docker command to build an image, following the readme.md in inference. The Dockerfile built successfully, but when I ran the command from the readme.md to test a video, it raised the same exception.

How can I solve this problem? Thanks.

opened by LuyuProgram 2

Working with longer dissolves and 60fps videos?

Hi Tomas,

Thank you so much for putting this code online. You did an excellent job on a model that is still considered SOTA two years after it was posted!

I'm testing the model on a certain dataset and wanted to ask a few questions:

1. Are you able to share the .h5 file of your training? I want to try to fine-tune the model on my data and see whether that gives better accuracy than training only on my dataset.

2. I'm working with 60 fps video, and my dissolves sometimes reach ~70 frames. What would you suggest I change if I want the model to be augmented with longer transitions? More generally, would you suggest any changes for higher-fps videos and transitions?

I'd really appreciate your help! Happy holidays!

opened by fschvart 1

training.py can't load models.py

Hi,

Thank you so much for putting this model out. Excellent work! I'm trying to train the model, and I've run into a problem that points to the models.py file. When I try to run training.py with the gin file, this is the error message I get:

Traceback (most recent call last):
  File "C:\videoseg\TransNetV2\training\training.py", line 10, in <module>
    import models
  File "C:\videoseg\TransNetV2\training\models.py", line 168, in <module>
    @gin.configurable(blacklist=["name"])
TypeError: configurable() got an unexpected keyword argument 'blacklist'

Do you have a sense of what could be the problem? Thanks!
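
A likely cause, stated as an assumption rather than a confirmed diagnosis: newer gin-config releases renamed the blacklist and whitelist arguments of gin.configurable to denylist and allowlist, so the decorator in models.py no longer accepts the old keyword. Either install an older gin-config release or rename the keyword. The sketch below (helper name hypothetical) rewrites the legacy keywords in single-line decorators of a source string.

```python
import re

# Legacy gin.configurable keyword -> its replacement name.
_LEGACY_KEYWORDS = {"blacklist": "denylist", "whitelist": "allowlist"}
_KEYWORD_PATTERN = re.compile(r"\b(blacklist|whitelist)(\s*=)")


def modernize_gin_decorators(source):
    """Rename legacy keywords on lines starting with @gin.configurable(.

    Only single-line decorators are handled; all other lines are
    returned unchanged.
    """
    fixed_lines = []
    for line in source.splitlines(keepends=True):
        if line.lstrip().startswith("@gin.configurable("):
            line = _KEYWORD_PATTERN.sub(
                lambda match: _LEGACY_KEYWORDS[match.group(1)] + match.group(2),
                line,
            )
        fixed_lines.append(line)
    return "".join(fixed_lines)


if __name__ == "__main__":
    print(modernize_gin_decorators('@gin.configurable(blacklist=["name"])\n'), end="")
```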

opened by fschvart 1

About weight transfer

The open-source model is saved in pb format, and the training model is saved in hd5 format. How is the hd5 format converted to the pb format? Is it saved directly to pb format during training, or is it converted afterward?

opened by emmating12 1

Owner

Tomáš Souček

Currently Research Fellow at Charles University, Machine Learning Researcher at SANEZOO and Avast

GitHub https://arxiv.org/abs/2008.04838

BMN: Boundary-Matching Network

BMN: Boundary-Matching Network A pytorch-version implementation codes of paper: "BMN: Boundary-Matching Network for Temporal Action Proposal Generatio

260 Dec 6, 2022

A public available dataset for road boundary detection in aerial images

Topo-boundary This is the official github repo of paper Topo-boundary: A Benchmark Dataset on Topological Road-boundary Detection Using Aerial Images

79 Jan 4, 2023

Generic Event Boundary Detection: A Benchmark for Event Segmentation

Generic Event Boundary Detection: A Benchmark for Event Segmentation We release our data annotation & baseline codes for detecting generic event bound

47 Nov 22, 2022

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition

17 Dec 12, 2022

[CVPR21] LightTrack: Finding Lightweight Neural Network for Object Tracking via One-Shot Architecture Search

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search The official implementation of the paper LightTra

290 Dec 24, 2022

GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks

GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks This repository implements a capsule model Inten

15 Dec 24, 2022

Boundary IoU API (Beta version)

Boundary IoU API (Beta version) Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, Alexander Kirillov [arXiv] [Project] [BibTeX] This API is

177 Dec 29, 2022

Code for CVPR2021 paper "Learning Salient Boundary Feature for Anchor-free Temporal Action Localization"

AFSD: Learning Salient Boundary Feature for Anchor-free Temporal Action Localization This is an official implementation in PyTorch of AFSD. Our paper

146 Dec 24, 2022

code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation (CVPR 2021) Introduction PBR is a conceptually simple yet effective

143 Jan 5, 2023

Out-of-boundary View Synthesis towards Full-frame Video Stabilization

Out-of-boundary View Synthesis towards Full-frame Video Stabilization Introduction | Update | Results Demo | Introduction This repository contains the

25 Oct 10, 2022

A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

11 Oct 8, 2022

Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples

Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples This repository is the official implementation of paper [Qimera: Data-free Q

21 Nov 3, 2022

[AAAI-2021] Visual Boundary Knowledge Translation for Foreground Segmentation

Trans-Net Code for (Visual Boundary Knowledge Translation for Foreground Segmentation, AAAI2021). [https://ojs.aaai.org/index.php/AAAI/article/view/16

2 Mar 4, 2022

Boundary-preserving Mask R-CNN (ECCV 2020)

BMaskR-CNN This code is developed on Detectron2 Boundary-preserving Mask R-CNN ECCV 2020 Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu Video

178 Nov 28, 2022

A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation

Paper Khoi Nguyen, Sinisa Todorovic "A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation", accepted to ICCV 2021 Our code is mai

5 Aug 14, 2022

Finite difference solution of 2D Poisson equation. Can handle Dirichlet, Neumann and mixed boundary conditions.

Poisson-solver-2D Finite difference solution of 2D Poisson equation Current version can handle Dirichlet, Neumann, and mixed (combination of Dirichlet

34 Dec 23, 2022

An official PyTorch Implementation of Boundary-aware Self-supervised Learning for Video Scene Segmentation (BaSSL)

72 Dec 28, 2022

This is a model made out of Neural Network specifically a Convolutional Neural Network model

This is a model made out of Neural Network specifically a Convolutional Neural Network model. This was done with a pre-built dataset from the tensorflow and keras packages. There are other alternative libraries that can be used for this purpose, one of which is the PyTorch library.

9 Oct 18, 2022

This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CNPs), Neural Processes (NPs), Attentive Neural Processes (ANPs).

The Neural Process Family This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CN

892 Dec 28, 2022