TransNet V2: Shot Boundary Detection Neural Network

Overview

This repository contains code for TransNet V2: An effective deep network architecture for fast shot transition detection.

Our reevaluation of other publicly available state-of-the-art shot boundary methods (F1 scores):

Model                                   ClipShots   BBC Planet Earth   RAI
TransNet V2 (this repo)                 77.9        96.2               93.9
TransNet (github)                       73.5        92.9               94.3
Hassanien et al. (github)               75.9        92.6               93.9
Tang et al., ResNet baseline (github)   76.1        89.3               92.8
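
For readers unfamiliar with the metric, the sketch below shows how an F1 score follows from counts of correctly detected, spurious, and missed transitions. It is a minimal illustration only: the function name and the example counts are hypothetical, and it does not reproduce how this repository matches predicted transitions to ground truth (for example, any frame tolerance used in evaluation).

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct detections: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from any dataset in the table above.
print(round(100 * f1_score(true_positives=90, false_positives=10, false_negatives=20), 1))  # 85.7
```

The value is multiplied by 100 here only to put it on the same scale as the table.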

🎥 USE IT ON YOUR VIDEO

➡️ See inference folder and its README file. ⬅️

REPLICATE THE WORK

Note that the training datasets are tens of gigabytes in size, and hundreds of gigabytes once exported.

You do not need to train the network yourself: use the code and instructions in the inference folder to detect shots in your videos.

This repository contains everything needed to run any TransNet V2 experiment, including network training and dataset creation. All experiments should be runnable in this NVIDIA Docker file.

In general, replicating our work requires the following steps (run from the training folder):

  1. Download RAI and BBC Planet Earth test datasets (link). Download ClipShots train/test dataset (link). Optionally get IACC.3 dataset.
  2. Edit and run consolidate_datasets.py in order to transform ground truth from all the datasets into one common format.
  3. Take some videos from ClipShotsTrain aside as a validation dataset.
  4. Run create_dataset.py to create all train/validation/test datasets.
  5. Run training.py ../configs/transnetv2.gin to train a model.
  6. Run evaluate.py /path/to/run_log_dir epoch_no /path/to/test_dataset for proper evaluation.
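
Step 3 leaves the choice of validation videos open. One way to make that split reproducible is sketched below, assuming the ClipShots training videos are available as a list of file names. The function name, the seed, and the example file names are hypothetical and are not part of the repository's scripts.

```python
import random
from typing import List, Tuple


def split_train_validation(
    video_files: List[str], n_validation: int, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Set aside n_validation videos; the same seed always gives the same split."""
    if not 0 <= n_validation <= len(video_files):
        raise ValueError("n_validation must be between 0 and the number of videos")
    # Sort first so the result does not depend on directory listing order.
    shuffled = sorted(video_files)
    random.Random(seed).shuffle(shuffled)
    validation = sorted(shuffled[:n_validation])
    train = sorted(shuffled[n_validation:])
    return train, validation


# Hypothetical file names, for illustration only.
files = [f"video_{i:03d}.mp4" for i in range(10)]
train_files, validation_files = split_train_validation(files, n_validation=3)
print(len(train_files), len(validation_files))  # 7 3
```

Fixing the seed keeps the validation set stable across runs, so results from different training runs stay comparable.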

CREDITS

If you find this work useful, please cite us ;)

Comments
  • DecodeError: Error parsing message

    Hello, I'm trying to run the model but I get the following error

    020-06-22 00:20:45.880699: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    [TransNetV2] Using weights from transnetv2-weights/.
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 98, in parse_saved_model
        saved_model.ParseFromString(file_content)
    google.protobuf.message.DecodeError: Error parsing message
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "transnetv2.py", line 188, in <module>
        main()
      File "transnetv2.py", line 160, in main
        model = TransNetV2(args.weights)
      File "transnetv2.py", line 17, in __init__
        self._model = tf.saved_model.load(model_dir)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/load.py", line 578, in load
        return load_internal(export_dir, tags)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/load.py", line 588, in load_internal
        loader_impl.parse_saved_model_with_debug_info(export_dir))
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 56, in parse_saved_model_with_debug_info
        saved_model = _parse_saved_model(export_dir)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 101, in parse_saved_model
        raise IOError("Cannot parse file %s: %s." % (path_to_pb, str(e)))
    OSError: Cannot parse file b'transnetv2-weights/saved_model.pb': Error parsing message.
    

    Any idea on what could be wrong? thanks

    opened by jorgeecr 8
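
One possible cause of a parse error like the one above, offered as an assumption rather than a confirmed diagnosis: if the weights are distributed through Git LFS and the repository was cloned without it, saved_model.pb would be a small text pointer instead of the real protobuf. Git LFS pointer files begin with a fixed version line, so a quick check can be sketched as follows. The function name is hypothetical.

```python
# Every Git LFS pointer file starts with this version line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path: str) -> bool:
    """Return True if the file at `path` starts like a Git LFS pointer."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

For example, `looks_like_lfs_pointer("transnetv2-weights/saved_model.pb")` returning True would suggest that the actual weights still need to be fetched (for instance with `git lfs pull`). A truncated or corrupted download would need a different check, such as comparing file sizes.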
  • About weight download

    Is there a link where I can download the weights manually? The download always times out when my server pulls them. I look forward to hearing from you soon. Thank you very much.

    opened by Scr-w 5
  • ffmpeg._run.Error

    Traceback (most recent call last):
      File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 193, in <module>
        main()
      File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 173, in main
        model.predict_video(file)
      File "/home/tom/projects/Studium/Studienarbeit/cutting/TransNetV2/inference/transnetv2.py", line 83, in predict_video
        video_stream, err = ffmpeg.input(video_fn).output(
      File "/home/tom/projects/Studium/Studienarbeit/cutting/env/lib/python3.9/site-packages/ffmpeg/_run.py", line 325, in run
        raise Error('ffmpeg', out, err)
    ffmpeg._run.Error: ffmpeg error (see stderr output for detail)
    

    I get an ffmpeg error when I run the following command:

    python transnetv2.py /mnt/e/Studium/Studienarbeit/Videos/2021/reg/17/c597512d-b37c-11eb-ba8a-ecb6fe06b3b0/highlightsVideo/video/video.mp4 [--visualize]

    That's my ffmpeg:

    ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
      built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
      configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared

    Is there any problem with my ffmpeg build?

    opened by Schubert-Tom 4
  • How to configure the transnetv2.gin and reproduce the F1 of 77.9 on the ClipShots test set?

    Hi Tomáš,

    Your work helps me a lot in understanding your great paper! Thank you so much!

    I downloaded the ClipShots training and training-transitions datasets and processed them according to https://github.com/soCzech/TransNetV2/blob/master/training/consolidate_datasets.py and https://github.com/soCzech/TransNetV2/blob/master/training/create_dataset.py

    I downloaded the ClipShots test dataset and processed it in the same way.

    I also downloaded the IACC.3 dataset and processed it with the type "train".

    I added the ClipShots training, training-transitions, and IACC.3 datasets to "options.trn_files" in https://github.com/soCzech/TransNetV2/blob/master/configs/transnetv2.gin, and added the ClipShots test dataset to "options.tst_files". I also changed "options.n_epochs" to 50, as indicated in the paper.

    However, I can only obtain an F1 of 0.74. Could you please give more training details and instructions on how to reproduce 77.9 on the test set?

    What do the file names in "options.tst_files" mean, and how are these files generated?

    I also used the pretrained weights in https://github.com/soCzech/TransNetV2/tree/master/inference/transnetv2-weights to evaluate the ClipShots test dataset, setting "options.restore" and "options.test_only" to True in https://github.com/soCzech/TransNetV2/blob/master/configs/transnetv2.gin. I only get an F1 of 0.2545 and cannot reproduce 77.9.

    I appreciate your great help so much!

    Wentao

    opened by wentaozhu 4
  • OpError: not an sstable (bad magic number)

    Hi, I tried to run the code as follows.

    from transnetv2 import TransNetV2
    model = TransNetV2()
    

    The first time, it showed a parse error. I re-downloaded the .pb model, and now it shows OpError: not an sstable (bad magic number). I'm not sure what happened.

    I also tried TransNet https://github.com/soCzech/TransNet. It works pretty well, and I like the visualization. Could you please save TransNet and its weights as a .pb model? I want to use it with OpenCV and C++. Do you have any suggestions on how I can use it with OpenCV dnn? Given a video, how should I prepare the input for the model? Thank you very much.

    opened by fantasycz 3
  • path to scenes gt for -mapping in create_train_dataset

    Hi, first of all, thank you very much for this work. I have a question that may be a bit silly, but I couldn't find an answer. I would like to run the code corresponding to create_train_dataset here, but I can't work out how to fill in the mapping_fn parameter: could you tell me what /path/to/scenes/gt is, please? :) I really have no idea which file to put there...

    opened by jeannedionisi 2
  • How to get synthesis dataset

    Hi, in the paper you mention that synthetic data is used and boosts performance, but I couldn't find the code to render transitions. Could you please provide it?

    opened by zhanghaoxin1994 2
  • OSError: Cannot parse file b'./saved_model.pb': Error parsing message.

    After downloading all the files and the trained model weights manually, I ran transnetv2.py in the inference folder, but it raised an exception: OSError: Cannot parse file b'/path/to/saved_model.pb': Error parsing message. The TensorFlow version is v2.1.0.

    I also ran the docker command to build an image, following the readme.md in inference. The Dockerfile built successfully, but when I ran the command from the readme.md to test a video, it raised the same exception.

    How can I solve this problem? Thanks.

    opened by LuyuProgram 2
  • Working with longer dissolves and 60fps videos?

    Hi Tomas,

    Thank you so much for putting this code online. You guys did an excellent job on a model that is still considered a SOTA two years after it was posted!

    I'm testing the model on a certain dataset and wanted to ask a few questions:

    1. Are you able to share the .h5 file of your training? I want to try to finetune the model with my data and see if it can help me in getting better accuracy compared to training only with my dataset.
    2. I'm working on 60fps and sometimes my dissolves tend to get to ~70 frames. What would you suggest I try to change if I want the model to augment with longer transitions? Or generally, would you suggest any chance for higher fps videos and transitions?

    I'll really appreciate your help! Happy holidays!

    opened by fschvart 1
  • training.py can't load models.py

    Hi,

    Thank you so much for putting this model out. Excellent work! I'm trying to train the model and have run into a problem that points to the models.py file. When I try to run training.py with the gin file, this is the error message I get:

    Traceback (most recent call last):
      File "C:\videoseg\TransNetV2\training\training.py", line 10, in <module>
        import models
      File "C:\videoseg\TransNetV2\training\models.py", line 168, in <module>
        @gin.configurable(blacklist=["name"])
    TypeError: configurable() got an unexpected keyword argument 'blacklist'

    Do you have a sense of what could be the problem? Thanks!

    opened by fschvart 1
  • About weight transfer

    The open-source model is saved in pb format, and the training model is saved in hd5 format. How is the hd5 format converted to the pb format? Is it saved directly to pb format during training, or is it converted afterward?

    opened by emmating12 1
Owner
Tomáš Souček
Currently Research Fellow at Charles University, Machine Learning Researcher at SANEZOO and Avast