Python script for extracting audio from video files and creating Mel spectrograms

Alexandros Stergiou

Last update: Oct 28, 2021


video2spectrogram

About

This package automates extracting audio from video files and saving plots of that audio computed in the Mel scale (spectrograms). Videos are processed in parallel: ffmpeg extracts the audio of each video into a .wav file, which is then used to create a spectrogram stored as a .JPEG image. The resulting images can be used by any audio-based method.
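For readers unfamiliar with the Mel scale, the sketch below shows how a frequency in Hz maps onto it. It uses the widely quoted HTK-style formula and hypothetical function names (`hz_to_mel`, `mel_to_hz`); it is an illustration of the scale only, not the package's code, and audio libraries may use a different variant of the formula.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Equal steps in Hz shrink on the Mel scale as frequency grows,
# which is why a Mel spectrogram gives more room to low frequencies.
for freq in (100, 1000, 8000):
    print(freq, "Hz ->", round(hz_to_mel(freq), 1), "mel")
```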

Currently supported video formats are .mp4, .mpeg-4, .avi and .wmv. If your videos use a different extension, you can change the script to include it (in video2spectrogram/get_spectrogram.py).
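As a rough idea of what an extension check could look like, here is a minimal sketch. The names `VIDEO_EXTENSIONS` and `is_supported_video` are hypothetical and are not taken from get_spectrogram.py; the actual script may filter files differently.

```python
from pathlib import Path

# Assumed set of extensions, based on the formats listed above.
VIDEO_EXTENSIONS = {".mp4", ".mpeg-4", ".avi", ".wmv"}


def is_supported_video(path, extensions=VIDEO_EXTENSIONS):
    """Return True if the file extension (case-insensitive) is in the set."""
    return Path(path).suffix.lower() in extensions


print(is_supported_video("clips/dog.MP4"))
print(is_supported_video("clips/notes.txt"))
# Supporting another format amounts to extending the set:
print(is_supported_video("clips/cat.mkv", VIDEO_EXTENSIONS | {".mkv"}))
```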

Package requirements

librosa
numpy
matplotlib

Make sure that the above packages are installed before running any functions.

ffmpeg: You will need to have ffmpeg installed in order to extract the audio from the video files.
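The audio extraction step can be sketched roughly as follows. This is an assumption about how such a call might look, not the package's actual command: the helper names and the default sampling rate are hypothetical, while `-i`, `-vn` (drop the video stream), `-ar` (audio sampling rate) and `-y` (overwrite output) are standard ffmpeg options.

```python
import shutil
import subprocess


def build_ffmpeg_command(video_path, wav_path, ar=16000):
    """Build an ffmpeg argument list that writes the audio track to a .wav file.

    The default ar value is an assumption made for this sketch.
    """
    return ["ffmpeg", "-y", "-i", str(video_path),
            "-vn", "-ar", str(ar), str(wav_path)]


def extract_audio(video_path, wav_path, ar=16000):
    """Run ffmpeg, failing early with a clear message if it is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(video_path, wav_path, ar),
                   check=True, capture_output=True)


print(build_ffmpeg_command("video.mp4", "video.wav", ar=22050))
```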

Multiprocessing: The code uses multiprocessing to speed up conversion, so the total time required varies across processors. It has been tested on an AMD Ryzen 3950X, with an average conversion time of 4 minutes for ~1K videos (with an average resolution of 480p and a length of 5s).
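The general pattern of spreading per-video work over a pool of processes might look like the sketch below. The worker here is a placeholder that only computes an output name, and `process_video` and `convert_all` are hypothetical names, not the package's functions.

```python
import multiprocessing as mp
import os


def process_video(path):
    """Placeholder worker: a real one would extract audio and save a spectrogram.

    Here it only derives the name the output image would get.
    """
    return os.path.splitext(path)[0] + ".jpg"


def convert_all(paths, workers=None):
    """Map the worker over all paths using a pool of processes."""
    workers = workers or mp.cpu_count()
    with mp.Pool(processes=workers) as pool:
        return pool.map(process_video, paths)


# Typical use, kept under a __main__ guard in a real script so that
# worker processes do not re-run it on import:
#     if __name__ == "__main__":
#         print(convert_all(["a/clip1.mp4", "b/clip2.avi"], workers=2))
```

Because each video is independent, throughput scales roughly with the number of cores until disk or ffmpeg start-up costs dominate.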

Dataset structure

The package assumes a fixed video dataset structure:

[Directory tree: a top-level dataset directory containing subdirectories, each of which holds the video files.]
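To make the layout concrete, the sketch below groups video paths by the subdirectory they sit in under a base directory. The directory and file names are made up for illustration, and the assumption that videos sit exactly one level below the base directory is this sketch's, not a statement about the package.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_subdirectory(base_dir, video_paths):
    """Group file names by the first directory below base_dir."""
    base = PurePosixPath(base_dir)
    groups = defaultdict(list)
    for path in video_paths:
        relative = PurePosixPath(path).relative_to(base)
        groups[relative.parts[0]].append(relative.name)
    return dict(groups)


# Hypothetical dataset contents, for illustration only.
paths = ["data/a/v1.mp4", "data/a/v2.mp4", "data/b/v3.avi"]
print(group_by_subdirectory("data", paths))
```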

Usage

The main code is in the get_spectrogram.py file. To run the converter, simply call the convert function with the base directory of the dataset and the destination directory in which to save the output. Additional arguments that can be used:

verbose_lvl: Integer for verbosity.
save_wav: Boolean to determine if the created wav files are to be stored and not deleted.
ar: Integer for the ffmpeg option for specifying the audio sampling frequency.
res_h: Integer for the height of the spectrogram image to be saved.
res_w: Integer for the width of the spectrogram image to be saved.
dpi: Integer for the display's dots per inch. Needs to be set to avoid inconsistencies with the res_h and res_w arguments.
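The link between the resolution arguments and dpi comes from how matplotlib sizes figures: a figure is specified in inches, and its pixel size is inches multiplied by dpi. The helper below is a hypothetical illustration of that arithmetic; how the package applies dpi internally is not shown here.

```python
def figure_size_inches(res_w, res_h, dpi):
    """Return the (width, height) in inches that yields res_w x res_h pixels."""
    return (res_w / dpi, res_h / dpi)


# A 640x480 image at 100 dpi needs a 6.4 x 4.8 inch figure; the result could be
# passed to matplotlib as plt.figure(figsize=size, dpi=100).
size = figure_size_inches(640, 480, 100)
print(size)
```

If dpi does not match the value used when saving, the saved image ends up with a different pixel size than requested.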

from video2spectrogram import convert
# or
from get_spectrogram import convert

convert(my_dataset_dir, my_target_dir)

Installation through git

Please make sure Git is installed on your machine:

$ sudo apt-get update
$ sudo apt-get install git
$ git clone https://github.com/alexandrosstergiou/video2spectrogram.git
$ cd video2spectrogram
$ pip install .

You can then use it as any other package installed through pip.

Installation through pip

The latest stable release is also available through pip:

$ pip install video2spectrogram


Releases

v0.1 (Oct 28, 2021)

