Python I/O for STEM audio files

Fabian-Robert Stöter

Last update: Dec 23, 2022

Related tags

Audio python ffmpeg multitrack native-instruments stems

Overview

stempeg = stems + ffmpeg

Python package to read and write STEM audio files. Technically, stems are audio containers that combine multiple audio streams and metadata in a single audio file. This makes them ideal for playing back multitrack audio, where users can select the audio sub-stream during playback (e.g. supported by VLC).

Under the hood, stempeg uses ffmpeg for reading and writing multistream audio. Optionally, MP4Box is used to create STEM files that are compatible with Native Instruments hardware and software.

Features

robust and fast interface for ffmpeg to read and write any supported format from/to numpy.
reading supports seeking and duration.
control container and codec as well as bitrate when compressed audio is written.
store multi-track audio within audio formats by aggregating streams into channels (concatenation of pairs of stereo channels).
support for internal ffmpeg resampling during read and write.
create mp4 stems compatible with Native Instruments Traktor.
use multiprocessing to speed up reading substreams and writing multiple files.

Installation

1. Installation of ffmpeg Library

stempeg relies on ffmpeg (>= 3.2 is suggested).

The installation of ffmpeg differs among operating systems. If you use anaconda, you can install ffmpeg on Windows/Mac/Linux using the following command:

conda install -c conda-forge ffmpeg

Note that for better-quality encoding it is recommended to install ffmpeg with libfdk-aac codec support as follows:

MacOS: use homebrew: brew install ffmpeg --with-fdk-aac
Ubuntu/Debian Linux: See installation script here.
Docker: docker pull jrottenberg/ffmpeg

1a. (optional) Installation of MP4Box

If you plan to write stem files with full compatibility with Native Instruments Traktor DJ hardware and software, you need to install MP4Box.

MacOS: use homebrew: brew install gpac
Ubuntu/Debian Linux: apt-get install gpac

Further installation instructions for all operating systems can be found here.

2. Installation of the stempeg package

A) Installation via PyPI using pip

pip install stempeg

B) Installation via conda

conda install -c conda-forge stempeg

Usage

Reading audio

Stempeg can read multi-stream and single-stream audio files, so it can replace your normal audio loaders for 1d or 2d (mono/stereo) arrays.

By default, read_stems assumes that multiple substreams can exist (default reader=stempeg.StreamsReader()). To support multi-stream audio even when the container doesn't support multiple streams (e.g. WAV), streams can be mapped to multiple pairs of channels. In that case, reader=stempeg.ChannelsReader() can be passed. Also see: stempeg.ChannelsWriter.

import stempeg
S, rate = stempeg.read_stems(stempeg.example_stem_path())

S is a numpy tensor that includes the time domain signals scaled to [-1..1]. The shape is (stems, samples, channels). Detailed documentation of read_stems can be viewed here. Note that a small stems excerpt from The Easton Ellises, licensed under Creative Commons CC BY-NC-SA 3.0, is included and can be accessed using stempeg.example_stem_path().

Reading individual streams

Individual substreams of the stem file can be read by passing the corresponding stem id (starting from 0):

S, rate = stempeg.read_stems(stempeg.example_stem_path(), stem_id=[0, 1])

Read excerpts (set seek position)

Excerpts from the stem instead of the full file can be read by providing start (start) and duration (duration) in seconds to read_stems:

S, _ = stempeg.read_stems(stempeg.example_stem_path(), start=1, duration=1.5)
# read from second 1.0 to second 2.5
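
To make the units concrete, the following sketch converts a start and duration given in seconds into a sample range. The helper excerpt_bounds is hypothetical and only illustrates the arithmetic; it is not part of stempeg, and the 44100 Hz rate is an assumed example value.

```python
def excerpt_bounds(start, duration, rate):
    """Return (first_sample, last_sample_exclusive) for an excerpt.

    Hypothetical helper for illustration only; not part of stempeg.
    """
    first = int(round(start * rate))
    last = int(round((start + duration) * rate))
    return first, last


# start=1 s and duration=1.5 s at an assumed sample rate of 44100 Hz
first, last = excerpt_bounds(start=1, duration=1.5, rate=44100)
print(first, last, last - first)
```

With these values the excerpt would span samples 44100 up to (but not including) 110250, i.e. 66150 samples per channel.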

Writing audio

As seen in the flow chart above, stempeg supports multiple ways to write multi-track audio.

Write multi-channel audio

stempeg.write_audio can be used for single-stream, multi-channel audio files. Stempeg wraps a number of ffmpeg parameters to resample the output sample rate and adjust the audio codec, if necessary.

stempeg.write_audio(path="out.mp4", data=S, sample_rate=44100.0, output_sample_rate=48000.0, codec='aac', bitrate=256000)

Writing multi-stream audio

Writing stem files from a numpy tensor can be done with:

stempeg.write_stems(path="output.stem.mp4", data=S, sample_rate=44100, writer=stempeg.StreamsWriter())

Each of the multi-stream writing methods has a different set of parameters. To select a method, one of the following writers can be passed:

stempeg.FilesWriter Stems will be saved into multiple files. For the naming, basename(path) is ignored and just the parent of path and its extension are used.
stempeg.ChannelsWriter Stems will be saved as multiple channels.
stempeg.StreamsWriter (default). Stems will be saved into a single multi-stream file.
stempeg.NIStemsWriter Stems will be saved into a single multi-stream audio file. Additionally, Native Instruments Stems compatible metadata is added. This requires the installation of MP4Box.
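
The naming rule described for stempeg.FilesWriter can be sketched as follows. This is an illustration under assumptions, not stempeg's implementation: the helper files_writer_paths is hypothetical, and it assumes that each output file is named after its stem inside the parent directory of path, reusing the extension of path.

```python
from pathlib import PurePosixPath


def files_writer_paths(path, stem_names):
    """Sketch of the naming rule: keep the parent directory and the
    extension of `path`, ignore its basename, name files after stems.

    Hypothetical helper; the real FilesWriter may differ in detail.
    """
    p = PurePosixPath(path)
    return [str(p.parent / f"{name}{p.suffix}") for name in stem_names]


print(files_writer_paths("output/ignored.mp3", ["mix", "drums"]))
```

Here the basename "ignored" does not appear in any of the resulting file names; only "output" and ".mp3" are carried over.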

⚠️ Warning: Muxing stems using ffmpeg leads to multi-stream files that are not compatible with Native Instruments hardware or software. Please use MP4Box if you use the stempeg.NIStemsWriter().

For more information on writing stems, see stempeg.write_stems. For an example that documents the advanced features of the writer, see readwrite.py.

Use the command line tools

stempeg provides a convenient cli tool to convert a stem to multiple wavfiles. The -s switch sets the start, the -t switch sets the duration.

stem2wav "The Easton Ellises - Falcon 69.stem.mp4" -s 1.0 -t 2.5

F.A.Q

How can I improve the reading performance?

When read_stems is called repeatedly, it always performs two system calls: one for getting the file info and one for the actual reading. To speed this up, you can provide the Info object to read_stems if the number of streams, the number of channels, and the sample rate are identical.

file_path = stempeg.example_stem_path()
info = stempeg.Info(file_path)
S, _ = stempeg.read_stems(file_path, info=info)

How can the quality of the encoded stems be increased?

For encoding, it is recommended to use the Fraunhofer AAC encoder (libfdk_aac), which is not included in the default ffmpeg builds. Note that the conda version currently does not include fdk-aac. If libfdk_aac is not installed, stempeg will use the default aac codec, which will result in slightly inferior audio quality.

Comments

stempeg 2.0

This addresses #27 and implements a new ffmpeg backend. I chose ffmpeg-python for reading and writing. Here the audio is piped directly to stdin instead of writing temporary files with pysoundfile and converting them in a separate process call.

Part of the code was copied from spleeter's audio backend. First benchmarks of the input piping indicate that this method is twice as fast as my previous "tmpfile based method".

Saving stems still requires saving temporary files since the complex filter cannot be carried out using ffmpeg-python. This enabled a new API. Here the idea was to not come up with presets and do all the checks to cover all use cases, but instead to let users do this themselves. This means more errors for users, but it's way easier to maintain. E.g. if a user wants to write multistream audio as .wav files, an error will be thrown, since this container does not support multiple streams. The user would instead have to use streams_as_multichannel.

This PR furthermore introduces a significant number of new features:

Audio Loading

Loading audio now uses the same API as in spleeter's audio loading backend
A target sample rate can be specified to resample audio on-the-fly and return the resampled audio
An option stems_from_multichannel was added to load stems that are aggregated into multichannel audio (concatenation of pairs of stereo channels), see more info on audio writing
Substream titles can be read from the Info object.

Audio Writing

Stems can now be saved as substreams, aggregated into channels or saved as multiple files.
Titles for each substream can now be embedded into metadata.
In addition to write_stems (which is a preset to achieve compatibility with NI stems), we also have write_streams (supports writing as multichannel or multiple files). And in case stempeg is used for just stereo files, write_audio can be used (again, this is API compatible with spleeter).

The procedure for writing stream files can be quite complex, as it varies depending on the specified output container format. Basically, there are two possible stream-saving scenarios:

1.) container supports multiple streams (mp4/m4a, opus, mka)
2.) container does not support multiple streams (wav, mp3, flac)

For 1.) we provide the following options:

1a.) streams will be saved as substreams, i.e. when streams_as_multichannel=False (default)
1b.) streams will be aggregated into channels and saved as a multichannel file. Here the audio tensor of shape=(streams, samples, 2) will be converted to a single-stream multichannel audio of shape (samples, streams*2). This option is activated using streams_as_multichannel=True
1c.) streams will be saved as multiple files when streams_as_files is active

For 2.), when the container does not support multiple streams, there are also two options:

2a.) streams_as_multichannel has to be set to True (see 1b), otherwise an error will be raised. Note that this only works for wav and flac. The file ending of path determines the container (but not the codec!).
2b.) streams_as_files, so that multiple files will be created
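
The decision between these options can be sketched as a small lookup on the file extension. This is only an illustration: the helper suggest_writer and both extension tables are hypothetical, built from the container lists in the text above and the writer names from the usage section, and are not taken from stempeg's source.

```python
import os

# Container capabilities as listed in the text above; these tables are
# illustrative and not taken from stempeg's source code.
MULTI_STREAM = {".mp4", ".m4a", ".opus", ".mka"}
MULTI_CHANNEL_ONLY = {".wav", ".flac"}


def suggest_writer(path):
    """Suggest a writer name for an output path (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    if ext in MULTI_STREAM:
        return "StreamsWriter"
    if ext in MULTI_CHANNEL_ONLY:
        return "ChannelsWriter"
    # e.g. mp3: neither multiple streams nor aggregated channels
    return "FilesWriter"


for name in ("test.stem.m4a", "test.wav", "test.mp3"):
    print(name, "->", suggest_writer(name))
```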

Example / Use Cases

"""Opens a stem file and saves (re-encodes) back to a stem file
"""
import argparse
import stempeg


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'input',
    )
    args = parser.parse_args()

    # load stems
    stems, rate = stempeg.read_stems(args.input)

    # load stems,
    # resample to 96000 Hz,
    # use multiprocessing
    stems, rate = stempeg.read_stems(
        args.input,
        sample_rate=96000,
        multiprocess=True
    )

    # --> stems now has `shape=(stems, samples, channels)`

    # save stems from tensor as multi-stream mp4
    stempeg.write_stems(
        "test.stem.m4a",
        stems,
        sample_rate=96000
    )

    # save stems as dict for convenience
    stems = {
        "mix": stems[0],
        "drums": stems[1],
        "bass": stems[2],
        "other": stems[3],
        "vocals": stems[4],
    }
    # keys will be automatically used

    # write from dict (default writer, single multi-stream file)
    stempeg.write_stems(
        "test.stem.m4a",
        data=stems,
        sample_rate=96000
    )

    # `write_stems` is a preset for the following settings
    # here the output signal is resampled to 44100 Hz and AAC codec is used
    stempeg.write_stems(
        "test.stem.m4a",
        stems,
        sample_rate=96000,
        writer=stempeg.StreamsWriter(
            codec="aac",
            output_sample_rate=44100,
            bitrate="256000",
            stem_names=['mix', 'drums', 'bass', 'other', 'vocals']
        )
    )

    # Native Instruments compatible stems
    stempeg.write_stems(
        "test_traktor.stem.m4a",
        stems,
        sample_rate=96000,
        writer=stempeg.NIStemsWriter(
            stems_metadata=[
                {"color": "#009E73", "name": "Drums"},
                {"color": "#D55E00", "name": "Bass"},
                {"color": "#CC79A7", "name": "Other"},
                {"color": "#56B4E9", "name": "Vocals"}
            ]
        )
    )

    # let's write as multistream opus (supports only 48000 Hz)
    stempeg.write_stems(
        "test.stem.opus",
        stems,
        sample_rate=96000,
        writer=stempeg.StreamsWriter(
            output_sample_rate=48000,
            codec="opus"
        )
    )

    # writing to wav requires to convert streams to multichannel
    stempeg.write_stems(
        "test.wav",
        stems,
        sample_rate=96000,
        writer=stempeg.ChannelsWriter(
            output_sample_rate=48000
        )
    )

    # stempeg also supports loading merged-multichannel streams using `ChannelsReader`
    stems, rate = stempeg.read_stems(
        "test.wav",
        reader=stempeg.ChannelsReader(nb_channels=2)
    )

    # mp3 does not support multiple channels,
    # therefore we have to use `stempeg.FilesWriter`
    # outputs are named ["output/0.mp3", "output/1.mp3"]
    # for named files, provide a dict or use `stem_names`
    # also apply multiprocessing
    stempeg.write_stems(
        ("output", ".mp3"),
        stems,
        sample_rate=rate,
        writer=stempeg.FilesWriter(
            multiprocess=True,
            output_sample_rate=48000,
            stem_names=["mix", "drums", "bass", "other", "vocals"]
        )
    )

enhancement

opened by faroit 28

Is this not working on windows?

import glob, os
import stempeg
import os.path

train_path = "path_to_train/"
os.chdir(train_path)
for file in glob.glob("*.stem.mp4"):
    file_path = train_path + file
    print(os.path.isfile(file_path))
    S, rate = stempeg.read_stems(file_path)

Even though isfile returns True, read_stems throws 'FileNotFoundError: [WinError 2] '

opened by westside 17

Ffprobe command returns non-zero exit status 3221225478

I am running it on anaconda. It seems to work perfectly on colab. However on anaconda it fails.

The behavior is weird as well. I ran the command on bash and it runs correctly.

I have a loop which runs through all the stem files, and it breaks after a random number of iterations with the error stated below. I believe this could be a multiprocessing issue. Could it be that the file is already being used by another process?

File "", line 1, in runfile('C:/Users/w1572032/.spyder-py3/temp.py', wdir='C:/Users/w1572032/.spyder-py3')

File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile execfile(filename, namespace)

File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users/w1572032/.spyder-py3/temp.py", line 28, in t = np.copy(track.targets['vocals'].audio.T)

File "C:\ProgramData\Anaconda3\lib\site-packages\musdb\audio_classes.py", line 113, in audio audio = source.audio

File "C:\ProgramData\Anaconda3\lib\site-packages\musdb\audio_classes.py", line 47, in audio filename=self.path, stem_id=self.stem_id

File "C:\ProgramData\Anaconda3\lib\site-packages\stempeg\read.py", line 90, in read_stems FFinfo = FFMPEGInfo(filename)

File "C:\ProgramData\Anaconda3\lib\site-packages\stempeg\read.py", line 19, in init self.json_info = read_info(self.filename)

File "C:\ProgramData\Anaconda3\lib\site-packages\stempeg\read.py", line 54, in read_info out = sp.check_output(cmd)

File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 336, in check_output **kwargs).stdout

File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 418, in run output=stdout, stderr=stderr)

CalledProcessError: Command '['ffprobe', 'C:\Users\w1572032\Desktop\musdb18\train\BigTroubles - Phantom.stem.mp4', '-v', 'error', '-print_format', 'json', '-show_format', '-show_streams']' returned non-zero exit status 3221225478.

opened by vinspatel 9
OSX quicklook support

🥳 osx seems to support stem files and has a UI to select the stem right from the quicklook window:

However, it seems that it uses some specific metadata to read the stem track name. Currently I don't know how to do that with ffmpeg, but it would be great to find out if there is a way to support this.
enhancement help wanted

opened by faroit 8
Stems write - Format not recognised
Hello,

As you stated in the documentation the stems write doesn't always work well. I am using this command with ffmpeg to create a STEM file:

ffmpeg -i ~/mix.wav -i ~/drums.wav -i ~/vocals.wav -map 0 -map 1 -map 2 -c:a libfdk_aac -metadata:s:0 title=mix -metadata:s:1 title=drums -metadata:s:2 title=vocals ~/output.stem.mp4

I then tried to read it back using the musdb library and it works well. I was wondering if this could be included in your library to finally make it work properly.

Unfortunately, I do not have much time to work more on this and open a pull request, but I made a simple implementation if it could be of any help. Also check this homebrew-ffmpeg if the right codecs are not installed properly in the official ffmpeg distribution.
opened by shoegazerstella 8
Freeze when loading mp4 multi-stem file

I am using the musdb package and convert the mp4 files containing multiple audio sources to wave files, as shown here:

https://github.com/f90/Wave-U-Net/blob/master/Datasets.py#L132

But randomly during conversion (so with potentially any file), conversion just freezes forever. After interrupting the process I can read the following error:

Traceback (most recent call last):
File "/opt/local/pycharm/helpers/pydev/pydevd.py", line 1668, in main()
File "/opt/local/pycharm/helpers/pydev/pydevd.py", line 1662, in main globals = debugger.run(setup['file'], None, None, is_module)
File "/opt/local/pycharm/helpers/pydev/pydevd.py", line 1072, in run pydev_imports.execfile(file, globals, locals) # execute the script
File "/mnt/daten/PycharmProjects/Wave-U-Net/Training.py", line 326, in @ex.automain
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/experiment.py", line 137, in automain self.run_commandline()
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/experiment.py", line 260, in run_commandline return self.run(cmd_name, config_updates, named_configs, {}, args)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/experiment.py", line 209, in run run()
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/run.py", line 221, in call self.result = self.main_function(*args)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/config/captured_function.py", line 46, in captured_function result = wrapped(*args, **kwargs)
File "/mnt/daten/PycharmProjects/Wave-U-Net/Training.py", line 348, in dsd_100_experiment dsd_train, dsd_test = Datasets.getMUSDB(model_config["musdb_path"]) # List of (mix, acc, bass, drums, other, vocal) tuples
File "/mnt/daten/PycharmProjects/Wave-U-Net/Datasets.py", line 149, in getMUSDB vocal_audio = track.targets["vocals"].audio
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/musdb/audio_classes.py", line 113, in audio audio = source.audio
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/musdb/audio_classes.py", line 47, in audio filename=self.path, stem_id=self.stem_id
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/stempeg/read.py", line 91, in read_stems FFinfo = FFMPEGInfo(filename)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/stempeg/read.py", line 19, in init self.json_info = read_info(self.filename)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/stempeg/read.py", line 55, in read_info out = sp.check_output(cmd)
File "/usr/lib/python2.7/subprocess.py", line 567, in check_output process = Popen(stdout=PIPE, *popenargs, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 711, in init errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1319, in _execute_child data = _eintr_retry_call(os.read, errpipe_read, 1048576)
File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call return func(*args)
KeyboardInterrupt

Process finished with exit code 1

It seems that the ffmpeg/ffprobe process that identifies the stems within the mp4 file never returns, or returns empty output, or something of that sort, so that the stempeg library waits forever for a response at sp.check_output. It doesn't look like there is a timeout for waiting for the ffmpeg output either. Plus, ffmpeg is called with -v error; maybe that is suppressing errors that we should react to?

Any idea of how to fix this?
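
One generic way to avoid waiting forever on a child process is to pass a timeout to subprocess.check_output, which is available in Python 3 (the traceback above is from Python 2.7, where this argument does not exist). The sketch below is not stempeg's code: run_with_timeout is a hypothetical helper, and a stand-in Python command replaces ffprobe so the example runs without ffmpeg installed.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd and return its stdout, or None if it exceeds `timeout` seconds.

    Hypothetical helper; stempeg's own call is sp.check_output(cmd).
    """
    try:
        return subprocess.check_output(cmd, timeout=timeout)
    except subprocess.TimeoutExpired:
        # check_output kills the child before raising, so nothing is left hanging
        return None


# Stand-in commands are used instead of ffprobe so the sketch runs anywhere.
fast = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=30)
slow = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(10)"], timeout=1
)
print(fast, slow)
```

A caller could then treat a None result as a failed probe and retry or skip the file instead of blocking.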

opened by f90 7
fix dithering when exporting to float

In the last version of stempeg, the method to load PCM through ffmpeg-python resulted in small differences to what I obtained with the previous version of stempeg.

This PR reverts the casting procedure for int16 to float32 + normalization [-1, 1]. This is important since dithering errors did significantly influence separation scores in regression tests.

With this PR, ffmpeg pipes int16 into the numpy buffer, and the data is converted and normalized to float in numpy. This gives the same results as I had before, when I used temporary wav files and converted to float32 using soundfile. Furthermore, when using int16 pipes, conversion is slightly faster.
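
The conversion described here can be sketched in numpy as follows. This is a minimal illustration, not the code from the PR: the helper pcm16_to_float32 is hypothetical, the raw bytes are assumed to be interleaved little-endian int16 (matching ffmpeg's s16le format), and the divisor 32768 is an assumption, since the text only states that the result is normalized to [-1, 1].

```python
import numpy as np


def pcm16_to_float32(raw, channels):
    """Convert interleaved little-endian int16 PCM bytes to float32.

    Hypothetical helper. The divisor 32768 is an assumption; the text
    only states that the result is normalized to [-1, 1].
    """
    samples = np.frombuffer(raw, dtype="<i2").reshape(-1, channels)
    return samples.astype(np.float32) / 32768.0


# Four int16 values, interpreted as two stereo frames
raw = np.array([0, 16384, -32768, 32767], dtype="<i2").tobytes()
audio = pcm16_to_float32(raw, channels=2)
print(audio.shape, audio.dtype)
```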

Also ping @romi1502 and @mmoussallam since this function originally derived from spleeters code. You might want to change it there as well.

opened by faroit 6
A loading error in Win System.
My data has a name format like 'xxxx - xxxx.stem.m64', but stempeg cannot recognize the blank space before "-". So it raises an error saying there is no such file.

The dataset actually is MUSDB18-7 set

I solved it with the following code, which deletes the blank space in front of the "-". But I hope there is a better way to solve it.

index = track_name.index('-')
track_name = track_name[:index-1] + track_name[index:]
opened by igo312 6
Centralize detection of ffmpeg executables

In some distributions (e.g. NixOS) ffmpeg is not necessarily in PATH. Unifying detection of the ffmpeg and ffprobe executables makes it easier for packagers to patch the package to accommodate such situations.
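
As an illustration of what a single detection point could look like, the sketch below resolves both executables with shutil.which. It is not the patch from this pull request; find_executables is a hypothetical helper, and a packager would replace the lookup with fixed paths as needed.

```python
import shutil


def find_executables(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


found = find_executables()
missing = [name for name, path in found.items() if path is None]
if missing:
    print("not found on PATH:", ", ".join(missing))
```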

opened by bgamari 5
Tests failing with wrong shapes

Hi author(s),

I'm trying to run the tests included in this package, but the assert statements on the shapes of the stems are failing. The tests expect a shape of (5, 265216, 2) but the file has a shape of (5, 267264, 2).

Is this a bug or have the files been updated without updating the tests?

Thanks!

opened by jaidevd 5
allow ffmpeg format to be optional

carefully reviews the proposals made by @romi1502 in #39 and reverts the fix. To allow regression tests to pass for dependencies such as musdb or museval, the old behaviour can be used with

stempeg.read_stems(..., ffmpeg_format="s16le")

opened by faroit 3

warnings.warning() does not exist

Bug Description: When using stempeg as part of musdb, I encountered the following error:

        stem_durations = np.array([t.shape[0] for t in stems])
        if not (stem_durations == stem_durations[0]).all():
>           warnings.warning("Stems differ in length and were shortend")
E           AttributeError: module 'warnings' has no attribute 'warning'

/usr/local/lib/python3.9/site-packages/stempeg/read.py:299: AttributeError

warning() does not exist after checking the warnings package.

Suggested Solution: warnings.warning() -> warnings.warn() since warn() exists.
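
The suggested call can be checked in isolation with the standard library alone. In the sketch below, check_equal_length is a hypothetical stand-in for the length check in read.py, and the two lengths are example values.

```python
import warnings


def check_equal_length(lengths):
    """Warn if stem lengths differ (hypothetical stand-in for the check)."""
    if len(set(lengths)) > 1:
        warnings.warn("Stems differ in length and were shortened")
        return False
    return True


# Record warnings instead of printing them, to show that warn() fires
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = check_equal_length([265216, 267264])

print(ok, len(caught))
```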

opened by jeswan 1

16 bit flac output conversion?

Is there a way to convert the 4 stem output files from the new Open-Unmix UMX using Stempeg to output 16 bit flac files instead of the 24 bit flac files I am currently getting using it?

Thank, Rog
enhancement help wanted

opened by Mixerrog 3
Support reading from file-like objects

supporting file-like objects to read and decode in-memory data would be a useful enhancement. There may be problems, as suggested here, though: https://github.com/kkroening/ffmpeg-python/issues/292
enhancement

opened by faroit 0

Releases(v0.2.3)

v0.2.3(Jan 30, 2021)
Version 0.2 is a rewrite of stempeg that focusses on speed and performance while also adding a number of additional features. Furthermore, stempeg can now read and write stem files in three different ways to make the best use of the different audio containers. For example, as pcm/wav doesn't support multiple audio streams, stempeg can instead read and write streams aggregated into multiple pairs of stereo channels.

Audio Loading

The underlying reading backend is now based on ffmpeg-python.

With this new backend, the creation of temporary files is reduced; audio is piped directly into numpy via stdio. This leads to a loading time improvement of 20%-30%.

A target sample rate can be specified to resample audio on-the-fly using ffmpeg.

An optional stems_from_multichannel was added to load stems that are aggregated into multichannel audio (concatenation of pairs of stereo channels), see more info on audio writing.

substream titles metadata can be read from the Info object.

Loading audio now uses the same API as in spleeter's audio loading backend.

Audio Writing

This new version stabilizes writing support by adding writer methods that can be passed to stempeg.write_stems() to save multi-stream audio. The choice of the writing method mainly depends on the audio container and codec. E.g. some containers support multiple stems (mp4/m4a, opus, mka) whereas others do not (wav, mp3...).

stempeg.FilesWriter saves stems into multiple files. This writer can be boosted in performance using multiprocess=True, which writes the stems in parallel.

stempeg.ChannelsWriter saves as multiple channels. Stems will be multiplexed into channels and saved as a single multichannel file. E.g. an audio tensor of shape=(stems, samples, 2) will be converted to a single-stem multichannel audio (samples, stems*2).

stempeg.StreamsWriter saves into a single multi-stream file.

stempeg.NIStemsWriter saves into a single multistream audio. Finally one can create stems files that are fully compatible with Native Instruments stems. For this, MP4Box has to be installed. See more info here.
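
The channel multiplexing described for stempeg.ChannelsWriter can be sketched in numpy. This is an illustration of the shape conversion only, not stempeg's implementation: both helpers are hypothetical, and the channel order (stem 0 left, stem 0 right, stem 1 left, ...) is an assumption based on the phrase "concatenation of pairs of stereo channels".

```python
import numpy as np


def stems_to_channels(stems):
    """(stems, samples, channels) -> (samples, stems * channels)."""
    n_stems, n_samples, n_channels = stems.shape
    return stems.transpose(1, 0, 2).reshape(n_samples, n_stems * n_channels)


def channels_to_stems(audio, nb_channels=2):
    """(samples, stems * nb_channels) -> (stems, samples, nb_channels)."""
    n_samples = audio.shape[0]
    return audio.reshape(n_samples, -1, nb_channels).transpose(1, 0, 2)


# 5 stems, 4 samples, stereo
stems = np.arange(5 * 4 * 2, dtype=np.float32).reshape(5, 4, 2)
flat = stems_to_channels(stems)
print(flat.shape)
```

Under this assumed ordering, columns 2 and 3 of the multichannel array hold the left and right channels of stem 1, and channels_to_stems restores the original tensor.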

Furthermore the following features were added:

Names for each substream can now be embedded into metadata.

stempeg can be used to just write normal audio files (mono and multichannel) using write_audio, which is also fully API compatible with spleeter's audio backend.

For more information see the updated documentation. Thanks to @mmoussallam, @romi1502, @Rhymen, @nlswrnr, and @axeldelafosse
v0.1.8(Jul 9, 2019)

The seeking issue (#21) was not fully fixed. This release should address the remaining issues that occur with chunked loading when very small float numbers are used as the start parameter.
v0.1.7(Jul 8, 2019)

Fixes a bug (#18) that occurs when start or duration is using very small float numbers (1e-6) that are literally converted into strings maintaining the scientific notation.
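
The underlying Python behavior is easy to reproduce: str() switches to scientific notation for small floats, while a fixed-point format specifier does not. The sketch below only illustrates this; format_seconds is a hypothetical helper, and the six decimal places are an assumed precision, not necessarily what this release uses.

```python
def format_seconds(value):
    """Format a time offset without scientific notation (hypothetical helper).

    Six decimal places are an assumed precision for illustration.
    """
    return f"{value:.6f}"


print(str(1e-6), format_seconds(1e-6))
```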

Also addresses #21 and adds an additional check for ffmpeg and ffprobe before actually reading any files.
v0.1.6(Mar 13, 2019)
a demo track is now part of stempeg for convenience

stem, rate = stempeg.read_stems(stempeg.example_stem_path())
v0.1.5(Mar 13, 2019)
added seeking

added the ability to provide file info to reduce the number of calls

v0.1.4(Nov 10, 2018)

There was a bug in the earlier versions of stempeg that didn't respect the set out_type in the stem reader. This was fixed and the output defaults to np.float64.

Thanks to

@hexafraction
v0.1.3(Feb 18, 2018)

Adds some code and warnings to detect the ffmpeg version and warn users when a version older than 3.0 is used, since such versions add additional silence to the output files when encoding.

Also addressing #3
v0.1.2(Dec 20, 2017)

This release fixes #1 by checking the available ffmpeg encoders and picking aac if libfdk_aac is not available.

Also, the ffmpeg errors are now visible.
v0.1.1(Dec 17, 2017)


Owner

Fabian-Robert Stöter

Audio-ML researcher

GitHub https://faroit.github.io/stempeg

