cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

audioread

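Decode audio files using whichever backend is available. The library currently supports:

  • GStreamer via PyGObject.
  • Core Audio on Mac OS X.
  • MAD (the maddec backend).
  • FFmpeg or Libav via its command-line interface.
  • The standard-library wave, aifc, and sunau modules (for uncompressed audio formats).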

Use the library like so:

with audioread.audio_open(filename) as f:
    print(f.channels, f.samplerate, f.duration)
    for buf in f:
        do_something(buf)

Buffers in the file can be accessed by iterating over the object returned from audio_open. Each buffer is a bytes-like object (buffer, bytes, or bytearray) containing raw 16-bit little-endian signed integer PCM data. (Currently, these PCM format parameters are not configurable, but this could be added to most of the backends.)
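
As a minimal sketch of how one such buffer could be unpacked, assuming only the format described above (interleaved 16-bit little-endian signed samples), the standard-library array module is sufficient. The helper name pcm16le_to_channels is hypothetical and not part of audioread.

```python
import array
import sys


def pcm16le_to_channels(buf, channels):
    """Split one interleaved 16-bit little-endian PCM buffer into per-channel lists."""
    samples = array.array("h")          # "h" = signed 16-bit integer
    samples.frombytes(bytes(buf))       # raises ValueError if a sample is cut in half
    if sys.byteorder == "big":
        samples.byteswap()              # the data is little-endian regardless of host
    # Samples are interleaved (L, R, L, R, ...), so take every `channels`-th value.
    return [samples[c::channels].tolist() for c in range(channels)]
```

Inside the loop from the example, something like `left, right = pcm16le_to_channels(buf, f.channels)` would give two lists for a stereo file. The sketch assumes each buffer holds a whole number of frames; if that does not hold for a given backend, leftover bytes would need to be carried over to the next buffer.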

Additional values are available as fields on the audio file object:

  • channels is the number of audio channels (an integer).
  • samplerate is given in Hz (an integer).
  • duration is the length of the audio in seconds (a float).

The audio_open function transparently selects a backend that can read the file. (Each backend is implemented in a module inside the audioread package.) If no backends succeed in opening the file, a DecodeError exception is raised. This exception is only used when the file type is unsupported by the backends; if the file doesn't exist, a standard IOError will be raised.
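
The two failure modes described above can be told apart roughly as follows. This is a sketch, not code from the package; the function name probe is hypothetical.

```python
def probe(path):
    """Return (channels, samplerate, duration), or None if no backend can decode."""
    # Imported lazily so this sketch can be loaded without audioread installed.
    import audioread

    try:
        with audioread.audio_open(path) as f:
            return f.channels, f.samplerate, f.duration
    except audioread.DecodeError:
        # The file exists but none of the backends could read it.
        return None
    # A missing file is not caught here: the IOError propagates to the caller.
```

Returning None for undecodable files is only one possible policy; a caller that needs to log or skip such files might prefer to let DecodeError propagate as well.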

A second optional parameter to audio_open specifies which backends to try (instead of trying them all, which is the default). You can use the available_backends function to get a list of the backends that are usable on the current system.
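
A sketch of how the two functions might be combined, assuming that available_backends returns backend classes and that audio_open accepts them through a backends keyword argument. The helpers pick_backends and open_preferring are hypothetical names, not part of audioread; the class names FFmpegAudioFile and GstAudioFile are the ones mentioned elsewhere in this document.

```python
def pick_backends(available, preferred_names):
    """Return the classes in `available` named in `preferred_names`, in preferred order."""
    by_name = {cls.__name__: cls for cls in available}
    return [by_name[name] for name in preferred_names if name in by_name]


def open_preferring(path, preferred_names=("FFmpegAudioFile", "GstAudioFile")):
    """Open `path` using only the preferred backends that are usable here."""
    # Imported lazily so pick_backends stays usable without audioread installed.
    import audioread

    backends = pick_backends(audioread.available_backends(), preferred_names)
    return audioread.audio_open(path, backends=backends)
```

If none of the preferred backends is usable, the list passed to audio_open is empty, so the call is expected to fail rather than fall back silently to another backend.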

Audioread is "universal" and supports both Python 2 (2.6+) and Python 3 (3.2+).

Example

The included decode.py script demonstrates using this package to convert compressed audio files to WAV files.
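
The idea behind such a conversion can be sketched with the standard-library wave module. This is not the contents of decode.py; write_wav is a hypothetical helper that takes any iterable of 16-bit PCM buffers.

```python
import wave


def write_wav(out_path, buffers, channels, samplerate):
    """Write an iterable of raw 16-bit PCM buffers to a WAV file."""
    with wave.open(out_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)             # 2 bytes per sample = 16-bit audio
        out.setframerate(samplerate)
        for buf in buffers:
            # On a little-endian host the bytes are stored unchanged.
            out.writeframes(bytes(buf))
```

With an open audioread file f, the call would look like `write_wav("out.wav", f, f.channels, f.samplerate)`, since iterating over f yields the buffers.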

Version History

2.1.9
Work correctly with GStreamer 1.18 and later (thanks to @ssssam)
2.1.8
Fix an unhandled OSError when FFmpeg is not installed.
2.1.7
Properly close some filehandles in the FFmpeg backend (thanks to @RyanMarcus and @ssssam). The maddec backend now always produces bytes objects, like the other backends (thanks to @ssssam). Resolve an audio data memory leak in the GStreamer backend (thanks again to @ssssam). You can now optionally specify which specific backends audio_open should try (thanks once again to @ssssam). On Windows, avoid opening a console window to run FFmpeg (thanks to @flokX).
2.1.6
Fix a "no such process" crash in the FFmpeg backend on Windows Subsystem for Linux (thanks to @llamasoft). Avoid suppressing SIGINT in the GStreamer backend on older versions of PyGObject (thanks to @lazka).
2.1.5
Properly clean up the file handle when a backend fails to decode a file. Fix parsing of "N.M" channel counts in the FFmpeg backend (thanks to @piem). Avoid a crash in the raw backend when a file uses an unsupported number of bits per sample (namely, 24-bit samples in Python < 3.4). Add a __version__ value to the package.
2.1.4
Fix a bug in the FFmpeg backend where, after closing a file, the program's standard input stream would be "broken" and wouldn't receive any input.
2.1.3
Avoid some warnings in the GStreamer backend when using modern versions of GLib. We now require at least GLib 2.32.
2.1.2
Fix a file descriptor leak when opening and closing many files using GStreamer.
2.1.1
Just fix ReST formatting in the README.
2.1.0
The FFmpeg backend can now also use Libav's avconv command. Fix a warning by requiring GStreamer >= 1.0. Fix some Python 3 crashes with the new GStreamer backend (thanks to @xix-xeaon).
2.0.0
The GStreamer backend now uses GStreamer 1.x via the new gobject-introspection API (and is compatible with Python 3).
1.2.2
When running FFmpeg on Windows, disable its crash dialog. Thanks to jcsaaddupuy.
1.2.1
Fix an unhandled exception when opening non-raw audio files (thanks to aostanin). Fix Python 3 compatibility for the raw-file backend.
1.2.0
Add support for FFmpeg on Windows (thanks to Jean-Christophe Saad-Dupuy).
1.1.0
Add support for Sun/NeXT Au files via the standard-library sunau module (thanks to Dan Ellis).
1.0.3
Use the rawread (standard-library) backend for .wav files.
1.0.2
Send SIGKILL, not SIGTERM, to ffmpeg processes to avoid occasional hangs.
1.0.1
When GStreamer fails to report a duration, raise an exception instead of silently setting the duration field to None.
1.0.0
Catch GStreamer's exception when necessary components, such as uridecodebin, are missing. The GStreamer backend now accepts relative paths. Fix a hang in GStreamer when the stream finishes before it begins (when reading broken files). Initial support for Python 3.
0.8
All decoding errors are now subclasses of DecodeError.
0.7
Fix opening WAV and AIFF files via Unicode filenames.
0.6
Make FFmpeg timeout more robust. Dump FFmpeg output on timeout. Fix a nondeterministic hang in the GStreamer backend. Fix a file descriptor leak in the MAD backend.
0.5
Fix crash when FFmpeg fails to report a duration. Fix a hang when FFmpeg fills up its stderr output buffer. Add a timeout to ffmpeg tool execution (currently 10 seconds for each 4096-byte read); a ReadTimeoutError exception is raised if the tool times out.
0.4
Fix channel count detection for FFmpeg backend.
0.3
Fix a problem with the GStreamer backend where audio files could be left open even after the GstAudioFile was "closed".
0.2
Fix a hang in the GStreamer backend that occurs occasionally on some platforms.
0.1
Initial release.

Et Cetera

audioread is by Adrian Sampson. It is made available under the MIT license. An alternative to this module is decoder.py.

Comments
  • [WIP] port gstdec to gstreamer 1.x

    This PR is still a work in progress.

    Still need to figure out how to get the buffer data. I've been testing this with decode.py and I end up with:

    TypeError: can't convert return value to desired type

    Note: I don't actually know much about how Gstreamer works internally, so definitely take a good look.

    opened by jrobeson 12
  • fix: do not display windows error popup when ffmpeg crashes

    Hi. On another project using ffmpeg, I came across a nasty behavior on Windows: when a process spawned by subprocess.Popen crashes for any reason, Windows by default displays an error pop-up like the one below instead of letting the process die silently (from the user's perspective).

    (In this case, it is ffprobe crashing while trying to read an h264 video, but the result would be the same if ffmpeg crashed.)

    [Screenshot: Windows error pop-up]

    Windows displays one pop-up per crash, so if this happens on a large number of files it would be a total mess for the user.

    This PR aims to avoid this behavior on Windows by setting the error mode to SEM_NOGPFAULTERRORBOX and passing SEM_NOOPENFILEERRORBOX as a flag to Popen. This solution is based on this Stack Overflow answer.

    opened by jcsaaddupuy 11
  • Hanging in close()

    Mac OS X 10.9, ffmpeg 2.1 from MacPorts, audioread 1.0.1 from pypi.

    If I have the following small file (with the directory of MP3s being the one from this repository - https://github.com/mysociety/sayit ):

    import audioread.ffdec
    import os
    
    def get_audio_duration(in_filename):
        f = audioread.ffdec.FFmpegAudioFile(in_filename)
        return round(f.duration)
    
    root = 'speeches/fixtures/expected_outputs/mp3'
    for mp3 in os.listdir(root):
        mp3 = os.path.join(root, mp3)
        print mp3, get_audio_duration(mp3)
    

    Then sometimes it will run fine, but frequently it will hang on one of the files. As far as I have been able to work out, it is hanging on the wait() inside close(), but I'm not sure what I should do in order to debug it further. Running ffmpeg manually, I can't see any problems. I can Ctrl-C, in which case I get a message: Exception KeyboardInterrupt: KeyboardInterrupt() in <bound method FFmpegAudioFile.__del__ of <audioread.ffdec.FFmpegAudioFile object at 0x1087cff90>> ignored and the file continues running without issue (printing the duration, in this case), but leaves an ffmpeg process lying about that I have to manually kill -9.

    Hope that's useful, do let me know if you'd like any further information to help debug or fix this.

    opened by dracos 11
  • Gstreamer/PyGObject breaks SIGINT handling

    GLib.MainLoop.__init__ installs a SIGINT handler which calls GLib.MainLoop.quit() and then reraises a KeyboardInterrupt in that thread. However, the main thread will not notice that and continues whatever it is doing, which mostly means that it will hang because it is waiting for something. What you would want is a KeyboardInterrupt in the main thread, as this is the original behavior when you press Ctrl+C.

    I came up with this workaround / ugly monkey patch to just disable the SIGINT handler. However, a better fix would be in PyGObject to raise the KeyboardInterrupt in the main thread.

    def monkeyfix_glib():
      """
      Fixes some stupid bugs such that SIGINT is not working.
      This is used by audioread, and indirectly by librosa for loading audio.
      https://stackoverflow.com/questions/16410852/
      """
      try:
        import gi
      except ImportError:
        return
      try:
        from gi.repository import GLib
      except ImportError:
        from gi.overrides import GLib
      # Do nothing.
      # The original behavior would install a SIGINT handler which calls GLib.MainLoop.quit(),
      # and then reraise a KeyboardInterrupt in that thread.
      # However, we want and expect to get the KeyboardInterrupt in the main thread.
      GLib.MainLoop.__init__ = lambda *args, **kwargs: None
    
    opened by albertz 8
  • Implementation into AWS, no backend for MP3

    I'm trying to deploy this package to an AWS Lambda function but running into some problems. It works great with AIF, and probably some other file types, but I'm really looking for MP3 support.

    So, I've created a Python file called importFile that's heavily based on the example provided (redactedImportFile.txt). Locally, it works for all audio file types as expected. The general idea is to pull a file from a cloud location, convert it, and push it to a different cloud location.

    Now, when I install all packages into a deployment folder using "pip install --target .", zip them, and deploy to Lambda, something doesn't work. It's pretty hard to debug the no backend issue, and I'm not sure if I'm missing something obvious here.

    What I've tried:

    • Creating virtual environment, installing dependencies, and deploying
    • Installing dependencies to folder, zipping with importFile.py, and deploying
    • Comparing pip freeze on my local machine to the packages I've installed in the virtual environment/deployment folder

    What's worked

    • Nothing

    Dependencies I have installed for deployment

    • wave
    • pygobject
    • gstreamer
    • audioread
    • boto3
    • ffmpeg

    I have ffmpeg and gstreamer installed with hopes that one of the two backends would work.

    Here is an output of my CloudWatch logs, the 18:36 time-stamp being an AIF file uploaded and the 18:42 time-stamp being an MP3 file uploaded.

    [Screenshot of the CloudWatch logs]

    I've read through some of the other no backend issues but none similar to this. If this isn't an audioread issue, then I do apologize. Thanks for your time

    opened by Nashluffy 7
  • gi dependency breaks due to package namespace collision

    The gstreamer backend tries to import the gi package. This package appears to have some kind of a namespace collision on pypi, and will cause failures if https://pypi.python.org/pypi/gi is installed.

    Specifically, before it can even fail due to missing API, it will crash on python3 because the gi.py module uses a print statement instead of a function call.

    I'm not sure there's a good solution to this, since you can't use install-requires for optional dependencies here.

    EDIT: reference librosa group thread here: https://groups.google.com/forum/#!topic/librosa/pKT3z2NYKIE

    opened by bmcfee 7
  • ffmpeg backend mucks with standard input on `close()`

    As discovered in https://github.com/beetbox/beets/issues/2039, we appear to cause problems on f.close() with the FFmpeg backend. Specifically, subsequent calls to read from standard input (i.e., raw_input() calls) no longer receive any keyboard input.

    This is easy to reproduce with this tiny test script:

    import audioread.ffdec
    f = audioread.ffdec.FFmpegAudioFile('xxx.m4a')
    f.close()
    raw_input('prompt: ')
    

    I was able to narrow down the problem to e31af0b5febba7aa07bd24c0065ba5faf93eacc0, which was our fix for #9. If I change that line to send SIGTERM instead of SIGKILL to the ffmpeg process, everything works fine.

    opened by sampsyo 7
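
The issue above and the 1.0.2 changelog entry pull in opposite directions: SIGTERM alone could leave ffmpeg hanging, while SIGKILL disturbed standard input. One common compromise, shown here only as a sketch and not necessarily how audioread resolved the problem, is to send SIGTERM first and escalate to SIGKILL after a grace period. The name stop_process and the default grace value are hypothetical.

```python
import subprocess


def stop_process(proc, grace=2.0):
    """Ask `proc` to exit with SIGTERM; fall back to SIGKILL after `grace` seconds."""
    if proc.poll() is None:             # still running
        proc.terminate()                # SIGTERM on POSIX
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()                 # SIGKILL on POSIX
            proc.wait()
    return proc.returncode
```

The timeout argument to wait requires Python 3.3 or later, so this approach would not work unchanged on Python 2.
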
  • Deprecation warnings in gstdec.py

    /usr/local/lib/python2.7/dist-packages/audioread/gstdec.py:126: PyGIDeprecationWarning: Since version 3.11, calling threads_init is no longer needed. See: https://wiki.gnome.org/PyGObject/Threading
      GObject.threads_init()
    /usr/local/lib/python2.7/dist-packages/audioread/gstdec.py:146: PyGIDeprecationWarning: MainLoop is deprecated; use GLib.MainLoop instead
      self.loop = GObject.MainLoop()
    
    opened by Nerten 7
  • Getting segmentation fault after using audioread for MP4s

    I had been using audioread to get the audio from MP4 videos from different cameras, in order to synchronize them.

    I have noticed that I am now getting a segmentation fault when all is done, and have debugged it to be something in audioread. Some minimal code to get this is included below. The expected behavior is it should open the MP4 file, do its thing, and exit; what I get is it does its thing but exits with a segmentation fault. I have commented out all remaining imports (cv2 and numpy) and I still get the segmentation fault. Platform is up-to-date Linux Ubuntu 14.04; the problem does not occur on a Mac. audioread is using gstdec to go through the MP4 and I tried both the

    with audioread.audio_open() as f

    and the current

    try: (etc) finally: f.close()

    syntax listed in the audioread docs. Perhaps there is a problem in how the stream is closed, or in gstreamer?

    Any suggestions? -Dennis

    #!/usr/bin/env python

    import cv2
    import audioread
    import numpy as np

    if __name__ == "__main__":

        # THIS PART IS OK
        #blue = cv2.VideoCapture("blue.MP4")
        #print ("Blue fps: {0}".format(blue.get(cv2.cv.CV_CAP_PROP_FPS)))
        #blue = None

        # THIS PART RUNS OK BUT....
        temp = bytearray()
        f = audioread.audio_open("blue.MP4")
        try:
            samplerate = f.samplerate
            duration = f.duration
            channels = f.channels
            for block in f:
                #print np.frombuffer(block,dtype=np.int16)
                temp.extend(block)
        finally:
            f.close()
            f = None

        #signal = np.frombuffer(temp,dtype=np.dtype('<i2')) #.reshape(-1,channels)
        print(f)
        print(samplerate, duration, channels)
        #print signal

        # IT GIVES SEG FAULT HERE AT END OF RUNNING, UPON EXITING
    
    opened by devangel77b 7
  • audioread.NoBackendError

    Traceback (most recent call last):
      File "train.py", line 196, in <module>
        train(train_A_dir = train_A_dir, train_B_dir = train_B_dir, model_dir = model_dir, model_name = model_name, random_seed = random_seed, validation_A_dir = validation_A_dir, validation_B_dir = validation_B_dir, output_dir = output_dir, tensorboard_log_dir = tensorboard_log_dir)
      File "train.py", line 32, in train
        wavs_A = load_wavs(wav_dir = train_A_dir, sr = sampling_rate)
      File "/Users/kunalkumar/preprocess.py", line 11, in load_wavs
        wav, _ = librosa.load(file_path, sr = sr, mono = True)
      File "/Users/kunalkumar/venv/lib/python3.7/site-packages/librosa/core/audio.py", line 112, in load
        with audioread.audio_open(os.path.realpath(path)) as input_file:
      File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/audioread/__init__.py", line 116, in audio_open
        raise NoBackendError()
    audioread.NoBackendError

    Don't know what to do?

    opened by ghost 6
  • Add a simple testsuite

    I wrote a simple testsuite using py.test while investigating https://github.com/beetbox/audioread/pull/78.

    Currently it only tests the audioread.audio_open() method. I'd like to add a separate set of tests for each backend, as I think there are some bugs in both the GStreamer and libmad backends that are causing everything to end up being run through ffmpeg. I don't know if I'll get time for that though.

    opened by ssssam 6
  • converting data to a numpy array

    The docs would be much more useful to me if this introductory sample

    with audioread.audio_open(filename) as f:
        print(f.channels, f.samplerate, f.duration)
        for buf in f:
            do_something(buf)
    

    included some notion of how to convert buf to a numpy array. If you have a minute. (and, if that's easy to do)

    opened by gmabey 1
  • Incorrect number of channels with ffmpeg

    Version: 2.1.9

    It is possible to fool audioread into determining there are 0 channels in the audio file when it does actually have an audio channel.

    This occurs when metadata in the file contains the string "audio:"

    Test case:

    $ ffmpeg -i test/data/test-2.mp3 -metadata description="audio: broken" out.mp3
    $ python -c 'import audioread; print(audioread.audio_open("out.mp3", backends=[audioread.ffdec.FFmpegAudioFile]).channels)'
    0
    

    audioread assumes that the first line on stderr containing "audio:" is ffmpeg's stream information: https://github.com/beetbox/audioread/blob/5afc8a6dcb8ab801d19d67dc77fe8824ad04acb5/audioread/ffdec.py#L231

    As seen in the following output, the description containing "audio: broken" occurs before "Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s"

    $ ffmpeg -i out.mp3 -f s16le - > /dev/null
    ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
      built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
      configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
      libavutil      56. 31.100 / 56. 31.100
      libavcodec     58. 54.100 / 58. 54.100
      libavformat    58. 29.100 / 58. 29.100
      libavdevice    58.  8.100 / 58.  8.100
      libavfilter     7. 57.100 /  7. 57.100
      libavresample   4.  0.  0 /  4.  0.  0
      libswscale      5.  5.100 /  5.  5.100
      libswresample   3.  5.100 /  3.  5.100
      libpostproc    55.  5.100 / 55.  5.100
    Input #0, mp3, from 'out.mp3':
      Metadata:
        description     : audio: broken
        encoder         : Lavf58.29.100
      Duration: 00:00:02.04, start: 0.025057, bitrate: 129 kb/s
        Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s
        Metadata:
          encoder         : Lavc58.54
    Stream mapping:
      Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
    Press [q] to stop, [?] for help
    Output #0, s16le, to 'pipe:':
      Metadata:
        description     : audio: broken
        encoder         : Lavf58.29.100
        Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
        Metadata:
          encoder         : Lavc58.54.100 pcm_s16le
    size=     345kB time=00:00:02.00 bitrate=1411.2kbits/s speed= 410x    
    
    opened by simon816 1
  • Simplify ffdec.py by using  Popen.communicate() method

    Hi, thanks for this very useful library! I was looking into ffdec.py since I need faster loading of mp3 and m4a files. I believe that the module could be improved and simplified by using the Popen.communicate() method. This seems to be the recommended way of retrieving output from a subprocess.

    The current implementation only allows reading the data in blocks, which is suboptimal since a user might not be able to adapt the block size. (E.g. librosa just calls audio_open(), which has no way of setting a block size.)

    I did a speed comparison that shows that this way of reading data is slower than it needs to be, especially for large files: https://gist.github.com/Bomme/d9aee452c8c1e68fb5fac743df6b2a07

    If you decide to drop Python 2 support (https://github.com/beetbox/audioread/issues/112) the timeout handling might be easier. And for later versions of Python 3 the https://docs.python.org/3/library/subprocess.html#windows-popen-helpers might come in handy.

    opened by Bomme 4
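
For reference, the Popen.communicate() approach proposed in the issue above might look roughly like this. It is a sketch of the general technique, not a patch for ffdec.py; read_all_output is a hypothetical name, and the timeout parameter is only available on Python 3.3 or later.

```python
import subprocess


def read_all_output(cmd, timeout=None):
    """Run `cmd` and return (returncode, stdout_bytes, stderr_bytes) in one go."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,       # keep the child away from our stdin
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        # communicate() drains both pipes concurrently, so neither can fill up.
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()              # reap the process and drain the pipes
        raise
    return proc.returncode, out, err
```

With the command quoted in an earlier issue, a call could look like `read_all_output(["ffmpeg", "-i", "out.mp3", "-f", "s16le", "-"])`. The trade-off is memory: the whole decoded stream is held at once, whereas block-wise reading keeps only one buffer in memory at a time.
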
  • Pitch distorted (wrong sample rate?) when loading MP3 with GSt backend

    Audioread version: 2.1.9

    When decoding an MP3 file using the GStreamer backend, the pitch gets distorted. See the attached example, which has the original MP3 file, a WAV file decoded by the decode.py script, and a WAV file decoded by ffmpeg -i.

    When forcing audioread to use the ffmpeg backend, it works correctly.

    Archive.zip

    opened by jonashaag 0
  • Bug: Error when reading file with librosa, audioread reports `float division by zero` error

    Hello!

    I was using the following piece of code,

    audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
    

    But I was thrown the following error,

    --> 103     audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
        104     mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)
        105     mfccs_processed = np.mean(mfccs.T,axis=0)
    
    /opt/conda/lib/python3.7/site-packages/librosa/core/audio.py in load(path, sr, mono, offset, duration, dtype, res_type)
        170 
        171     if sr is not None:
    --> 172         y = resample(y, sr_native, sr, res_type=res_type)
        173 
        174     else:
    
    /opt/conda/lib/python3.7/site-packages/librosa/core/audio.py in resample(y, orig_sr, target_sr, res_type, fix, scale, **kwargs)
        551         return y
        552 
    --> 553     ratio = float(target_sr) / orig_sr
        554 
        555     n_samples = int(np.ceil(y.shape[-1] * ratio))
    
    ZeroDivisionError: float division by zero
    

    I was trying to load an .mp3 file using librosa, with audioread as the backend. I am not sure why the error is occurring, but I'll use a try/except block to avoid it. A similar issue is here, https://github.com/librosa/librosa/issues/765, but its author did not state the solution.

    Help would be appreciated. Thank you!

    opened by Rubix982 10