A Python wrapper for the high-quality vocoder "World"

Overview

PyWORLD - A Python wrapper of WORLD Vocoder

WORLD Vocoder is a fast and high-quality vocoder which parameterizes speech into three components:

  1. f0: Pitch contour
  2. sp: Harmonic spectral envelope
  3. ap: Aperiodic spectral envelope (relative to the harmonic spectral envelope)

It can also (re)synthesize speech using these features (see examples below).

For more information, please visit Dr. Morise's WORLD repository and the official website of WORLD Vocoder.

APIs

Vocoder Functions

import pyworld as pw

# x: 1-D float64 NumPy array of audio samples; fs: sampling rate in Hz
_f0, t = pw.dio(x, fs)            # raw pitch extractor
f0 = pw.stonemask(x, _f0, t, fs)  # pitch refinement
sp = pw.cheaptrick(x, f0, t, fs)  # extract smoothed spectrogram
ap = pw.d4c(x, f0, t, fs)         # extract aperiodicity

y = pw.synthesize(f0, sp, ap, fs) # synthesize an utterance using the parameters

Utility

# Convert speech into features (using default arguments)
f0, sp, ap = pw.wav2world(x, fs)

You can also override the function's default arguments; run help(pw.wav2world) for details.

Installation

Using Pip

pip install pyworld

Building from Source

git clone https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder.git
cd Python-Wrapper-for-World-Vocoder
git submodule update --init
pip install -U pip
pip install -r requirements.txt
pip install .

The submodule step automatically fetches Morise's World Vocoder (C++ version).
(I recommend installing inside a virtualenv or conda environment.)

Installation Validation

You can validate installation by running

cd demo
python demo.py

to see if you get results in the test/ directory. (For now, please avoid writing and executing code inside the Python-Wrapper-for-World-Vocoder folder.)

Environment/Dependencies

  • Operating systems
    • Linux Ubuntu 14.04+
    • Windows (thanks to wuaalb)
    • WSL
  • Python
    • 2.7 (Windows is currently not supported)
    • 3.7/3.6/3.5

You can install these dependencies with pip install -r requirements.txt

Notice

  • WORLD vocoder is designed for speech sampled ≥ 16 kHz. Applying WORLD to 8 kHz speech will fail. See a possible workaround here.
  • When the SNR is low, extracting pitch using harvest instead of dio is a better option.

Troubleshooting

  1. Upgrade your Cython version to 0.24.
    (I failed to build it on Cython 0.20.1post0.)
    You'll need to download Cython from http://cython.org/,
    unzip it, and install it with python setup.py install.
    (I tried pip install Cython, but the upgrade didn't seem to work correctly.)
    (Again, add --user if you don't have root access.)
  2. Upon executing demo/demo.py, the following code might be needed in some environments (e.g. when you're working on a remote Linux server):
import matplotlib
matplotlib.use('Agg')
  3. If you encounter a "library not found: sndfile" error upon executing demo.py,
    you might have to install the library with apt-get install libsndfile1.
    You can also replace pysoundfile with scipy or librosa, but some modification is needed:

    • librosa:
      • load(filename, dtype=np.float64)
      • output.write_wav(filename, wav, fs)
      • remember to pass the dtype argument to ensure that the method gives you a double.
    • scipy:
      • You'll have to write a customized utility function based on the following methods
      • scipy.io.wavfile.read (but this gives you short)
      • scipy.io.wavfile.write
  4. If you have installation issues on Windows, I probably cannot provide much help, because my development environment is Ubuntu and Windows Subsystem for Linux (read this if you are interested in installing it).
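
For the sndfile workaround above, one dependency-free option is Python's standard wave module rather than scipy or librosa. The sketch below assumes a 16-bit mono PCM file; read_wav_as_float is a hypothetical helper name, not part of pyworld. It scales the int16 samples to floats in [-1, 1), which is the same scaling you would apply to the short array returned by scipy.io.wavfile.read.

```python
import struct
import wave


def read_wav_as_float(path):
    """Read a 16-bit mono PCM WAV file and return (samples, fs).

    samples is a list of Python floats in [-1.0, 1.0).
    """
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("expected a 16-bit mono PCM file")
        fs = w.getframerate()
        raw = w.readframes(w.getnframes())
    # "<h" is a little-endian signed 16-bit integer (a C short).
    pcm = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Divide by 2**15 so that the int16 range maps onto [-1, 1).
    return [s / 32768.0 for s in pcm], fs
```

Before calling pyworld, wrap the list in a NumPy array of doubles, e.g. np.asarray(samples, dtype=np.float64).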

Other Installation Suggestions

  1. Using pip install . is safer, and you can easily uninstall pyworld with pip uninstall pyworld.
    • For Mac users: you might need to run MACOSX_DEPLOYMENT_TARGET=10.9 pip install . instead. See issue.
  2. Another way to install pyworld is via
    python setup.py install
    • Add --user if you don't have root access
    • Add --record install.txt to track the installation dir
  3. If you just want to try out some experiments, execute
    python setup.py build_ext --inplace
    Then you can use PyWorld from this directory.
    You can also copy the resulting pyworld.so (pyworld.{arch}.pyd on Windows) file to ~/.local/lib/python2.7/site-packages (or the corresponding Windows directory) so that you can use it everywhere like an installed package.
    Alternatively, you can copy/symlink the compiled files using pip, e.g. pip install -e .

Acknowledgement

Thanks to all contributors (tats-u, wuaalb, r9y9, rikrd, kudan2510) for making this repo better, and to sotelo, whose world.py inspired this repo.

Comments
  • Feature extraction: pyworld doesn't yield to same results as world/analysis (Merlin)

    Hi, I was comparing the results of feature extraction with pyworld against the ones I get from the analysis.cpp routine that interfaces with the WORLD library in Merlin (https://github.com/CSTR-Edinburgh/merlin/blob/master/tools/WORLD/test/analysis.cpp). I used exactly the same settings (minf0 = 71.0, maxf0=800.0, q1=-0.15, allowed_range = 0.1, d4c_threshold=0.), but I get somewhat different results and don't understand why. Here's the experiment I made: demo_copy_synthesis

    opened by cveaux 8
  • Install error in Mac OSX

    Hi, I try to install it in Mac OSX but failed. Is it possible to install it in Mac ?

    Mac:World robot$ python setup.py install
    running install
    running bdist_egg
    running egg_info
    writing pyworld.egg-info/PKG-INFO
    writing top-level names to pyworld.egg-info/top_level.txt
    writing dependency_links to pyworld.egg-info/dependency_links.txt
    reading manifest file 'pyworld.egg-info/SOURCES.txt'
    writing manifest file 'pyworld.egg-info/SOURCES.txt'
    installing library code to build/bdist.macosx-10.5-x86_64/egg
    running install_lib
    running build_ext
    skipping 'pyworld.cpp' Cython extension (up-to-date)
    building 'pyworld' extension
    gcc -fno-strict-aliasing -I//anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I//anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/robot/work/vocoder/World/src -I//anaconda/include/python2.7 -c pyworld.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyworld.o
    In file included from pyworld.cpp:252:
    In file included from //anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4:
    In file included from //anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:18:
    In file included from //anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1781:
    //anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
    #warning "Using deprecated NumPy API, disable it by "
    ^
    pyworld.cpp:1407:9: warning: conversion from string literal to 'char *' is deprecated [-Wc++11-compat-deprecated-writable-strings]
    PyErr_BadInternalCall();
    ^
    //anaconda/include/python2.7/pyerrors.h:221:56: note: expanded from macro 'PyErr_BadInternalCall'
    #define PyErr_BadInternalCall() _PyErr_BadInternalCall(__FILE__, __LINE__)
    ^
    :114:1: note: expanded from here
    "pyworld.cpp"
    ^
    pyworld.cpp:3307:3: error: no matching function for call to 'InitializeCheapTrickOption'
    InitializeCheapTrickOption((&__pyx_v_option));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~
    ./src/world/cheaptrick.h:51:6: note: candidate function not viable: requires 2 arguments, but 1 was provided
    void InitializeCheapTrickOption(int fs, CheapTrickOption *option);
    ^
    pyworld.cpp:3811:3: error: no matching function for call to 'InitializeCheapTrickOption'
    InitializeCheapTrickOption((&__pyx_v_opt));
    ^~~~~~~~~~~~~~~~~~~~~~~~~~
    ./src/world/cheaptrick.h:51:6: note: candidate function not viable: requires 2 arguments, but 1 was provided
    void InitializeCheapTrickOption(int fs, CheapTrickOption *option);
    ^
    2 warnings and 2 errors generated.
    error: command 'gcc' failed with exit status 1

    opened by robotnc 8
  • Error installing pyworld 0.2.10

    Hi, when I tried to install pyworld 0.2.10 via pip3 install pyworld==0.2.10 on Ubuntu 18.04, I encountered this unusual error log:

    ERROR: Command errored out with exit status 1:
       command: /usr/local/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lfq7dqd5/pyworld/setup.py'"'"'; __file__='"'"'/tmp/pip-install-lfq7dqd5/pyworld/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-5yioohoh
           cwd: /tmp/pip-install-lfq7dqd5/pyworld/
      Complete output (19 lines):
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-3.7
      creating build/lib.linux-x86_64-3.7/pyworld
      copying pyworld/__init__.py -> build/lib.linux-x86_64-3.7/pyworld
      running build_ext
      building 'pyworld.pyworld' extension
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/pyworld
      creating build/temp.linux-x86_64-3.7/lib
      creating build/temp.linux-x86_64-3.7/lib/World
      creating build/temp.linux-x86_64-3.7/lib/World/src
      gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib/World/src -I/usr/local/include/python3.7m -I/tmp/pip-install-lfq7dqd5/pyworld/.eggs/numpy-1.19.1-py3.7-linux-x86_64.egg/numpy/core/include -c pyworld/pyworld.cpp -o build/temp.linux-x86_64-3.7/pyworld/pyworld.o
      gcc: error: pyworld/pyworld.cpp: No such file or directory
      gcc: fatal error: no input files
      compilation terminated.
      error: command 'gcc' failed with exit status 1
      ----------------------------------------
      ERROR: Failed building wheel for pyworld
      Running setup.py clean for pyworld
    Successfully built configargparse PyYAML pysndfx
    Failed to build pyworld
    DEPRECATION: Could not build wheels for pyworld which do not use PEP 517. pip will fall back to legacy 'setup.py install' for these. pip 21.0 will remove support for this functionality. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.
    

    Any help is really appreciated. Thank you !

    opened by tranctan 7
  • How to map to sound time series?

    Hi, I don't have a lot of experience with audio processing, what's unclear to me is how to map these features to the corresponding sound sequence?

    I'm feeding these to an RNN along with the corresponding sound chunk. Do you know how I would set that up?

    Say I have a sound file with 32,000 values (16 kHz for 2 seconds). I'm feeding the RNN a sequence of 1024 items at a time, but I'm grouping them by frames, where each frame has 16 sound steps.

    So

    wav = load(...)
    wav.shape  
    # [1 x 32000]
    
    sub_seq = wav[0:1024]
    sub_seq = sub_seq.reshape(1, 64, 16)
    
    f0_contour, spectral_envelope, aperiodicity = pw.wav2world(wav, sample_rate=16000, frame_period=16)
    
    f0_contour.shape
    # [157]
    # ??? unclear how to match to the (1, 64, 16) piece of sound
    

    Thanks for an awesome package!!

    opened by williamFalcon 7
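
One way to line up the features with the waveform is sketched below. It assumes that WORLD places frame i at time i * frame_period milliseconds and returns floor(duration_ms / frame_period) + 1 frames, which is consistent with the 159 values for roughly 790 ms at frame_period=5 reported in another issue on this page. The helper names are hypothetical and not part of pyworld.

```python
def frame_count(num_samples, fs, frame_period_ms=5.0):
    """Number of analysis frames expected for a signal (assumed formula)."""
    return int(1000.0 * num_samples / fs / frame_period_ms) + 1


def frame_to_sample_range(frame_index, fs, frame_period_ms=5.0):
    """Half-open sample range [start, stop) that begins at the frame's time stamp."""
    hop = fs * frame_period_ms / 1000.0  # samples between consecutive frames
    start = int(round(frame_index * hop))
    stop = int(round((frame_index + 1) * hop))
    return start, stop


# 2 s of 16 kHz audio analysed with a 1 ms frame period: the hop is 16 samples.
n_frames = frame_count(32000, 16000, frame_period_ms=1.0)
first_chunk = [frame_to_sample_range(i, 16000, 1.0) for i in range(64)]
```

Under these assumptions, frame_period=1.0 at 16 kHz yields one feature frame per 16 samples, so a 1024-sample chunk reshaped to (64, 16) would pair with 64 consecutive frames.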
  • Updated world vocoder, bumped pyworld to v 0.3.2

    I tested this thoroughly, as I use PyWorld in the RHVoice project, and I see that it fixes #58. This pull request syncs pyworld with the latest version of WORLD, from February this year.

    opened by zstanecic 6
  • F0 with same length of audio file

    Hi Jeremy,

    Thank you for your work making WORLD accessible in Python. I have a question: is it possible to obtain an F0 contour with the same length as the original audio file? For example, using vaiueo2d.wav, the original signal length is 17500 samples. Using pw.dio with default parameters (frame_period=5), we get 159 values for 790 ms. In other words, how can I get an F0 contour (an F0 trajectory) with the same length as the original audio?

    FYI, if I make frame_period shorter (to get more values), the F0 quality becomes worse (rougher).

    opened by bagustris 5
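
A sample-level F0 trajectory can be obtained by interpolating the frame-level contour instead of shrinking frame_period. The sketch below is plain Python and assumes that frame i sits at i * frame_period milliseconds and that unvoiced frames are marked with 0; upsample_f0 is a hypothetical helper, not a pyworld function. It interpolates linearly between voiced frames and holds the nearer frame next to unvoiced ones, so that zeros are not blended into the pitch values.

```python
def upsample_f0(f0, num_samples, fs, frame_period_ms=5.0):
    """Stretch a frame-level F0 contour to one value per audio sample."""
    if len(f0) == 0:
        raise ValueError("f0 must contain at least one frame")
    hop = fs * frame_period_ms / 1000.0  # samples between consecutive frames
    last = len(f0) - 1
    out = []
    for n in range(num_samples):
        pos = min(n / hop, last)  # fractional frame position of sample n
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        a, b = f0[lo], f0[hi]
        if a == 0.0 or b == 0.0:
            # Voiced/unvoiced boundary: hold the nearer frame, do not blend.
            out.append(a if frac < 0.5 else b)
        else:
            out.append(a + (b - a) * frac)
    return out


# Toy contour: two voiced frames followed by an unvoiced one, hop of 4 samples.
trajectory = upsample_f0([100.0, 200.0, 0.0], num_samples=8, fs=1000,
                         frame_period_ms=4.0)
```

With a real signal, the call would look like upsample_f0(list(f0), len(x), fs), where f0 comes from the pitch extractor and x is the waveform.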
  • Succeeded in building wheels for Py3.6 in Windows with Appveyor

    32bit result: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/21gxwvwekd30jqpt 64bit result: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/t49shnpdns6jlfpr

    But I don't know why builds in Py2.7 fail.

    32bit: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/ik36v19i8ge49ghr 64bit: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/5xyyirro5b9llt1x

    opened by tats-u 5
  • Does not work in Anaconda

    I found that pyworld doesn't work with Python from Anaconda (https://www.continuum.io/downloads) on Ubuntu 17.04, presumably because Anaconda's libstdc++ comes from GCC 4.8 and is too old for pyworld, which presumably requires that of GCC 4.9 or later. Python 2.7 and 3.6 give the same error message.

    Example in Python 2.7:

    $ conda create -n envname python=2.7 anaconda
    $ source activate envname
    $ pip install pyworld
    $ python -c 'import pyworld'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/tatsu/anaconda3/envs/sprocket27/lib/python2.7/site-packages/pyworld/__init__.py", line 7, in <module>
        from .pyworld import *
    ImportError: /home/tatsu/anaconda3/envs/sprocket27/lib/python2.7/site-packages/pyworld/pyworld.so: undefined symbol: __cxa_throw_bad_array_new_length
    $ conda list | fgrep gcc
    libgcc                    4.8.5                         2
    
    opened by tats-u 5
  • ```numpy < 1.20``` breaks p3.10 compatibility

    Loosening the numpy < 1.20 requirement can fix the problem, provided it doesn't break something else.

    https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder/blob/cb69e0198be58e5bd68617062589dc131d56a6d7/pyproject.toml#L6

    opened by erogol 4
  • How can I save the generated audio as a  int16 WAV

    The audio sequence generated by the vocoder is 64-bit floating point. When I try to save it as int16, all the values become zero. Is there any way to solve this problem?

    opened by MeiGM 4
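
A likely cause, assuming the synthesized samples lie roughly in [-1, 1]: casting such floats directly to int16 truncates nearly all of them to 0. Scaling to the int16 range first avoids this. The sketch below uses only the standard library; float_to_int16 and write_int16_wav are hypothetical helper names, not part of pyworld.

```python
import struct
import wave


def float_to_int16(samples):
    """Clip floats to [-1, 1] and scale them to the int16 range."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))
        out.append(int(round(s * 32767)))
    return out


def write_int16_wav(path, samples, fs):
    """Write a mono 16-bit PCM WAV file from float samples."""
    pcm = float_to_int16(samples)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = int16
        w.setframerate(int(fs))
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))


scaled = float_to_int16([0.0, 0.25, -1.0, 2.0])  # 2.0 is clipped to 1.0
```

For the vocoder output, the call would be write_int16_wav("out.wav", y, fs), with y being the array returned by the synthesis step.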
  • cannot install this on Ubuntu: error: command 'g++' failed with exit status 1

    When I install this pyworld with pip install pyworld or git clone this repo, I get error: command 'g++' failed with exit status 1.

    g++ -pthread -shared -B /home/yons/anaconda3/compiler_compat -L/home/yons/anarld.o build/temp.linux-x86_64-3.6/lib/World/src/dio.o build/temp.linux-x86_64-3.6o build/temp.linux-x86_64-3.6/lib/World/src/synthesis.o build/temp.linux-x86_64-3World/src/synthesisrealtime.o build/temp.linux-x86_64-3.6/lib/World/src/codec.o b6_64-3.6/pyworld/pyworld.cpython-36m-x86_64-linux-gnu.so
    anaconda3/compiler_compat/ld: cannot find -lm
    anaconda3/compiler_compat/ld: cannot find -lpthread
    anaconda3/compiler_compat/ld: cannot find -lc
    collect2: Error: ld return 1
    error: command 'g++' failed with exit status 1

    I have no idea how to fix this. Thanks for your help.

    opened by cnlinxi 4
  • Fixes dash-seperator deprecation issue

    Closes #81

    📑 Description

    Usage of the dash-separated 'description-file' key is deprecated when installing with the latest version of pip. Tested with Python 3.9.6 and pip 22.3.1.

    • [ ] Not Completed
    • [x] Completed

    ✅ Checks

    • [x] My pull request adheres to the code style of this project
    • [] My code requires changes to the documentation
    • [ ] I have updated the documentation as required
    • [x] All the tests have passed
    opened by keviveks 0
  • Usage of dash-separated 'description-file' deprecated in latest version

    When I try to install pyworld with the latest version of pip, it throws an exception: "Usage of dash-separated deprecated in the latest version"

    Python 3.9.6 pip 22.3.1

    opened by keviveks 0
  • There should be no need for a `stonemask` after `harvest`.

    In demo.py, stonemask is applied after harvest, but since stonemask is supposed to be used only with dio, this may make the result less accurate.

    https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder/blob/3a7c99a32c717deb8e66bde64b5e60b1a4afce79/demo/demo.py#L88-L91

    opened by Hiroshiba 0
  • Speed up Pyworld

    Hi, may I ask if you have any suggestions for speeding up PyWorld? I have been thinking about this but have not figured out a proper way to do it.

    Looking forward to your comments. :D

    opened by tranctan 5
Owner
Jeremy Hsu
A PhD student drowning in the ocean of generative models.