A Python wrapper for the high-quality vocoder "World"

Jeremy Hsu

Last update: Dec 15, 2022

Related tags

Audio Python-Wrapper-for-World-Vocoder

Overview

PyWORLD - A Python wrapper of WORLD Vocoder


WORLD Vocoder is a fast and high-quality vocoder which parameterizes speech into three components:

f0: Pitch contour
sp: Harmonic spectral envelope
ap: Aperiodic spectral envelope (relative to the harmonic spectral envelope)

It can also (re)synthesize speech using these features (see examples below).

For more information, please visit Dr. Morise's WORLD repository and the official website of WORLD Vocoder.

APIs

Vocoder Functions

import pyworld as pw
# x: mono waveform as a 1-D float64 NumPy array; fs: sampling rate in Hz
_f0, t = pw.dio(x, fs)    # raw pitch extractor
f0 = pw.stonemask(x, _f0, t, fs)  # pitch refinement
sp = pw.cheaptrick(x, f0, t, fs)  # extract smoothed spectrogram
ap = pw.d4c(x, f0, t, fs)         # extract aperiodicity

y = pw.synthesize(f0, sp, ap, fs) # synthesize an utterance using the parameters
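It can help to know what shapes to expect from the calls above. The sketch below is not part of pyworld; it reproduces, as I understand it, the FFT-size formula used by CheapTrick in the WORLD C++ source, with an assumed default F0 floor of 71.0 Hz. The helper names `cheaptrick_fft_size` and `feature_shapes` are hypothetical. If your pyworld version provides `pw.get_cheaptrick_fft_size`, prefer that over this re-derivation.

```python
import math


def cheaptrick_fft_size(fs, f0_floor=71.0):
    """FFT size CheapTrick is expected to pick for a sampling rate and F0 floor."""
    # Assumption: mirrors 2 ** (1 + floor(log2(3 * fs / f0_floor + 1))) from WORLD.
    return 2 ** (1 + int(math.log2(3.0 * fs / f0_floor + 1)))


def feature_shapes(num_frames, fs, f0_floor=71.0):
    """Expected array shapes of f0, sp and ap for a given number of frames."""
    bins = cheaptrick_fft_size(fs, f0_floor) // 2 + 1  # one-sided spectrum
    return {
        "f0": (num_frames,),
        "sp": (num_frames, bins),
        "ap": (num_frames, bins),
    }


shapes = feature_shapes(num_frames=201, fs=16000)
print(shapes)
```

Under these assumptions, 16 kHz speech gives a 1024-point FFT, so each frame of sp and ap holds 513 values.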

Utility

# Convert speech into features (using default arguments)
f0, sp, ap = pw.wav2world(x, fs)

You can change the default arguments of the function, too. See help(pw.wav2world) for more info.
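One default that is often changed is frame_period, which is given in milliseconds (5.0 by default). The following sketch estimates how many frames an utterance produces for a given frame_period. It assumes the frame count follows the rule used in the WORLD C++ source as I read it (integer part of the duration divided by the frame period, plus one); the helper names are hypothetical, and the `t` array returned by `pw.dio` remains the authoritative time axis.

```python
def num_frames(num_samples, fs, frame_period=5.0):
    """Estimated number of analysis frames; frame_period is in milliseconds."""
    return int(1000.0 * num_samples / fs / frame_period) + 1


def frame_times(num_samples, fs, frame_period=5.0):
    """Estimated frame positions in seconds: 0, frame_period, 2 * frame_period, ..."""
    count = num_frames(num_samples, fs, frame_period)
    return [k * frame_period / 1000.0 for k in range(count)]


# One second of audio at 16 kHz
n_default = num_frames(16000, 16000)                    # 5 ms between frames
n_coarse = num_frames(16000, 16000, frame_period=10.0)  # 10 ms between frames
print(n_default, n_coarse)
```

Doubling frame_period roughly halves the number of frames, which is worth keeping in mind when the features feed a model with a fixed frame rate.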

Installation

Using Pip

pip install pyworld

Building from Source

git clone https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder.git
cd Python-Wrapper-for-World-Vocoder
git submodule update --init
pip install -U pip
pip install -r requirements.txt
pip install .

The git submodule step automatically clones Morise's World Vocoder (C++ version).
(Installing inside a virtualenv or conda environment is recommended.)

Installation Validation

You can validate installation by running

cd demo
python demo.py

to see if you get results in the test/ directory. (Please avoid writing and executing code in the Python-Wrapper-for-World-Vocoder folder for now.)

Environment/Dependencies

Operating systems
- Linux Ubuntu 14.04+
- Windows (thanks to wuaalb)
- WSL
Python
- 2.7 (Windows is currently not supported)
- 3.7/3.6/3.5

You can install these dependencies with pip install -r requirements.txt

Notice

WORLD vocoder is designed for speech sampled ≥ 16 kHz. Applying WORLD to 8 kHz speech will fail. See a possible workaround here.
When the SNR is low, extracting pitch using harvest instead of dio is a better option.
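If you only have 8 kHz recordings, one option is to upsample them to 16 kHz before analysis. The sketch below is not necessarily the workaround linked above; it is a minimal illustration that uses linear interpolation from NumPy, which is crude compared with a proper resampler (for example scipy.signal.resample_poly). The helper name `ensure_min_rate` is hypothetical. Upsampling does not restore content above 4 kHz, so the upper half of the spectrum will be nearly empty.

```python
import numpy as np

MIN_FS = 16000  # lowest sampling rate the notice above says WORLD is designed for


def ensure_min_rate(x, fs, min_fs=MIN_FS):
    """Return (x, fs) as float64, upsampled by linear interpolation if fs < min_fs."""
    x = np.asarray(x, dtype=np.float64)
    if fs >= min_fs:
        return x, fs
    new_len = int(round(len(x) * min_fs / fs))
    old_t = np.arange(len(x)) / fs      # original sample times in seconds
    new_t = np.arange(new_len) / min_fs  # target sample times in seconds
    return np.interp(new_t, old_t, x), min_fs


# One second of a 200 Hz tone sampled at 8 kHz
x8k = np.sin(2 * np.pi * 200 * np.arange(8000) / 8000)
x16k, fs16k = ensure_min_rate(x8k, 8000)
print(len(x16k), fs16k)
```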

Troubleshooting

Upgrade your Cython version to 0.24.
(I failed to build it on Cython 0.20.1post0)
It'll require you to download Cython from http://cython.org/
Unzip it, and run python setup.py install.
(I tried pip install Cython, but the upgrade didn't seem to work correctly.)
(Again, add --user if you don't have root access.)
Upon executing demo/demo.py, the following code might be needed in some environments (e.g. when you're working on a remote Linux server):

import matplotlib
matplotlib.use('Agg')  # non-interactive backend; call this before importing matplotlib.pyplot

If you encounter a library not found: sndfile error upon executing demo.py,
you might have to install it with apt-get install libsndfile1.
You can also replace pysoundfile with scipy or librosa, but some modification is needed:
- librosa:
  - load(filename, dtype=np.float64)
  - output.write_wav(filename, wav, fs) (note that librosa.output was removed in librosa 0.8.0)
  - remember to pass the dtype argument to ensure that the method gives you a double.
- scipy:
  - You'll have to write a customized utility function based on the following methods
  - scipy.io.wavfile.read (but this gives you short)
  - scipy.io.wavfile.write
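A customized utility of the kind mentioned for scipy might look like the sketch below. It only covers the conversion between 16-bit integer samples (what scipy.io.wavfile.read typically returns for 16-bit PCM files) and the float64 arrays that pyworld expects. The function names are hypothetical and the scaling constants are a common convention, not something this repo prescribes. The same scaling matters in the other direction: casting a float64 signal in [-1, 1] straight to int16 truncates almost every sample to zero.

```python
import numpy as np


def int16_to_float64(pcm):
    """Scale int16 PCM samples to float64 in [-1.0, 1.0)."""
    return np.asarray(pcm, dtype=np.int16).astype(np.float64) / 32768.0


def float64_to_int16(wav):
    """Scale float64 samples in [-1.0, 1.0] to int16, clipping out-of-range values."""
    scaled = np.round(np.asarray(wav, dtype=np.float64) * 32767.0)
    return np.clip(scaled, -32768, 32767).astype(np.int16)


pcm = np.array([0, 16384, -32768, 32767], dtype=np.int16)
x = int16_to_float64(pcm)       # pass this to the analysis functions
restored = float64_to_int16(x)  # pass this to a WAV writer
print(x, restored)
```

Because the two directions use slightly different scale factors, a round trip can differ from the input by one quantization step.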
If you have installation issues on Windows, I probably cannot provide much help because my development environment is Ubuntu and Windows Subsystem for Linux (read this if you are interested in installing it).

Other Installation Suggestions

Using pip install . is safer, and you can easily uninstall pyworld with pip uninstall pyworld

For Mac users: You might need to do MACOSX_DEPLOYMENT_TARGET=10.9 pip install . See issue.

Another way to install pyworld is via
python setup.py install
- Add --user if you don't have root access
- Add --record install.txt to track the installation dir
If you just want to try out some experiments, execute
python setup.py build_ext --inplace
Then you can use PyWorld from this directory.
You can also copy the resulting pyworld.so (pyworld.{arch}.pyd on Windows) file to ~/.local/lib/python2.7/site-packages (or corresponding Windows directory) so that you can use it everywhere like an installed package.
Alternatively you can copy/symlink the compiled files using pip, e.g. pip install -e .

Acknowledgement

Thanks to all contributors (tats-u, wuaalb, r9y9, rikrd, kudan2510) for making this repo better, and to sotelo, whose world.py inspired this repo.

Comments

Feature extraction: pyworld doesn't yield the same results as world/analysis (Merlin)

Hi, I was comparing the results of feature extraction with pyworld against the ones I get from the analysis.cpp routine interfacing the WORLD library in Merlin (https://github.com/CSTR-Edinburgh/merlin/blob/master/tools/WORLD/test/analysis.cpp). I have used exactly the same settings (minf0 = 71.0, maxf0=800.0, q1=-0.15, allowed_range = 0.1, d4c_threshold=0.) but I get somewhat different results and don't understand why. Here's the experiment I made:

opened by cveaux 8
Install error in Mac OSX

Hi, I tried to install it on Mac OS X but failed. Is it possible to install it on a Mac?

Mac:World robot$ python setup.py install
running install
running bdist_egg
running egg_info
writing pyworld.egg-info/PKG-INFO
writing top-level names to pyworld.egg-info/top_level.txt
writing dependency_links to pyworld.egg-info/dependency_links.txt
reading manifest file 'pyworld.egg-info/SOURCES.txt'
writing manifest file 'pyworld.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.5-x86_64/egg
running install_lib
running build_ext
skipping 'pyworld.cpp' Cython extension (up-to-date)
building 'pyworld' extension
gcc -fno-strict-aliasing -I//anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I//anaconda/lib/python2.7/site-packages/numpy/core/include -I/Users/robot/work/vocoder/World/src -I//anaconda/include/python2.7 -c pyworld.cpp -o build/temp.macosx-10.5-x86_64-2.7/pyworld.o
In file included from pyworld.cpp:252:
In file included from //anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4:
In file included from //anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:18:
In file included from //anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1781:
//anaconda/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
#warning "Using deprecated NumPy API, disable it by "
^
pyworld.cpp:1407:9: warning: conversion from string literal to 'char *' is deprecated [-Wc++11-compat-deprecated-writable-strings]
PyErr_BadInternalCall();
^
//anaconda/include/python2.7/pyerrors.h:221:56: note: expanded from macro 'PyErr_BadInternalCall'
#define PyErr_BadInternalCall() _PyErr_BadInternalCall(FILE, LINE)
^
:114:1: note: expanded from here
"pyworld.cpp"
^
pyworld.cpp:3307:3: error: no matching function for call to 'InitializeCheapTrickOption'
InitializeCheapTrickOption((&__pyx_v_option));
^~~~~~~~~~~~~~~~~~~~~~~~~~
./src/world/cheaptrick.h:51:6: note: candidate function not viable: requires 2 arguments, but 1 was provided
void InitializeCheapTrickOption(int fs, CheapTrickOption *option);
^
pyworld.cpp:3811:3: error: no matching function for call to 'InitializeCheapTrickOption'
InitializeCheapTrickOption((&__pyx_v_opt));
^~~~~~~~~~~~~~~~~~~~~~~~~~
./src/world/cheaptrick.h:51:6: note: candidate function not viable: requires 2 arguments, but 1 was provided
void InitializeCheapTrickOption(int fs, CheapTrickOption *option);
^
2 warnings and 2 errors generated.
error: command 'gcc' failed with exit status 1

opened by robotnc 8

Error installing pyworld 0.2.10

Hi, when I installed pyworld 0.2.10 via pip3 install pyworld==0.2.10 on Ubuntu 18.04, I encountered this unusual error log:

ERROR: Command errored out with exit status 1:
   command: /usr/local/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-lfq7dqd5/pyworld/setup.py'"'"'; __file__='"'"'/tmp/pip-install-lfq7dqd5/pyworld/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-5yioohoh
       cwd: /tmp/pip-install-lfq7dqd5/pyworld/
  Complete output (19 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.7
  creating build/lib.linux-x86_64-3.7/pyworld
  copying pyworld/__init__.py -> build/lib.linux-x86_64-3.7/pyworld
  running build_ext
  building 'pyworld.pyworld' extension
  creating build/temp.linux-x86_64-3.7
  creating build/temp.linux-x86_64-3.7/pyworld
  creating build/temp.linux-x86_64-3.7/lib
  creating build/temp.linux-x86_64-3.7/lib/World
  creating build/temp.linux-x86_64-3.7/lib/World/src
  gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib/World/src -I/usr/local/include/python3.7m -I/tmp/pip-install-lfq7dqd5/pyworld/.eggs/numpy-1.19.1-py3.7-linux-x86_64.egg/numpy/core/include -c pyworld/pyworld.cpp -o build/temp.linux-x86_64-3.7/pyworld/pyworld.o
  gcc: error: pyworld/pyworld.cpp: No such file or directory
  gcc: fatal error: no input files
  compilation terminated.
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for pyworld
  Running setup.py clean for pyworld
Successfully built configargparse PyYAML pysndfx
Failed to build pyworld
DEPRECATION: Could not build wheels for pyworld which do not use PEP 517. pip will fall back to legacy 'setup.py install' for these. pip 21.0 will remove support for this functionality. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.

Any help is really appreciated. Thank you !

opened by tranctan 7

How to map to sound time series?
Hi, I don't have a lot of experience with audio processing, what's unclear to me is how to map these features to the corresponding sound sequence?

I'm feeding these to an RNN along with the corresponding sound chunk. Do you know how I would set that up?

Say I have a sound file with 32,000 values (16 kHz for 2 seconds). I'm feeding the RNN a sequence of 1024 items at a time, BUT I'm grouping them by frames where each frame has 16 sound steps.

So

wav = load(...)
wav.shape  # [1 x 32000]
sub_seq = wav[0:1024]
sub_seq = sub_seq.reshape(1, 64, 16)
f0_contour, spectral_envelope, aperiodicity = pw.wav2world(wav, sample_rate=16000, frame_period=16)
f0_contour.shape  # [157]
# ??? unclear how to match to the (1, 64, 16) piece of sound

Thanks for an awesome package!!
opened by williamFalcon 7
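A note on the question above: frame_period is expressed in milliseconds, not samples, so frame_period=16 advances 16 ms (256 samples at 16 kHz) per frame rather than 16 samples. The sketch below shows one way to line frames up with fixed-size sample groups. The helper names are hypothetical, and it assumes frame k sits at time k * frame_period, which is how I understand the time axis returned by the extractors.

```python
def hop_to_frame_period(hop_samples, fs):
    """Frame period in milliseconds that advances hop_samples samples per frame."""
    return 1000.0 * hop_samples / fs


def frames_for_slice(start, stop, hop_samples):
    """Indices of frames whose position falls inside wav[start:stop]."""
    first = -(-start // hop_samples)  # ceiling division
    last = -(-stop // hop_samples)
    return range(first, last)


fs = 16000
hop = 16  # samples per group, as in the (1, 64, 16) reshape
frame_period = hop_to_frame_period(hop, fs)  # milliseconds
frame_ids = frames_for_slice(0, 1024, hop)   # frames matching wav[0:1024]
print(frame_period, len(frame_ids))
```

With these numbers the frame period comes out at 1 ms and the first 1024 samples map to 64 frames. Such a short frame period makes analysis slower and may give a rougher F0 track, so a larger sample group per frame may be the more practical choice.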
Updated world vocoder, bumped pyworld to v 0.3.2

Tested this thoroughly, as I use pyworld in the RHVoice project. I see that it fixes #58. This pull request syncs pyworld with the latest WORLD version from February this year.

opened by zstanecic 6
F0 with same length of audio file

Hi Jeremy,

Thank you for the work making WORLD accessible in Python. I have a question: is it possible to obtain an F0 contour with the same length as the original audio file? For example, using vaiueo2d.wav, the original signal length is 17500 samples. By using pw.dio with default parameters (frame_period=5), we get 159 samples for 790 ms. I mean, how can I get an F0 contour of the same length as the original audio file (aka F0 trajectories)?

FYI, if I make the frame_period shorter (to get more frames), the quality of F0 becomes worse/rougher.

opened by bagustris 5
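One way to get a per-sample F0 track without shortening frame_period is to expand the frame-level contour after extraction. The sketch below is an illustration rather than a pyworld feature: the helper name `f0_per_sample` is hypothetical, and it assumes frame k sits at time k * frame_period. It assigns each audio sample the F0 of its nearest frame instead of interpolating linearly, so unvoiced frames (F0 = 0) are not blended with voiced ones.

```python
import numpy as np


def f0_per_sample(f0, num_samples, fs, frame_period=5.0):
    """Give every audio sample the F0 value of its nearest analysis frame."""
    f0 = np.asarray(f0, dtype=np.float64)
    sample_times_ms = 1000.0 * np.arange(num_samples) / fs
    nearest = np.rint(sample_times_ms / frame_period).astype(int)
    nearest = np.minimum(nearest, len(f0) - 1)  # guard the last few samples
    return f0[nearest]


# Four frames (0, 5, 10, 15 ms) expanded to 240 samples at 16 kHz
f0_frames = np.array([0.0, 100.0, 110.0, 0.0])
f0_samples = f0_per_sample(f0_frames, num_samples=240, fs=16000)
print(f0_samples.shape)
```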
Succeeded in building wheels for Py3.6 in Windows with Appveyor

32bit result: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/21gxwvwekd30jqpt 64bit result: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/t49shnpdns6jlfpr

But I don't know why builds in Py2.7 fail.

32bit: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/ik36v19i8ge49ghr 64bit: https://ci.appveyor.com/project/tats-u/python-wrapper-for-world-vocoder/build/job/5xyyirro5b9llt1x

opened by tats-u 5

Does not work in Anaconda

I found that pyworld doesn't work with the Python from Anaconda (https://www.continuum.io/downloads) on Ubuntu 17.04, presumably because Anaconda's libstdc++ comes from GCC 4.8 and is too old for pyworld, which presumably requires that of GCC 4.9 or later. Python 2.7 and 3.6 give the same error message.

Example in Python 2.7:

$ conda create -n envname python=2.7 anaconda
$ source activate envname
$ pip install pyworld
$ python -c 'import pyworld'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/tatsu/anaconda3/envs/sprocket27/lib/python2.7/site-packages/pyworld/__init__.py", line 7, in <module>
    from .pyworld import *
ImportError: /home/tatsu/anaconda3/envs/sprocket27/lib/python2.7/site-packages/pyworld/pyworld.so: undefined symbol: __cxa_throw_bad_array_new_length
$ conda list | fgrep gcc
libgcc                    4.8.5                         2

opened by tats-u 5

`numpy < 1.20` breaks Python 3.10 compatibility

Loosening the numpy < 1.20 requirement can fix the problem, provided it doesn't break something else.

https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder/blob/cb69e0198be58e5bd68617062589dc131d56a6d7/pyproject.toml#L6

opened by erogol 4
How can I save the generated audio as a int16 WAV

The audio sequence generated by the vocoder is 64-bit floating point. When I try to save it as int16, all samples become zero. Is there any way to solve this problem?

opened by MeiGM 4
cannot install this on Ubuntu: error: command 'g++' failed with exit status 1

When I install pyworld with pip install pyworld or from a git clone of this repo, I get error: command 'g++' failed with exit status 1.

g++ -pthread -shared -B /home/yons/anaconda3/compiler_compat -L/home/yons/anarld.o build/temp.linux-x86_64-3.6/lib/World/src/dio.o build/temp.linux-x86_64-3.6o build/temp.linux-x86_64-3.6/lib/World/src/synthesis.o build/temp.linux-x86_64-3World/src/synthesisrealtime.o build/temp.linux-x86_64-3.6/lib/World/src/codec.o b6_64-3.6/pyworld/pyworld.cpython-36m-x86_64-linux-gnu.so
anaconda3/compiler_compat/ld: cannot find -lm
anaconda3/compiler_compat/ld: cannot find -lpthread
anaconda3/compiler_compat/ld: cannot find -lc
collect2: Error： ld return 1
error: command 'g++' failed with exit status 1

I have no idea how to fix this. Thanks for your help.

opened by cnlinxi 4
Fixes dash-separator deprecation issue
Closes #81

📑 Description

Usage of the dash-separated 'description-file' key is reported as deprecated when installing with the latest version of pip. Python 3.9.6, pip 22.3.1.

[ ] Not Completed

[x] Completed

✅ Checks

[x] My pull request adheres to the code style of this project

[ ] My code requires changes to the documentation

[ ] I have updated the documentation as required

[x] All the tests have passed
opened by keviveks 0
Usage of dash-separated 'description-file' deprecated in latest version

When I try to install pyworld with the latest version of pip, it throws an exception: "Usage of dash-separated deprecated in the latest version"

Python 3.9.6 pip 22.3.1

opened by keviveks 0
There should be no need for a `stonemask` after `harvest`.

In the demo.py sample, stonemask is run after harvest. Since stonemask is supposed to be used only with dio, applying it after harvest may make the result less accurate.

https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder/blob/3a7c99a32c717deb8e66bde64b5e60b1a4afce79/demo/demo.py#L88-L91

opened by Hiroshiba 0
Speed up Pyworld

Hi, may I ask if you have any suggestions for speeding up PyWorld? I have been thinking about this but have not figured out a proper way.

Looking forward to your comments. :D

opened by tranctan 5

Owner

Jeremy Hsu

A PhD student drowning in the ocean of generative models.

GitHub

python wrapper for rubberband

pyrubberband A python wrapper for rubberband. For now, this just provides lightweight wrappers for pitch-shifting and time-stretching. All processing

106 Nov 28, 2022

A Python wrapper around the Soundcloud API

soundcloud-python A friendly wrapper around the Soundcloud API. Installation To install soundcloud-python, simply: pip install soundcloud Or if you'r

84 Dec 31, 2022

A python wrapper for REAPER

pyreaper A python wrapper for REAPER (Robust Epoch And Pitch EstimatoR) Installation pip install pyreaper Demonstration notebnook http://nbviewer.jupy

56 Dec 27, 2022

Manipulate audio with a simple and easy high level interface

Pydub Pydub lets you do stuff to audio in a way that isn't stupid. Stuff you might be looking for: Installing Pydub API Documentation Dependencies Pla

6.6k Jan 1, 2023

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Project DeepSpeech DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Spee

20.8k Jan 3, 2023

FPGA based USB 2.0 high speed audio interface featuring multiple optical ADAT inputs and outputs

ADAT USB Audio Interface FPGA based USB 2.0 High Speed audio interface featuring multiple optical ADAT inputs and outputs Status / current limitations

78 Dec 31, 2022

Gateware for the Terasic/Arrow DECA board, to become a USB2 high speed audio interface

DECA USB Audio Interface DECA based USB 2.0 High Speed audio interface Status / current limitations enumerates as class compliant audio device on Linu

16 Mar 21, 2022

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

audioread Decode audio files using whichever backend is available. The library currently supports: Gstreamer via PyGObject. Core Audio on Mac OS X via

419 Dec 26, 2022

Audio fingerprinting and recognition in Python

dejavu Audio fingerprinting and recognition algorithm implemented in Python, see the explanation here: How it works Dejavu can memorize audio by liste

6k Jan 6, 2023

Python library for audio and music analysis

librosa A python package for music and audio analysis. Documentation See https://librosa.org/doc/ for a complete reference manual and introductory tut

5.6k Jan 6, 2023

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple

5.1k Jan 2, 2023

Scalable audio processing framework written in Python with a RESTful API

TimeSide : scalable audio processing framework and server written in Python TimeSide is a python framework enabling low and high level audio analysis,

340 Jan 4, 2023

eyeD3 is a Python module and command line program for processing ID3 tags. Information about mp3 files (i.e bit rate, sample frequency, play time, etc.) is also provided. The formats supported are ID3v1 (1.0/1.1) and ID3v2 (2.3/2.4).

Status About eyeD3 is a Python tool for working with audio files, specifically MP3 files containing ID3 metadata (i.e. song info). It provides a comma

425 Jan 1, 2023

Python module for handling audio metadata

Read music meta data and length of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and Wave files with python 2 or 3

Telegram Voice-Chat Bot Written In Python Using Pyrogram.

Expressive Digital Signal Processing (DSP) package for Python

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

Python I/O for STEM audio files