Common Fate Transform and Model for Python
This package is a Python implementation of the Common Fate Transform and Model for audio source separation, as described in the ICASSP 2016 paper "Common Fate Model for Unison source Separation".
Common Fate Transform
The Common Fate Transform is based on a signal representation that divides a complex spectrogram into a grid of patches of arbitrary size. These complex patches are then processed by a two-dimensional discrete Fourier transform, forming a tensor representation which reveals spectral and temporal modulation textures.
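The idea of patching a spectrogram and taking a 2D DFT of each patch can be illustrated with plain NumPy. This is a minimal sketch, not the package's implementation: the helper name `patch_grid_fft2` is hypothetical, and it assumes non-overlapping patches, whereas the package lets you set a patch hop.

```python
import numpy as np


def patch_grid_fft2(X, patch=(4, 4)):
    """Split a complex spectrogram X (freq x time) into non-overlapping
    patches and apply a 2D DFT to each patch (illustrative helper only)."""
    a, b = patch
    F, T = X.shape
    # Drop trailing rows/columns that do not fill a whole patch
    nf, nt = F // a, T // b
    X = X[: nf * a, : nt * b]
    # (nf, a, nt, b) -> (a, b, nf, nt): within-patch axes come first
    patches = X.reshape(nf, a, nt, b).transpose(1, 3, 0, 2)
    # 2D DFT over the two within-patch axes
    return np.fft.fft2(patches, axes=(0, 1))


rng = np.random.default_rng(0)
X = rng.standard_normal((16, 12)) + 1j * rng.standard_normal((16, 12))
Z = patch_grid_fft2(X, patch=(4, 4))
print(Z.shape)  # (4, 4, 4, 3)
```

The first two axes of the result index the modulation frequencies within a patch, and the last two index the position of the patch on the grid.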
Common Fate Model
An adapted factorization model, similar to the PARAFAC/CANDECOMP factorisation, decomposes the Common Fate Transform tensor into different time-varying harmonic sources based on their particular common modulation profile, hence the name Common Fate Model.
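To make the shape of such a factorisation concrete, here is a minimal sketch of how a model tensor could be assembled from three factors with `numpy.einsum`. The factor shapes and the index layout `abftc` are assumptions chosen for illustration (modulation patterns `A`, temporal activations `H`, channel gains `C`); check the paper and the API documentation for the definitions the package actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: patch dims (a, b), grid dims (f, t), channels c, components J
a, b, f, t, c, J = 4, 3, 5, 6, 2, 3

A = rng.random((a, b, f, J))  # assumed: modulation pattern per frequency band
H = rng.random((t, J))        # assumed: activation of each component over time
C = rng.random((c, J))        # assumed: gain of each component per channel

# Model tensor as a sum over the J components
P = np.einsum('abfj,tj,cj->abftc', A, H, C)
print(P.shape)  # (4, 3, 5, 6, 2)
```

Each component contributes one term to the sum, so a single component's share of the model can be obtained by slicing the factors at index `j` before contracting.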
Usage
See the full API documentation at http://aliutkus.github.io/commonfate.
Applying the Common Fate Transform
import commonfate
# forward transform
# STFT Parameters
framelength = 1024
hopsize = 256
X = commonfate.transform.forward(signal, framelength, hopsize)
# Patch Parameters
W = (32, 48)
mhop = (16, 24)
Z = commonfate.transform.forward(X, W, mhop, real=False)
# inverse transform of cft
Y = commonfate.transform.inverse(
Z, fdim=2, hop=mhop, shape=X.shape, real=False
)
# back to time domain
y = commonfate.transform.inverse(
Y, fdim=1, hop=hopsize, shape=signal.shape
)
Fitting the Common Fate Model
import commonfate
# initialise and fit the common fate model
cfm = commonfate.model.CFM(z, nb_components=10, nb_iter=100).fit()
# get the fitted factors
(A, H, C) = cfm.factors
# return the approximation of z using the fitted factors
z_hat = cfm.approx()
Decompose an audio signal using CFT and CFM
commonfate has a built-in wrapper which computes the Common Fate Transform, fits the Common Fate Model, and returns the synthesised time-domain signal components obtained through Wiener / soft-mask filtering.
The following example requires pysoundfile to be installed.
import commonfate
import soundfile as sf
# loading signal
(audio, fs) = sf.read(filename, always_2d=True)
# decomposes the audio signal into
# (nb_components, nb_samples, nb_channels)
components = commonfate.decompose.process(
audio,
nb_iter=100,
nb_components=10,
n_fft=1024,
n_hop=256,
cft_patch=(32, 48),
cft_hop=(16, 24)
)
# write out the third component to wave file
sf.write(
"comp_3.wav",
components[2, ...],
fs
)
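The Wiener / soft-mask filtering step mentioned above can be sketched in a few lines of NumPy. This is a generic illustration under stated assumptions, not the wrapper's code: the function name `soft_mask_separate` is hypothetical, and `P` stands for any nonnegative per-component model estimate with the same trailing shape as the mixture.

```python
import numpy as np


def soft_mask_separate(X, P, eps=1e-12):
    """Split a complex mixture X among J components.

    X: complex array of shape (...), the mixture representation.
    P: nonnegative array of shape (J, ...), one model estimate per component.
    Returns complex component estimates of shape (J, ...).
    """
    # Each component's share of the total model energy, per bin
    mask = P / (P.sum(axis=0, keepdims=True) + eps)
    # Apply the real-valued masks to the complex mixture
    return mask * X[None, ...]


rng = np.random.default_rng(0)
X = rng.standard_normal((8, 5)) + 1j * rng.standard_normal((8, 5))
P = rng.random((3, 8, 5)) + 0.1
Y = soft_mask_separate(X, P)
print(np.allclose(Y.sum(axis=0), X))  # True
```

Because the masks sum to one in every bin, the component estimates add back up to the mixture.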
Optimisations
The current Common Fate Model implementation makes heavy use of Einstein notation. We use numpy's einsum function, which can be slow on large tensors. To speed up the computation, we recommend installing Daniel Smith's opt_einsum package.
Installation via pip
pip install -e 'git+https://github.com/dgasmith/opt_einsum.git#egg=opt_einsum'
commonfate automatically detects if the package is installed.
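An optional dependency like this is commonly handled with an import fallback. The sketch below shows that general pattern. It is an assumption about how such detection can be done, not a copy of what commonfate does internally.

```python
import numpy as np

try:
    # opt_einsum.contract accepts the same subscripts-first call as numpy.einsum
    from opt_einsum import contract as einsum
except ImportError:
    # Fall back to NumPy when opt_einsum is not installed
    einsum = np.einsum

A = np.ones((2, 3))
B = np.ones((3, 4))
# The same call works with either backend
out = einsum('ij,jk->ik', A, B)
print(out.shape)  # (2, 4)
```

Code written against the `einsum` name then runs unchanged whether or not the faster backend is available.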
References
You can download and read the paper here. If you use this package, please cite the following publication:
@inproceedings{stoeter2016cfm,
TITLE = {{Common Fate Model for Unison source Separation}},
AUTHOR = {St{\"o}ter, Fabian-Robert and Liutkus, Antoine and Badeau, Roland and Edler, Bernd and Magron, Paul},
BOOKTITLE = {{41st International Conference on Acoustics, Speech and Signal Processing (ICASSP)}},
ADDRESS = {Shanghai, China},
PUBLISHER = {{IEEE}},
SERIES = {Proceedings of the 41st International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
YEAR = {2016},
KEYWORDS = {Non-Negative tensor factorization ; Sound source separation ; Common Fate Model},
}