Non-Autoregressive Predictive Coding
This repository contains the implementation of Non-Autoregressive Predictive Coding (NPC) as described in the preprint paper submitted to ICASSP 2021.
A quick example for training NPC
python main.py --config config/self_supervised/npc_example.yml \
--task self-learning
- For more complete examples, including downstream tasks, please see the example script.
- To prepare the data, please see preprocess.
- For detailed hyperparameter settings and descriptions, please check out the example config file of NPC.
- For all run-time options, use the -h flag.
- Implementations of Autoregressive Predictive Coding (APC, Chung et al., 2019) and Vector-Quantized APC (VQ-APC, Chung et al., 2020) are also available, using similar training/downstream execution, with example config files here.
Some notes
- We found the unmasked feature produced by the last ConvBlock layer to be a better representation. In the phone classification tasks, switching to the unmasked feature (PER 25.6%) gave a 1.6% absolute improvement over the masked feature (PER 27.2%). This result is not yet included in the preprint and will be added to the paper in a future revision. Please refer to the downstream examples to activate this option.
- APC/VQ-APC are implemented with the following modifications for improvement (for the unmodified versions, please visit the official implementations of APC / VQ-APC):
  - Multi-group VQ is available for VQ-APC, but with VQ on the last layer only
  - Using utterance-wise CMVN surface features (just as NPC does)
  - Using the Gumbel-Softmax from the official PyTorch API
- See package requirement for the toolkits used; tensorboard can be used to access the log files in --logdir.
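The utterance-wise CMVN (cepstral mean and variance normalization) mentioned in the notes above can be illustrated with a minimal NumPy sketch. This is not the code used in this repository: the function name `utterance_cmvn`, the `eps` constant, and the (frames, feature dimension) input layout are assumptions made for illustration only.

```python
import numpy as np


def utterance_cmvn(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Normalize one utterance to zero mean and unit variance per feature dimension.

    `features` is assumed to be a (num_frames, feature_dim) array holding the
    surface features of a single utterance. The statistics are computed from
    this utterance only, not from the whole dataset.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    # eps guards against division by zero for constant feature dimensions.
    return (features - mean) / (std + eps)


# Hypothetical input: 200 frames of 80-dimensional features
# with an arbitrary offset and scale.
rng = np.random.default_rng(0)
utterance = rng.normal(loc=3.0, scale=5.0, size=(200, 80))
normalized = utterance_cmvn(utterance)
```

Because the mean and variance are taken per utterance, each utterance is normalized independently of the others, and no global statistics need to be computed in advance.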
Contact
Feel free to contact me with questions or feedback; my email can be found in the paper or on my personal page.
Citation
If you find our work and/or this repository helpful, please consider citing us:
@article{liu2020nonautoregressive,
title = {Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies},
author = {Liu, Alexander and Chung, Yu-An and Glass, James},
journal = {arXiv preprint arXiv:2011.00406},
year = {2020}
}