Jean-Marc Valin authored
LPCNet
Low-complexity implementation of the WaveRNN-based LPCNet algorithm, as described in:
- J.-M. Valin, J. Skoglund, LPCNet: Improving Neural Speech Synthesis Through Linear Prediction, Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), arXiv:1810.11846, 2019.
- J.-M. Valin, U. Isik, P. Smaragdis, A. Krishnaswamy, Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet, Proc. ICASSP, arXiv:2106.04129, 2022.
- K. Subramani, J.-M. Valin, U. Isik, P. Smaragdis, A. Krishnaswamy, End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation, Proc. INTERSPEECH, arXiv:2106.04129, 2022.
For coding/PLC applications of LPCNet, see:
- J.-M. Valin, J. Skoglund, A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet, Proc. INTERSPEECH, arXiv:1903.12087, 2019.
- J. Skoglund, J.-M. Valin, Improving Opus Low Bit Rate Quality with Neural Speech Synthesis, Proc. INTERSPEECH, arXiv:1905.04628, 2020.
- J.-M. Valin, A. Mustafa, C. Montgomery, T.B. Terriberry, M. Klingbeil, P. Smaragdis, A. Krishnaswamy, Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model, Proc. INTERSPEECH, arXiv:2205.05785, 2022.
Introduction
Work-in-progress software for researching low-CPU-complexity algorithms for speech synthesis and compression by applying linear prediction techniques to WaveRNN. High-quality speech can be synthesised on regular CPUs (around 3 GFLOPS) with SIMD support (SSE2, SSSE3, AVX, AVX2/FMA and NEON are currently supported). The code also supports very low bitrate compression at 1.6 kb/s.
The BSD-licensed software is written in C and Python/Keras. For training, a GTX 1080 Ti or better is recommended.
This software is an open source starting point for LPCNet/WaveRNN-based speech synthesis and coding.
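As a rough illustration of the linear prediction idea mentioned above, the sketch below predicts each sample as a weighted sum of the previous samples and keeps only the prediction residual (the excitation), which is the part a neural network would then have to model. This is not code from this repository: the function names `lpc_predict`, `lpc_residual` and `lpc_synthesize` are hypothetical, and the order-2 coefficients in the usage example are an assumption chosen because they predict a pure sinusoid exactly.

```python
import math


def lpc_predict(history, coeffs):
    """Predict the next sample from past samples (most recent last).

    coeffs[k-1] is the weight a_k applied to the sample k steps in the past.
    """
    return sum(a * history[-k] for k, a in enumerate(coeffs, start=1))


def lpc_residual(signal, coeffs):
    """Analysis: return e[t] = s[t] - sum_k a_k * s[t-k], with zero initial history."""
    order = len(coeffs)
    padded = [0.0] * order + list(signal)
    residual = []
    for t in range(len(signal)):
        past = padded[t:t + order]  # samples s[t-order] .. s[t-1]
        residual.append(signal[t] - lpc_predict(past, coeffs))
    return residual


def lpc_synthesize(residual, coeffs):
    """Synthesis: rebuild s[t] = prediction + e[t], sample by sample."""
    order = len(coeffs)
    out = [0.0] * order  # same zero initial history as the analysis step
    for e in residual:
        out.append(lpc_predict(out[-order:], coeffs) + e)
    return out[order:]


# Usage: a sinusoid obeys s[t] = 2*cos(w)*s[t-1] - s[t-2], so with these
# (assumed) coefficients the residual vanishes after the first two samples.
w = 0.3
signal = [math.sin(w * t) for t in range(50)]
coeffs = [2.0 * math.cos(w), -1.0]
residual = lpc_residual(signal, coeffs)
rebuilt = lpc_synthesize(residual, coeffs)
```

The point of the split is that the predictable part of the waveform is handled by a cheap linear filter, leaving a smaller, simpler signal for the network. In a real vocoder the coefficients would be derived from the speech spectrum frame by frame rather than fixed as they are here.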