Voice conversion software - Voice conversion (VC) is a technique for converting the speaker identity of a source speaker into that of a target speaker. This software lets users build a traditional VC system based on a Gaussian mixture model (GMM) and a vocoder-free VC system based on a differential GMM (DIFFGMM) from a parallel dataset of the source and target speakers.
K. Kobayashi, T. Toda, "sprocket: Open-Source Voice Conversion Software," Proc. Odyssey, pp. 203-210, June 2018. [paper]
T. Toda, "Hands on Voice Conversion," Speech Processing Courses in Crete (SPCC), July 2018. [slide]
This software was developed so that users can easily build VC systems simply by preparing a parallel dataset of the desired source and target speakers and running the example scripts. The following typical VC methods are implemented.
To make it easy to develop VC-based applications in Python 3, a VC library is also provided. It includes interfaces for acoustic feature analysis/synthesis, acoustic feature modeling, acoustic feature conversion, and waveform modification. For details of the VC library, please see the sprocket documentation (coming soon).
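As a rough illustration of what acoustic feature conversion means here, the sketch below maps source features to target features with the conditional mean of a joint Gaussian, then takes the difference from the source, which is the quantity a differential (vocoder-free) approach works with. It is a simplification under stated assumptions: it uses a single Gaussian instead of a mixture, relies only on NumPy, and does not use the sprocket API. The function names `fit_joint_gaussian`, `convert`, and `differential` are hypothetical, and the data are synthetic.

```python
import numpy as np


def fit_joint_gaussian(src, tgt):
    """Estimate the mean and covariance of joint vectors [x, y].

    src, tgt: arrays of shape (T, D) holding time-aligned (parallel) frames.
    """
    joint = np.hstack([src, tgt])            # shape (T, 2D)
    mean = joint.mean(axis=0)
    cov = np.cov(joint, rowvar=False)        # shape (2D, 2D)
    return mean, cov


def convert(src, mean, cov):
    """Return the conditional mean E[y | x] for each source frame."""
    d = src.shape[1]
    mu_x, mu_y = mean[:d], mean[d:]
    cov_xx, cov_yx = cov[:d, :d], cov[d:, :d]
    # A = cov_yx @ inv(cov_xx), computed with a linear solve
    # (valid because cov_xx is symmetric).
    a = np.linalg.solve(cov_xx, cov_yx.T).T
    return mu_y + (src - mu_x) @ a.T


def differential(src, mean, cov):
    """Return converted-minus-source features (the differential)."""
    return convert(src, mean, cov) - src


if __name__ == "__main__":
    # Synthetic "parallel data": target is a noisy linear function of source.
    rng = np.random.default_rng(0)
    src = rng.normal(size=(500, 3))
    w = np.array([[1.0, 0.2, 0.0], [0.0, 0.8, 0.1], [0.3, 0.0, 1.1]])
    tgt = src @ w.T + 0.5 + 0.01 * rng.normal(size=(500, 3))

    mean, cov = fit_joint_gaussian(src, tgt)
    converted = convert(src, mean, cov)
    print("mean abs error:", np.abs(converted - tgt).mean())
    print("differential shape:", differential(src, mean, cov).shape)
```

A real GMM-based system would use several mixture components weighted by their posterior probabilities, and the parallel frames would first need time alignment. Both steps are omitted here.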
Please use Python3.
pip install numpy==1.15.4 cython  # for dependency
pip install sprocket-vc
See VC example
For any questions or issues please visit:
Copyright (c) 2020 Kazuhiro KOBAYASHI
Released under the MIT license