IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Sequential Voice Conversion Using Grid-Based Approximation
Authors
Abstract
The goal of voice conversion is to modify a source speaker's speech so that it sounds as if spoken by a target speaker. Common conversion methods are based on Gaussian Mixture Modeling (GMM), which requires exhaustive training (typically lasting hours) and often leads to ill-conditioning if the training dataset is too small. In addition, GMM training relies on a one-to-one match between source and target vectors, which requires time alignment. We propose a new conversion method that is trained in seconds, using either small- or large-scale datasets (50-200 sentences). It requires a parallel dataset, but no time alignment. The proposed Grid-Based (GB) method is based on sequential Bayesian tracking: the conversion process is expressed as a sequential estimation problem of tracking the target spectrum given the observed source spectrum. The converted MFCC vectors are evaluated sequentially as a weighted sum of the target training vectors, which serve as grid points. To improve the perceived quality of the synthesized signals, we use a postprocessing block that enhances the global variance. Objective and subjective evaluations show that the enhanced-GB method is comparable to classic GMM-based methods in terms of quality and comparable to their enhanced versions in terms of individuality.
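The sequential estimation described in the abstract can be illustrated with a generic grid-based Bayesian filter. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the transition or likelihood models, so the Gaussian kernels, the parameters `sigma_trans` and `sigma_obs`, and the per-grid-point source vectors `grid_sources` are all hypothetical choices made here for illustration. Only the final step, a weighted sum of target training vectors used as grid points, is taken from the abstract.

```python
import numpy as np


def grid_based_convert(source_frames, grid_targets, grid_sources,
                       sigma_obs=1.0, sigma_trans=1.0):
    """Convert source frames with a generic grid-based Bayesian filter.

    source_frames: (T, D) observed source feature vectors (e.g. MFCCs).
    grid_targets:  (K, D) target training vectors used as grid points.
    grid_sources:  (K, D) ASSUMPTION: a source-side vector associated with
                   each grid point, used only to build the likelihood.
    sigma_obs, sigma_trans: ASSUMPTION: Gaussian kernel widths.
    """
    num_frames = source_frames.shape[0]
    num_points, dim = grid_targets.shape

    # ASSUMPTION: transition probabilities between grid points follow a
    # Gaussian kernel on their squared distance, normalized per row.
    dist2 = ((grid_targets[:, None, :] - grid_targets[None, :, :]) ** 2).sum(-1)
    trans = np.exp(-0.5 * dist2 / sigma_trans ** 2)
    trans /= trans.sum(axis=1, keepdims=True)

    weights = np.full(num_points, 1.0 / num_points)  # uniform prior
    converted = np.empty((num_frames, dim))

    for t in range(num_frames):
        # Prediction step: propagate the previous weights through the
        # transition model.
        predicted = weights @ trans

        # Update step: ASSUMPTION of a Gaussian likelihood of the observed
        # source frame given each grid point. Subtracting the minimum
        # distance keeps the exponent numerically stable.
        obs2 = ((grid_sources - source_frames[t]) ** 2).sum(-1)
        likelihood = np.exp(-0.5 * (obs2 - obs2.min()) / sigma_obs ** 2)

        weights = predicted * likelihood
        total = weights.sum()
        if total > 0.0:
            weights = weights / total
        else:
            # Fall back to the likelihood alone if the product underflows.
            weights = likelihood / likelihood.sum()

        # Converted frame: weighted sum of the target grid points.
        converted[t] = weights @ grid_targets

    return converted
```

Because each converted frame is a convex combination of the grid points, the output always stays inside the range spanned by the target training vectors. This averaging tends to reduce the variance of the converted features, which is in line with the abstract's use of a postprocessing block for global variance.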
Similar resources
IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Fork Sequential Consistency is Blocking
We consider an untrusted server storing shared data on behalf of clients. We show that no storage access protocol can on the one hand preserve sequential consistency and wait-freedom when the server is correct, and on the other hand always preserve fork sequential consistency.
IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Erasure/List Exponents for Slepian-Wolf Decoding
IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Universal Decoding for Gaussian Intersymbol Interference Channels