Many-to-Many Voice Transformer Network
Authors
Abstract
This paper proposes a voice conversion (VC) method based on a sequence-to-sequence (S2S) learning framework, which enables simultaneous conversion of the voice characteristics, pitch contour, and duration of input speech. We previously proposed an S2S-based VC method using a transformer network architecture called the voice transformer network (VTN). The original VTN was designed to learn only a mapping of speech feature sequences from one speaker to another. Here, the main idea we propose is an extension of the original VTN that can simultaneously learn mappings among multiple speakers. This extension, called the many-to-many VTN, enables us to fully use available training data collected from multiple speakers by capturing common latent features that can be shared across different speakers. It also allows us to introduce a training loss called the identity mapping loss to ensure that the input feature sequence will remain unchanged when the source and target speaker indices are the same. Using this particular loss for model training has been found to be extremely effective in improving the performance of the model at test time. We conducted speaker identity conversion experiments and found that our model obtained higher sound quality and speaker similarity than baseline methods. We also found that our model, with a slight modification to its architecture, could handle any-to-many conversion tasks reasonably well.
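The identity mapping loss described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the L1 distance, the function names, and the toy stand-in for the conversion model are all assumptions made for illustration, and the sketch assumes the converted sequence has the same length as the input.

```python
from typing import Callable, List

Frame = List[float]        # one acoustic feature vector
Sequence_ = List[Frame]    # a sequence of feature vectors
# Hypothetical model signature: (features, source_index, target_index) -> features
Model = Callable[[Sequence_, int, int], Sequence_]


def l1_distance(a: Sequence_, b: Sequence_) -> float:
    """Mean absolute difference between two feature sequences of equal shape."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of frames")
    total, count = 0.0, 0
    for frame_a, frame_b in zip(a, b):
        for x_a, x_b in zip(frame_a, frame_b):
            total += abs(x_a - x_b)
            count += 1
    return total / count if count else 0.0


def identity_mapping_loss(model: Model, features: Sequence_, speaker: int) -> float:
    """Penalize any change to the input when source and target speaker are the same.

    The choice of L1 distance is an assumption of this sketch.
    """
    converted = model(features, speaker, speaker)
    return l1_distance(converted, features)


def toy_model(features: Sequence_, source: int, target: int) -> Sequence_:
    """Hypothetical stand-in for a many-to-many converter.

    It shifts every value by (target - source), so it leaves the input
    unchanged whenever the two speaker indices match.
    """
    offset = float(target - source)
    return [[x + offset for x in frame] for frame in features]


features = [[1.0, 2.0], [3.0, 4.0]]
print(identity_mapping_loss(toy_model, features, speaker=2))
```

In training, a term like this would be added to the main conversion loss, so that a single model shared across speakers is discouraged from altering speech that already belongs to the target speaker.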
Similar resources
Many-to-many eigenvoice conversion with reference voice
In this paper, we propose many-to-many voice conversion (VC) techniques to convert an arbitrary source speaker’s voice into an arbitrary target speaker’s voice. We have proposed one-to-many eigenvoice conversion (EVC) and many-to-one EVC. In the EVC, an eigenvoice Gaussian mixture model (EV-GMM) is trained in advance using multiple parallel data sets of a reference speaker and many pre-stored sp...
Many-to-many voice conversion based on multiple non-negative matrix factorization
We present in this paper an exemplar-based Voice Conversion (VC) method using Non-negative Matrix Factorization (NMF), which is different from conventional statistical VC. NMF-based VC has advantages of noise robustness and naturalness of converted voice compared to Gaussian Mixture Model (GMM)-based VC. However, because NMF-based VC is based on parallel training data of source and target speake...
Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion
In this paper, we evaluate our proposed singing voice conversion method from various perspectives. To enable singers to freely control the timbre of their singing voice, we have proposed a singing voice conversion method based on many-to-many eigenvoice conversion (EVC) that makes it possible to convert the voice timbre of an arbitrary source singer into that of another arbitrary target singer using a p...
Linear transformation approaches to many-to-one voice conversion
In this paper, we present linear transformation algorithms for many-to-one voice conversion (VC). Many-to-one VC is a technique for converting an arbitrary source speaker’s voice into the target speaker’s voice. A conversion model previously developed between many prestored source speakers and the target speaker is adapted to a new source speaker in an unsupervised manner. In this study, w...
Many-to-Many Graph Matching
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2021
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2020.3047262