Deep Least Squares Regression for Speaker Adaptation
Authors
Abstract
Speaker adaptation methods for deep neural networks (DNNs) have recently been widely studied for automatic speech recognition. However, almost all adaptation methods for DNNs must contend with heuristic settings such as mini-batch size, learning rate schedule, stopping criterion, and initialization, because they rely on stochastic gradient descent (SGD)-based training. Unfortunately, these settings are hard to tune properly. To alleviate these difficulties, we propose a least squares regression-based speaker adaptation method in a DNN framework that uses the posterior mean of each class. We also show that the proposed method yields a unique solution that is easy and fast to compute without SGD. The proposed method was evaluated on the TED-LIUM corpus. Experimental results showed that it achieved up to a 4.6% relative improvement over a speaker-independent DNN. In addition, we report a further performance improvement when the proposed method is used with speaker-adapted features.
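The contrast with SGD can be illustrated with a generic closed-form least squares fit. The sketch below is not the paper's formulation, which the abstract does not spell out. It assumes, purely for illustration, that adaptation means learning an affine map from a speaker's feature vectors to per-class mean vectors; the names `fit_affine_least_squares`, `apply_affine`, and `ridge`, as well as the synthetic data, are hypothetical.

```python
import numpy as np


def fit_affine_least_squares(X, T, ridge=1e-3):
    """Return A minimizing ||[X, 1] A - T||^2 + ridge * ||A||^2 in closed form."""
    n = X.shape[0]
    X_aug = np.hstack([X, np.ones((n, 1))])  # append a bias column
    gram = X_aug.T @ X_aug + ridge * np.eye(X_aug.shape[1])
    # With ridge > 0 the Gram matrix is positive definite, so the solution is unique.
    return np.linalg.solve(gram, X_aug.T @ T)


def apply_affine(X, A):
    """Apply the fitted affine map to feature vectors X."""
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    return X_aug @ A


# Synthetic stand-in data (assumption): each frame's target is its class mean,
# and the speaker's features are an affine distortion of that target plus noise.
rng = np.random.default_rng(0)
num_classes, dim, frames = 10, 4, 200
class_means = rng.normal(size=(num_classes, dim))
labels = rng.integers(0, num_classes, size=frames)
targets = class_means[labels]
distortion = np.eye(dim) + 0.1 * rng.normal(size=(dim, dim))
offset = rng.normal(size=dim)
X = targets @ distortion + offset + 0.01 * rng.normal(size=(frames, dim))

A = fit_affine_least_squares(X, targets)
err_before = np.mean((X - targets) ** 2)
err_after = np.mean((apply_affine(X, A) - targets) ** 2)
print(f"mean squared error before: {err_before:.4f}, after: {err_after:.4f}")
```

The fit is a single linear solve, so there is no mini-batch size, learning rate schedule, stopping criterion, or initialization to tune; the only free parameter in this sketch is the regularization weight.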
Similar resources
Rapid Speaker Adaptation Using
This paper examines an approach to speaker adaptation called speaker cluster weighting (SCW) for rapid adaptation in the Jupiter weather information system. SCW extends the ideas of previous speaker cluster techniques by allowing the speaker cluster models (learned from training data) to be adaptively weighted to match the current speaker. We explore strategies for automatic speaker clustering ...
Rapid speaker adaptation using speaker clustering
This paper examines an approach to speaker adaptation called speaker cluster weighting (SCW) for rapid adaptation in the Jupiter weather information system. SCW extends the ideas of previous speaker cluster techniques by allowing the speaker cluster models (learned from training data) to be adaptively weighted to match the current speaker. We explore strategies for automatic speaker clustering ...
Rapid Speaker Adaptation With Speaker Clustering
This thesis addresses the problem of rapid speaker adaptation. This is the task of altering the parameters of a speaker independent speech recognition system so as to make that system look more like a speaker dependent system using a very small amount (<10 seconds) of speaker specific data. The approach to speaker adaptation taken in this work is called speaker cluster weighting (SCW). SCW extend...
Efficient MLLR
The need for close to real time speech recognition has recently driven interest in fast LVCSR systems. Due to the time constraint, such systems often discard, where possible, sub-processes of the entire recognition process which demand relatively large amounts of computation and yield relatively small accuracy gains. This report focusses on such speed-accuracy tradeoffs with regard to speaker a...
Using class weighting in inter-class MLLR
A new adaptation method called inter-class MLLR has recently been introduced. Inter-class MLLR utilizes relationships among different transformation functions to achieve more reliable estimates of MLLR parameters across multiple classes, and it produces lower word error rates (WER) than conventional MLLR in circumstances where very little speaker-specific adaptation data are available. This pap...