Pitch adaptive features for LVCSR
Abstract
We have investigated the use of a pitch adaptive spectral representation for large vocabulary speech recognition, in conjunction with speaker normalisation techniques. By decoupling the two components of STRAIGHT, we have compared the effect of the smoothed spectrogram with that of the pitch adaptive spectral analysis. Experiments on a large vocabulary meeting speech recognition task highlight the importance of combining a pitch adaptive spectral representation with a conventional fixed window spectral analysis. We found evidence that STRAIGHT pitch adaptive features are more speaker independent than conventional MFCCs without pitch adaptation, and that they consequently perform better when combined using feature combination techniques such as Heteroscedastic Linear Discriminant Analysis.
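The feature combination step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain LDA (a single shared within-class covariance) as a simpler stand-in for Heteroscedastic LDA, which drops the equal-covariance assumption and estimates the transform by maximum likelihood. The two feature streams, their dimensions, the class labels, and the function name `lda_projection` are all hypothetical and generated synthetically here.

```python
import numpy as np


def lda_projection(features, labels, out_dim):
    """Return a (dim, out_dim) plain-LDA projection matrix.

    Assumption: plain LDA with a shared within-class covariance,
    used here as a simplified stand-in for HLDA.
    """
    dim = features.shape[1]
    global_mean = features.mean(axis=0)
    s_within = np.zeros((dim, dim))
    s_between = np.zeros((dim, dim))
    for c in np.unique(labels):
        x = features[labels == c]
        mu = x.mean(axis=0)
        centred = x - mu
        s_within += centred.T @ centred
        diff = (mu - global_mean)[:, None]
        s_between += len(x) * (diff @ diff.T)
    # Whiten the within-class scatter, then keep the directions with the
    # largest between-class scatter in the whitened space.
    evals_w, evecs_w = np.linalg.eigh(s_within)
    whiten = evecs_w / np.sqrt(evals_w)  # scales each eigenvector column
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ s_between @ whiten)
    order = np.argsort(evals_b)[::-1][:out_dim]
    return whiten @ evecs_b[:, order]


# Synthetic stand-ins for the two streams (hypothetical 13-dim features).
rng = np.random.default_rng(0)
n_frames, n_classes = 2000, 20
labels = rng.integers(0, n_classes, size=n_frames)
means_fixed = rng.normal(size=(n_classes, 13))
means_pitch = rng.normal(size=(n_classes, 13))
stream_fixed = rng.normal(size=(n_frames, 13)) + means_fixed[labels]
stream_pitch = rng.normal(size=(n_frames, 13)) + means_pitch[labels]

# Concatenate the streams frame by frame, then project to a lower dimension.
combined = np.hstack([stream_fixed, stream_pitch])  # shape (2000, 26)
projection = lda_projection(combined, labels, out_dim=13)
reduced = combined @ projection  # shape (2000, 13)
```

In a real system the labels would typically come from a forced alignment to HMM states, and the projected features would replace the concatenated ones as input to the acoustic model.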
Similar resources
Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system
This paper aims at investigating the use of TANDEM features based on hierarchical processing of the modulation spectrum. The study is done in the framework of the GALE project for recognition of Mandarin Broadcast data. We describe the improvements obtained using the hierarchical processing and the addition of features like pitch and short-term critical band energy. Results are consistent with ...
Hybrid HMM/BN LVCSR system integrating multiple acoustic features
In current HMM based speech recognition systems, it is difficult to supplement acoustic spectrum features with additional information such as pitch, gender, articulator positions, etc. On the other hand, Dynamic Bayesian Networks (DBN) allow for easy combination of different features and make use of conditional dependencies between them. However, lack of efficient algorithms has prevented their...
A Study on Speaker Normalized MLP Features in LVCSR
Different normalization methods are applied in recent Large Vocabulary Continuous Speech Recognition Systems (LVCSR) to reduce the influence of speaker variability on the acoustic models. In this paper we investigate the use of Vocal Tract Length Normalization (VTLN) and Speaker Adaptive Training (SAT) in Multi Layer Perceptron (MLP) feature extraction on an English task. We achieve significant...
Classification of Iranian Traditional Music Dastgahs Using Features Based on Pitch Frequency
Iranian traditional music is composed of seven major Dastgahs: Chahargah, Homayoun, Mahour, Segah, Shour, Nava, and Rast-Panjgah. In this paper, a new algorithm for the classification of the Iranian traditional music Dastgahs based on pitch frequency is proposed. In this algorithm, the features of Lagrange coefficients of pitch logarithm (LCPL), Fuzzy similarity sets type 2 (FSST2), and th...
Different Types of Pitch Angle Control Strategies Used in Wind Turbine System Applications
The most common controller in wind turbines is blade pitch angle control, which is used to obtain the desired power. Controlling the pitch angle in wind turbines has a direct impact on the dynamic performance of the machine and on fluctuations in the power system. Due to constant changes in wind speed, wind turbines are nonlinear, multivariable systems. The design of a controller that can ad...