Impact of durational outlier removal from unit selection catalogs
Authors
Abstract
Outlier removal is a straightforward technique for improving the quality of unit selection catalogs without hand correction. This paper investigates the use of phone durations as a criterion for removing bad units. Scoring conditioned on linguistic context is demonstrably better than statistics based on phone class alone. The impact of voice modification is evaluated with a 444K utterance test corpus.
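The abstract does not say how durations are scored, so the following is only a minimal sketch of the general idea, assuming a plain z-score test within each group. The field names (`phone`, `stress`, `dur_ms`), the function `flag_duration_outliers`, the threshold of 2.0, and the toy data are all hypothetical and are not taken from the paper. It contrasts grouping by phone alone with grouping by phone plus one linguistic context feature.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_duration_outliers(units, key, threshold=2.0):
    """Return sorted indices of units whose duration z-score, computed
    within the group selected by `key`, exceeds `threshold`.

    Assumption: a simple z-score rule; the paper's scoring may differ.
    """
    groups = defaultdict(list)
    for i, unit in enumerate(units):
        groups[key(unit)].append(i)

    flagged = []
    for indices in groups.values():
        durations = [units[i]["dur_ms"] for i in indices]
        if len(durations) < 2:
            continue  # too few samples to estimate a spread
        mu, sigma = mean(durations), pstdev(durations)
        if sigma == 0:
            continue  # all durations identical, nothing to flag
        for i in indices:
            if abs(units[i]["dur_ms"] - mu) / sigma > threshold:
                flagged.append(i)
    return sorted(flagged)


# Hypothetical toy catalog: one vowel in stressed and unstressed contexts.
stressed = [120, 122, 118, 121, 119, 120, 120]
unstressed = [50, 52, 48, 51, 49, 50, 50]
units = (
    [{"phone": "ae", "stress": 1, "dur_ms": d} for d in stressed]
    + [{"phone": "ae", "stress": 0, "dur_ms": d} for d in unstressed]
    # A unit labelled as stressed but far too short for that context.
    + [{"phone": "ae", "stress": 1, "dur_ms": 50}]
)

# Statistics pooled over the phone alone.
by_class = flag_duration_outliers(units, key=lambda u: u["phone"])
# Statistics conditioned on phone plus stress context.
by_context = flag_duration_outliers(units, key=lambda u: (u["phone"], u["stress"]))
```

On this toy data the pooled statistics flag nothing, because a 50 ms vowel looks ordinary when stressed and unstressed tokens are mixed, while the context-conditioned statistics flag the last unit. This only illustrates why conditioning can help; it does not reproduce the paper's results.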
Related resources
Finding outlier light-curves in catalogs of periodic variable stars
We present a methodology to discover outliers in catalogs of periodic light-curves. We use cross-correlation as a measure of “similarity” between two individual light-curves and then classify light-curves with lowest average “similarity” as outliers. We performed the analysis on catalogs of variable stars of known type from the MACHO and OGLE projects and established that our method correctly ide...
Outlier removal to uncover patterns in adverse drug reaction surveillance - a simple unmasking strategy.
PURPOSE: This study aimed to develop an algorithm for uncovering associations masked by extreme reporting rates, characterize the occurrence of masking by influential outliers in two spontaneous reporting databases, and evaluate the impact of outlier removal on disproportionality analysis. METHODS: We propose an algorithm that identifies influential outliers and carries out parallel analysis aft...
A new outlier removal approach for cDNA microarray normalization.
Normalization is a critical step in the analysis of microarray gene expression data. For dual-labeled arrays, traditional normalization methods assume that the majority of genes are non-differentially expressed and that the number of overexpressed genes approximately equals the number of under-expressed genes. However, these assumptions are inappropriate in some particular conditions. Differenti...
Robust detrending, rereferencing, outlier detection, and inpainting for multichannel data
Electroencephalography (EEG), magnetoencephalography (MEG) and related techniques are prone to glitches, slow drift, steps, etc., that contaminate the data and interfere with the analysis and interpretation. These artifacts are usually addressed in a preprocessing phase that attempts to remove them or minimize their impact. This paper offers a set of useful techniques for this purpose: robust d...
Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets
Many real-world medical datasets contain some proportion of missing (attribute) values. In general, missing value imputation can be performed to solve this problem, which is to provide estimations for the missing values by a reasoning process based on the (complete) observed data. However, if the observed data contain some noisy information or outliers, the estimations of the missing values may...