Learning Efficient Representations for Keyword Spotting with Triplet Loss
Authors
Abstract
In the past few years, triplet loss-based metric embeddings have become a de-facto standard for several important computer vision problems, most notably, person reidentification. On the other hand, in the area of speech recognition, the metric embeddings generated by the triplet loss are rarely used, even for classification problems. We fill this gap by showing that a combination of two representation learning techniques, a triplet loss-based embedding and a variant of kNN for classification instead of cross-entropy loss, significantly (by 26% to 38%) improves the classification accuracy of convolutional networks on LibriSpeech-derived LibriWords datasets. To do so, we propose a novel phonetic similarity based triplet mining approach. We also improve the current best published SOTA for the Google Speech Commands dataset: V1 10+2-class classification by about 34%, achieving 98.55% accuracy; V2 10+2-class classification by about 20%, achieving 98.37% accuracy; and V2 35-class classification by over 50%, achieving 97.0% accuracy.
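As a rough illustration of the two ingredients the abstract combines, the sketch below computes a hinge-style triplet loss on embedding vectors and classifies a query by majority vote over its nearest training embeddings. It is not the authors' implementation: the squared Euclidean distance, the margin of 0.2, k = 3, and the function names are all assumptions made for this example, and the paper's specific kNN variant and phonetic similarity based triplet mining are not reproduced here.

```python
from collections import Counter


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed value, not from the paper).
    """
    return max(sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin, 0.0)


def knn_predict(train_emb, train_labels, query, k=3):
    """Plain kNN: majority label among the k nearest training embeddings."""
    order = sorted(range(len(train_emb)), key=lambda i: sq_dist(query, train_emb[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D embeddings standing in for the output of a trained network.
train_emb = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
train_labels = ["yes", "yes", "no", "no"]

easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0])  # negative already far away
hard = triplet_loss([0.0, 0.0], [3.0, 0.0], [0.0, 1.0])  # positive farther than negative
label = knn_predict(train_emb, train_labels, [0.2, 0.1])
```

In a real system the embeddings would come from a convolutional network trained to minimize the triplet loss over mined triplets, and the kNN step would replace a softmax classifier at inference time.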
Similar resources
New efficient fillers for unlimited word recognition and keyword spotting
This paper describes our complete results for improved lexical fillers as well as two new kinds of fillers, gives their results in unlimited speech recognition as well as for keyword spotting and compares them to the acoustic-phonetic filler in the case of keyword spotting. Tests have been conducted on different vocabularies derived from ATIS and the Wall Street Journal database. Results for keyword s...
An Efficient Keyword Spotting Techni Language for Filler Mo
The task of keyword spotting is to detect a set of keywords in the input continuous speech. In a keyword spotter, not only the keywords, but also the non-keyword intervals must be modeled. For this purpose, filler (or garbage) models are used. To date, most keyword spotters have been based on hidden Markov models (HMMs). More specifically, a set of HMMs is used as garbage models. In this p...
Deep Residual Learning for Small-Footprint Keyword Spotting
We explore the application of deep residual learning and dilated convolutions to the keyword spotting task, using the recently-released Google Speech Commands Dataset as our benchmark. Our best residual network (ResNet) implementation significantly outperforms Google’s previous convolutional neural networks in terms of accuracy. By varying model depth and width, we can achieve compact models th...
Morphological Segmentation for Keyword Spotting
• We explore the impact of morphological segmentation on Keyword Spotting (KWS).
• Handling out-of-vocabulary (OOV) words is a major challenge in KWS; we aim to alleviate this problem by utilizing sub-word units.
• We augment a state-of-the-art KWS system with subword units derived from supervised and unsupervised morphological segmentations, and compare with phonetic and syllabic segmentatio...
Discriminative keyword spotting
This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve, as this quantity is the most common measure to evaluate keyword spotters. The keyword spotter we devise is based on nonlinearly mapping the input acoustic representat...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-87802-3_69