Query-by-Example Spoken Term Detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion
نویسندگان
چکیده
Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. Nowadays, it has been receiving much interest due to the high volume of information stored in audio or audiovisual format. QbE STD differs from automatic speech recognition (ASR) and keyword spotting (KWS)/spoken term detection (STD) since ASR is interested in all the terms/words that appear in the speech signal and KWS/STD relies on a textual transcription of the search term to retrieve the speech data. This paper presents the systems submitted to the ALBAYZIN 2012 QbE STD evaluation held as a part of ALBAYZIN 2012 evaluation campaign within the context of the IberSPEECH 2012 Conferencea. The evaluation consists of retrieving the speech files that contain the input queries, indicating their start and end timestamps within the appropriate speech file. Evaluation is conducted on a Spanish spontaneous speech database containing a set of talks from MAVIR workshopsb, which amount at about 7 h of speech in total. We present the database metric systems submitted along with all results and some discussion. Four different research groups took part in the evaluation. Evaluation results show the difficulty of this task and the limited performance indicates there is still a lot of room for improvement. The best result is achieved by a dynamic time warping-based search over Gaussian posteriorgrams/posterior phoneme probabilities. This paper also compares the systems aiming at establishing the best technique dealing with that difficult task and looking for defining promising directions for this relatively novel task.
منابع مشابه
The LF Query-by-Example Spoken Term Detection system for the ALBAYZIN 2016 evaluation
Query-by-Example Spoken Term Detection (QbE-STD) is the task of finding occurrences of a spoken query in a repository of audio documents. In the last years, this task has become particularly appealing, mostly due to its flexibility that allows, for instance, to deal with lowresourced languages for which no Automatic Speech Recognition (ASR) system can be built. This paper reports experimental r...
متن کاملMediaEval 2013 Spoken Web Search Task: System Performance Measures
This document discusses how to measure system performance in the Spoken Web Search (SWS) task at MediaEval 2013. The discussion is based on different sources, including the NIST 2006 Spoken Term detection (STD) Evaluation Plan [1], the NIST 2010 Speaker Recognition Evaluation (SRE) Plan [2], the description of the scoring criteria applied in the SWS task at Mediaeval 2012 [3], the Albayzin 2012...
متن کاملThe LF Language Recognition System for Albayzin 2012 Evaluation
This document presents a description of INESC-ID’s Spoken Language Systems Laboratory (LF) systems submitted to the Albayzin 2012 Language Recognition evaluation. The submitted systems differ on the number of sub-systems selected for fusion and the back-end configuration. The basic set of sub-systems considered are four conventional phonotactic sub-systems based on n-gram modelling of phoneme s...
متن کاملALBAYZIN 2016 spoken term detection evaluation: an international open competitive evaluation in Spanish
Within search-on-speech, Spoken Term Detection (STD) aims to retrieve data from a speech repository given a textual representation of a search term. This paper presents an international open evaluation for search-on-speech based on STD in Spanish and an analysis of the results. The evaluation has been designed carefully so that several analyses of the main results can be carried out. The evalua...
متن کاملTUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
This working paper provides the basic information about experiments conducted on audio documents within the MediaEval 2012 spoken web search evaluation project. The main purpose of these experiments was to build a robust and language independent system for spoken term detection. Therefore we have proposed query-by-example searching system based on the minimum-cost alignment of DTW algorithm and...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- EURASIP J. Audio, Speech and Music Processing
دوره 2013 شماره
صفحات -
تاریخ انتشار 2013