Search results for: subtitle
Number of results: 1740
In this paper we describe the Japanese-English Subtitle Corpus (JESC). JESC is a large Japanese-English parallel corpus covering the underrepresented domain of conversational dialogue. It consists of more than 3.2 million examples, making it the largest freely available dataset of its kind. The corpus was assembled by crawling and aligning subtitles found on the web. The assembly process incorp...
We present a new major release of the OpenSubtitles collection of parallel corpora. The release is compiled from a large database of movie and TV subtitles and includes a total of 1689 bitexts spanning 2.6 billion sentences across 60 languages. The release also incorporates a number of enhancements in the preprocessing and alignment of the subtitles, such as the automatic correction of OCR erro...
Text condensation aims at shortening the length of an utterance without losing essential textual information. In this paper, we report on the implementation and preliminary evaluation of a sentence condensation tool for Greek using a manually constructed table of 450 lexical paraphrases, and a set of rules that delete syntactic subtrees that carry minor semantic information. Evaluation on two s...
The expansion of digital television and the increasing demand to manipulate audiovisual content underlie the need for tools and systems that will automate the multilingual subtitle generation process. In this setting the MUSA project aims at providing a system which combines speech recognition, advanced text analysis, and machine translation to help generate multilingual subtitles. In its curre...
The new subtitling system will be established along with digital television broadcasting. The system uses MPEG-2 multiplexing and region-based bitmap graphics (with indexed pixel colors) to transmit subtitle data to the set-top box. It provides more interactivity than existing analogue television, although that interactivity is limited. This paper gives a description of Digital Video...
Movie subtitle alignment is a potentially useful approach for automatically deriving parallel bilingual/multilingual spoken language data for automatic speech translation. In this paper, we consider the movie subtitle alignment task. We propose a distance metric between utterances of different languages based on lexical features derived from bilingual dictionaries. We use the dynamic time warpi...
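The last abstract above pairs a dictionary-based distance between utterances with dynamic time warping. Because the abstract is cut off, the paper's exact metric and recurrence are not visible here. The following is only a minimal sketch of the general idea under stated assumptions: the distance is taken to be one minus the fraction of source words with a dictionary translation present in the target utterance, and the names `utterance_distance`, `dtw_align`, and the toy dictionary are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def utterance_distance(src: str, tgt: str, dictionary: Dict[str, Set[str]]) -> float:
    """Assumed metric: 1 - share of source words whose translation occurs in tgt."""
    src_words = src.lower().split()
    tgt_words = set(tgt.lower().split())
    if not src_words:
        return 1.0
    hits = sum(1 for w in src_words if dictionary.get(w, set()) & tgt_words)
    return 1.0 - hits / len(src_words)


def dtw_align(
    src: List[str], tgt: List[str], dictionary: Dict[str, Set[str]]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW over two utterance sequences; returns (total cost, warping path)."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i source and first j target utterances
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = utterance_distance(src[i - 1], tgt[j - 1], dictionary)
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor cell.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy data, invented for illustration only.
toy_dictionary = {
    "hello": {"bonjour"},
    "friend": {"ami"},
    "thank": {"merci"},
    "you": {"vous", "tu"},
}
total, path = dtw_align(
    ["hello friend", "thank you"], ["bonjour ami", "merci"], toy_dictionary
)
```

Each pair `(i, j)` in `path` links source utterance `i` to target utterance `j`. A real subtitle aligner would likely also draw on subtitle timestamps and a much larger dictionary, neither of which is modeled in this sketch.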