Search results for: subtitle

Number of results: 1740

Journal: CoRR 2017
Reid Pryzant Yongjoo Chung Daniel Jurafsky Denny Britz

In this paper we describe the Japanese-English Subtitle Corpus (JESC). JESC is a large Japanese-English parallel corpus covering the underrepresented domain of conversational dialogue. It consists of more than 3.2 million examples, making it the largest freely available dataset of its kind. The corpus was assembled by crawling and aligning subtitles found on the web. The assembly process incorp...

2016
Pierre Lison Jörg Tiedemann

We present a new major release of the OpenSubtitles collection of parallel corpora. The release is compiled from a large database of movie and TV subtitles and includes a total of 1689 bitexts spanning 2.6 billion sentences across 60 languages. The release also incorporates a number of enhancements in the preprocessing and alignment of the subtitles, such as the automatic correction of OCR erro...

2008
Prokopis Prokopidis Vassia Karra Aggeliki Papagianopoulou Stelios Piperidis

Text condensation aims at shortening the length of an utterance without losing essential textual information. In this paper, we report on the implementation and preliminary evaluation of a sentence condensation tool for Greek using a manually constructed table of 450 lexical paraphrases, and a set of rules that delete syntactic subtrees that carry minor semantic information. Evaluation on two s...

2005
Stelios Piperidis Iason Demiros Prokopis Prokopidis

The expansion of digital television and the increasing demand to manipulate audiovisual content underlie the need for tools and systems that will automate the multilingual subtitle generation process. In this setting the MUSA project aims at providing a system which combines speech recognition, advanced text analysis, and machine translation to help generate multilingual subtitles. In its curre...

2002
Chengyuan Peng

A new subtitling system will be established along with digital television broadcasting. The system uses MPEG-2 multiplexing and region-based bitmap graphics (with indexed pixel colors) to transmit subtitle data to the set-top box. It provides more interactivity than existing analogue television, although that interactivity is limited. This paper gives a description of Digital Video...

2009
Andreas Tsiartas Prasanta Kumar Ghosh Panayiotis G. Georgiou Shrikanth S. Narayanan

Movie subtitle alignment is a potentially useful approach for automatically deriving parallel bilingual/multilingual spoken language data for automatic speech translation. In this paper, we consider the movie subtitle alignment task. We propose a distance metric between utterances of different languages based on lexical features derived from bilingual dictionaries. We use the dynamic time warpi...
