Realization of long sentences using chunking
Abstract
We propose sentence chunking as a way to reduce the time and memory costs of realizing long sentences. During chunking, we divide the semantic representation of a sentence into smaller components that can be processed and recombined without loss of information. Our meaning representation of choice is Dependency Minimal Recursion Semantics (DMRS). We show that realizing chunks of a sentence and combining the results increases coverage for long sentences, significantly reduces the resources required, and does not affect the quality of the realization.
Similar resources
Determining the boundaries and types of syntactic phrases in Persian texts
Text tokenization is the process of splitting text into meaningful tokens such as words, phrases, sentences, etc. Tokenization into syntactic phrases, known as chunking, is an important preprocessing step needed in many applications such as machine translation, information retrieval, text to speech, etc. In this paper, chunking of Farsi texts is done using statistical and learning methods and the grammat...
For the Proper Treatment of Long Sentences in a Sentence Pattern-based English-Korean MT System
This paper describes a sentence pattern-based English-Korean machine translation system backed up by a rule-based module as a solution to the translation of long sentences. A rule-based English-Korean MT system typically suffers from low translation accuracy for long sentences due to poor parsing performance. In the proposed method we only use chunking information on the phrase level of the parse...
Chunking Ability Shapes Sentence Processing at Multiple Levels of Abstraction
Several recent empirical findings have reinforced the notion that a basic learning and memory skill—chunking—plays a fundamental role in language processing. Here, we provide evidence that chunking shapes sentence processing at multiple levels of linguistic abstraction, consistent with a recent theoretical proposal by Christiansen and Chater (2016). Individual differences in chunking ability at...
Graph- and surface-level sentence chunking
The computing cost of many NLP tasks increases faster than linearly with the length of the representation of a sentence. For parsing, the representation is tokens, while for operations on syntax and semantics it will be more complex. In this paper we propose a new task of sentence chunking: splitting sentence representations into coherent substructures. Its aim is to make further processing of l...
Automatic Evaluation Method for Machine Translation Using Noun-Phrase Chunking
As described in this paper, we propose a new automatic evaluation method for machine translation using noun-phrase chunking. Our method correctly determines the matching words between two sentences using corresponding noun phrases. Moreover, our method determines the similarity between two sentences in terms of the noun-phrase order of appearance. Evaluation experiments were conducted to calcul...