Parse Fitting and Prose Fixing: Getting a Hold on Ill-Formedness
Abstract
Processing syntactically ill-formed language is an important mission of the EPISTLE system. Ill-formed input is treated by this system in various ways. Misspellings are highlighted by a standard spelling checker; syntactic errors are detected and corrections are suggested; and stylistic infelicities are called to the user's attention. Central to the EPISTLE processing strategy is its technique of fitted parsing. When the rules of a conventional syntactic grammar are unable to produce a parse for an input string, this technique can be used to produce a reasonable approximate parse that can serve as input to the remaining stages of processing. This paper first describes the fitting process and gives examples of ill-formed language situations where it is called into play. We then show how a fitted parse allows EPISTLE to carry on its text-critiquing mission where conventional grammars would fail either because of input problems or because of limitations in the grammars themselves. Some inherent difficulties of the fitting technique are also discussed. In addition, we explore how style critiquing relates to the handling of ill-formed input, and how a fitted parse can be used in style checking.
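The fallback idea in the abstract (use the conventional parse when one exists, otherwise assemble an approximate parse) can be illustrated with a small sketch. This is not the EPISTLE procedure, whose details the abstract does not give. It assumes a hypothetical list of already-recognized constituents given as (label, start, end) spans, a hypothetical function name `fit_parse`, a made-up root label "FITTED", and a simple widest-piece-first rule chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (label, start, end) with end exclusive.
Span = Tuple[str, int, int]


def fit_parse(tokens: List[str], constituents: List[Span]) -> Tuple[str, List[Span]]:
    """Return a full parse if one exists, else an approximate 'fitted' cover.

    Assumption: the widest-piece-first rule below is an illustrative
    heuristic, not the procedure described in the paper.
    """
    n = len(tokens)

    # A conventional parse succeeded if some S spans the whole input.
    for label, start, end in constituents:
        if label == "S" and start == 0 and end == n:
            return ("S", [(label, start, end)])

    # Otherwise cover the input left to right with the widest known pieces.
    pieces: List[Span] = []
    pos = 0
    while pos < n:
        candidates = [c for c in constituents if c[1] == pos and pos < c[2] <= n]
        if candidates:
            best = max(candidates, key=lambda c: c[2])
        else:
            # No constituent starts here: keep the token as an unknown piece.
            best = ("X", pos, pos + 1)
        pieces.append(best)
        pos = best[2]
    return ("FITTED", pieces)


if __name__ == "__main__":
    tokens = ["Example", "of", "a", "fragment"]
    spans = [("NP", 0, 1), ("PP", 1, 4), ("NP", 2, 4)]
    print(fit_parse(tokens, spans))
```

Because the result always covers every token, later stages (such as style checks) still receive a structure to work on even when no sentence-level parse was found.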
Similar resources
Parsing Heterogeneous Corpora with a Rich Dependency Grammar
Philologist: I need to parse Old French texts of different types (verse, prose, dialects etc.). Do I have to train separate parser models? Computational Linguist: You won’t lose much if you train the parser on all the data you have. P: I can’t do the training myself. What can I expect from existing parser models? C: If the training corpus contained 12th century verse texts, you are best prepare...
Integration of Syntactic, Semantic and Contextual Information in Processing Grammatically Ill-Formed Inputs
This paper describes an integrated method for processing grammatically ill-formed inputs. We use partial parses of the input for recovering from parsing failure. In order to select partial parses appropriate for error recovery, cost and reward are assigned to them. Cost and reward represent the badness and goodness of a partial parse, respectively. The most appropriate partial parse is selected ...
Kronelope in Ulitskaya's Short Stories
The features of the composition of small prose by L. Ulitskaya (on the example of the stories "Bronka", "Happy", "Poor Relatives") are analyzed. Particular attention is paid to the features of picturing time in stories, to the past which is sometimes more important for understanding the motives and characters of L. Ulitskaya’s heroes than the present and even...
The Good, the Bad and the Ugly: Well-Formedness of Live Sequence Charts
The Live Sequence Chart (LSC) language is a conservative extension of the well-known visual formalism of Message Sequence Charts. An LSC specification formally captures requirements on the inter-object behaviour in a system as a set of scenarios. As with many languages, there are LSCs which are syntactically correct but unsatisfiable due to internal contradictions. The authors of the original p...
Grammar Error Detection with Best Approximated Parse
In this paper, we propose that grammar error detection be disambiguated in generating the connected parse(s) of optimal merit for the full input utterance, in overcoming the cheapest error. The detected error(s) are described as violated grammatical constraints in a framework for Model-Theoretic Syntax (MTS). We present a parsing algorithm for MTS, which only relies on a grammar of well-formedne...
Journal title:
- American Journal of Computational Linguistics
Volume: 9
Publication date: 1983