Project EPISTLE: A System for the Automatic Analysis of Business Correspondence
Abstract
The developing system described here is planned to provide the business executive with useful applications for the computer processing of correspondence in the office environment. Applications will include the synopsis and abstraction of incoming mail and a variety of critiques of newly-generated letters, all based upon the capability of understanding the natural language text at least to a level corresponding to customary business communication. Successive sections of the paper describe the Background and Prior Work, the planned System Output, and Implementation.

I. BACKGROUND AND PRIOR WORK

We conclude from these behavioral findings that there are indeed extensive regularities in the characteristics of business letters, determined primarily by the purpose objectives. It is these constraints that most strongly indicate to us the feasibility of developing automatic means for recognizing content-themes and purposes from the letter text (as well as the converse, generating letter text from information about purposes).

Other analyses have been undertaken to estimate the linguistic complexity and regularities of the texts. The average letter appears to contain 8 sentences, with an average of 18 words each; in the 400 letter-bodies there are roughly 57,900 words and 4,500 unique words in total. An ongoing hand analysis of the syntactic structure of sentences in a 50-letter sample reveals a relatively high frequency of subject-verb inversions (about 1 per letter) and complex lengthy complementizers (1-4 per letter). These features, along with very frequent noun phrase and sentence coordination, accompanied by a wide variety of grammatical but unsystematic structure deletions, indicate an exceptionally high level of grammatical complexity of our texts.
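As a rough illustration of how corpus figures like those above (sentences per letter, words per sentence, total and unique words) might be gathered, the Python sketch below counts them over a list of letter bodies. The function name `corpus_stats`, the regex-based sentence and word splitting, and the two sample letters are all assumptions made for illustration; the source does not say how its counts were produced.

```python
import re

def corpus_stats(letters):
    """Summarize a list of letter bodies (hypothetical helper, not from the paper)."""
    sentence_count = 0
    words = []
    for body in letters:
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", body.strip()) if s]
        sentence_count += len(sentences)
        for sentence in sentences:
            # Naive tokenization: lowercase runs of letters and apostrophes.
            words.extend(re.findall(r"[A-Za-z']+", sentence.lower()))
    return {
        "sentences_per_letter": sentence_count / len(letters),
        "words_per_sentence": len(words) / sentence_count,
        "total_words": len(words),
        "unique_words": len(set(words)),
    }

# Two made-up letters, used only to exercise the function.
sample = [
    "Thank you for your letter. We will ship the order on Monday.",
    "Please send the invoice. Thank you.",
]
stats = corpus_stats(sample)
print(stats)
```

A real corpus would need a more careful sentence splitter (abbreviations, salutations, and addresses all break the naive rule used here).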
With respect to overall text syntax we have analyzed 10 letters for text cohesion, using a modification of Halliday and Hasan’s coding scheme [4]; 82 percent of the instances of cohesion detected were accounted for by 4 categories: lexical repetitions (29%), pronouns (28%), nominal substitutions (9%, e.g., “one”, “same”), and lexical collocations (words related via their semantics, 16%). In an extension of this discourse structure analysis we are analyzing 50 letters, coding all occurrences of functional nouns in terms of (1) the grammatical case function served and (2) the cohesive relation to prior nouns. Preliminary results indicate consistent patterns of case-shift and type of cohesion as a function of the pragmatic and content themes. The results of these linguistic analyses will help determine the strategy ultimately adopted for selecting surface parses and meaning interpretations.
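The category percentages reported for the cohesion analysis can be read as a simple tally over hand-coded instances. The sketch below shows one way such a tally could be computed in Python. The function name `cohesion_shares`, the category labels, and the list of 100 coded instances are hypothetical: the counts are synthetic, chosen only so that the shares match the percentages quoted above, and the "other" bucket stands in for the remaining 18 percent that the text does not break down.

```python
from collections import Counter

def cohesion_shares(codes):
    """Return each cohesion category's share of all coded instances, in whole percent."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}

# Synthetic coding of 100 cohesion instances (NOT the paper's data).
codes = (
    ["lexical_repetition"] * 29
    + ["pronoun"] * 28
    + ["nominal_substitution"] * 9
    + ["lexical_collocation"] * 16
    + ["other"] * 18
)
shares = cohesion_shares(codes)

# Combined share of the four categories named in the text.
top_four = sum(
    shares[c]
    for c in ("lexical_repetition", "pronoun", "nominal_substitution", "lexical_collocation")
)
print(shares, top_four)
```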
Similar resources
"Natural Language Texts are not necessarily Grammatical and Unambiguous or even Complete."
The EPISTLE system is being developed in a research project for exploring the feasibility of a variety of intelligent applications for the processing of business and office text (!'Z; the authors of are the project workers). Although ultimately intended functions include text generation (e.g., 4), present efforts focus on text analysis: developing the capability to take in essentially unconstra...
The EPISTLE Text-Critiquing System
gent" functions for processing business correspondence and other texts in an office environment. This paper focuses on the initial objectives of the system: critiquing written material on points of grammar and style. The overall system is described, with some details of the implementation, the user interface, and the three levels of processing, especially the syntactic parsing of sentences wi...
APPLICATION OF DEA FOR SELECTING MOST EFFICIENT INFORMATION SYSTEM PROJECT WITH IMPRECISE DATA
Selecting the best Information System (IS) project from many competing proposals is a critical business activity that benefits all organizations. Previous IS project selection methods are useful but have restricted application because they handle only cases with precise data. Indeed, these methods are based on precise data with less emphasis on imprecise data. This paper pro...
Risk Analysis in E-commerce via Fuzzy Logic
This paper describes the development of a fuzzy decision support system (FDSS) for the assessment of risk in E-commerce (EC) development. A Web-based prototype FDSS is suggested to assist EC project managers in identifying potential EC risk factors and the corresponding project risks. A risk analysis model for EC development using a fuzzy set approach is proposed and incorporated into the FDSS....
Critical Success Factors for Business Intelligence Implementation in an Enterprise Resource Planning System Environment Using DEMATEL: A Case Study at a Cement Manufacture Company in Indonesia
This paper is aimed at evaluating critical success factors in Business Intelligence (BI) implementation in an Enterprise Resource Planning (ERP) environment. The data analysis method used in this paper is the Decision Making Trial and Evaluation Laboratory Model (DEMATEL). The study has been conducted on a cement manufacturing strategic holding company that has implemented ERP since 2010. This ...