A Word Database for Natural Language Processing
Authors
Abstract
The intent of this paper is to show some aspects of a computer dictionary geared towards the natural language component of an expert system. The dictionary is organized as a database to integrate the various aspects of lexicographic work and, at the same time, enable fast access from a parser. Work on the lexicon was long neglected both in theoretical linguistics and in natural language processing projects, so we felt that a principled approach was overdue (cf. Sedelow (1985) for a survey of related work). In the past two years, we have therefore concentrated on formulating criteria for establishing the syntactic features that have to be coded in the lexicon, and we report here on some of our findings. This is preceded by a brief overview of the aims of our overall project and a short description of the prototype system we are building. We then describe the design of our lexicographic database, including the criteria for selecting sources of the vocabulary and some of our tools for editing and querying.
Similar resources
First Language Activation during Second Language Lexical Processing in a Sentential Context
Lexicalization patterns, the way words are mapped onto concepts, differ from one language to another. This study investigated the influence of first language (L1) lexicalization patterns on the processing of second language (L2) words in sentential contexts by both less proficient and more proficient Persian learners of English. The focus was on cases where two different senses of a polys...
Word Class Functions for Syntactic-Semantic Analysis
Appeared in Proceedings of the 2nd International Conference on Recent Advances in Natural Language Processing (RANLP’97), pp. 312–317, 1997. In this paper, Analysis with Word Class Functions (WCFA) is presented as a paradigm for syntactic-semantic analysis of natural language. The main characteristics of this approach are: word-orientation, the central role of word class functions, two phases o...
Generic Text Processing: A Progress Report
A generic natural language system, without modification, can effectively analyze an arbitrary input at least to the level of word sense tagging. Considerable research has addressed the transportability of natural language systems, but not generic text processing capabilities. For example, previous DARPA-sponsored work [1, 2] produced transportable interfaces to database systems. Each new applic...
Design and Implementation of Association Rules Based System for Evaluating WSD
In this paper we present a new method that uses association rule mining techniques in the field of natural language processing. Word sense disambiguation is persistently a central and challenging problem, given the increasing usage of the internet in daily life. Every user has some queries that have to be searched on the internet. A transactional database is created after preprocessing the text files. Ambiguous...
A Query Language for WordNet-Like Lexical Databases
WordNet-like lexical databases are used in many natural language processing tasks, such as word sense disambiguation, information extraction and sentiment analysis. The paper discusses the problem of querying such databases. The types of queries specific to WordNet-like databases are analyzed and previous approaches that were undertaken to query wordnets are discussed. A query language which in...
Recognition of Persian Texts Using an n-gram Language Model and Grammatical Refinement
Text recognition has been one of the growing research topics in recent years. Much of this research has focused on the recognition of letters and sub-words as a basis for identifying larger text structures such as words, phrases and sentences. This thesis presents a new method in which the recognized sub-words are combined in order to provide meaningful words and sentences in Farsi tex...