Portability in the Janus Natural Language Interface
Abstract
Although natural language technology has achieved a high degree of domain independence through separating domain-independent modules from domain-dependent knowledge bases, portability, as measured by effort to move from one application to another, is still a problem. Here we describe a knowledge acquisition tool (KNACQ) that has sharply decreased our effort in building knowledge bases. The knowledge bases acquired with KNACQ are used by both the understanding components and the generation components of Janus.

INTRODUCTION: MOTIVATION

Portability is measurable by the person-effort expended to achieve a pre-specified degree of coverage, given an application program. Factoring an NL system into domain-dependent and domain-independent modules is now part of the state of the art; therefore, the challenge in portability is reducing the effort needed to create domain-dependent modules. For us, those are the domain-dependent knowledge bases, e.g., lexical syntax, lexical semantics, domain models, and transformations specific to the target application system.

Our experience in installing our natural language interface as part of DARPA's Fleet Command Center Battle Management Program (FCCBMP) illustrates the kind of portability needed if NL applications (or products) are to become widespread. We demonstrated broad linguistic coverage across 40 fields of a large Oracle database, the Integrated Data Base (IDB), in August 1986. A conclusion was that the state of the art in understanding was adequate. However, the time and cost needed to cover all 400 fields of the IDB in 1986 and the more than 850 fields today would have been prohibitive without a breakthrough in knowledge acquisition and maintenance tools.

We have developed a suite of tools to greatly increase our productivity in porting BBN's Janus NL understanding and generation system to new domains. KREME [Abrett, 1987] enables creating, browsing, and maintaining taxonomic knowledge bases.
IRACQ [Ayuso, 1987] supports learning lexical semantics from examples with only one unknown word. Both of those tools were used in preparing the FCCBMP demonstration in 1986. What was missing was a way to rapidly infer the knowledge bases for the overwhelming majority of words used in accessing fields. Then one could bootstrap using IRACQ to acquire more complex lexical items. We have developed and used such a tool called KNACQ (for KNowledge ACQuisition). The efficiency we have experienced results from (1) identifying regularities in expression corresponding to domain model structures and (2) requiring little information from the user to identify expressions corresponding to those regularities.

1 This research was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by ONR under Contracts N00014-85-C-0079 and N00014-85-C-0016. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the U.S. Government.
Similar Resources
Edite - A Natural Language Interface to Databases A new dimension for an old approach
This article presents the Edite system, a Natural Language Interface for Databases (NLIDB), that tries to explore the advantages of joining natural language processing with the expressiveness of graphical interfaces. In order to guarantee a permanent adaptation of this type of solution to a dynamic domain one should consider two critical fundamental factors: extensibility and portability. An ov...
Multiple Underlying Systems: Translating User Requests into Programs to Produce Answers
A user may typically need to combine the strengths of more than one system in order to perform a task. In this paper, we describe a component of the Janus natural language interface that translates intensional logic expressions representing the meaning of a request into executable code for each application program, chooses which combination of application systems to use, and designs the transfe...
Research and Development in Natural Language Understanding
Brief Summary of Objectives: There are three objectives of the contract: to perform research and development in parallel parsing, semantic representation, ill-formed input, discourse, and tools for linguistic knowledge acquisition, and to integrate software components from BBN and elsewhere to produce Janus, DARPA's New Generation Natural Language Interface, and to demonstrate state-of-the-art n...
Dataset for a Neural Natural Language Interface for Databases (NNLIDB)
Progress in natural language interfaces to databases (NLIDB) has been slow mainly due to linguistic issues (such as language ambiguity) and domain portability. Moreover, the lack of a large corpus to be used as a standard benchmark has made data-driven approaches difficult to develop and compare. In this paper, we revisit the problem of NLIDBs and recast it as a sequence translation problem. To ...
Usable, Real-Time, Interactive Spoken Language Systems
The primary objective of this project is to develop a robust, high-performance, domain-independent spoken language system. This system, termed HARC (Hear and Respond to Continuous speech), is composed of the BYBLOS speech recognition system and the DELPHI natural language understanding system. The goal is to develop systems that exhibit the following advances: high-accuracy speech understanding...