was present at a colloquium entitled "Human-Machine Communication
Abstract
Optimism is growing that the near future will witness rapid growth in human-computer interaction using voice. System prototypes have recently been built that demonstrate speaker-independent, real-time speech recognition and understanding of naturally spoken utterances with vocabularies of 1000 to 2000 words and larger. Already, computer manufacturers are building speech recognition subsystems into their new product lines. However, before this technology can be broadly useful, a substantial knowledge base is needed about human spoken language and performance during computer-based spoken interaction. This paper reviews application areas in which spoken interaction can play a significant role, assesses potential benefits of spoken interaction with machines, and compares voice with other modalities of human-computer interaction. It also discusses information that will be needed to build a firm empirical foundation for the design of future spoken and multimodal interfaces. Finally, it argues for a more systematic and scientific approach to investigating spoken input and performance with future language technology.
From the beginning of the computer era, futurists have dreamed of the conversational computer, a machine that we could engage in natural spoken conversation. For instance, Turing's famous test of computational intelligence imagined a computer that could conduct such a fluent English conversation that people could not distinguish it from a human. Despite prolonged research and many notable scientific and technological achievements, there have been few real human-computer dialogues until recently, and those that exist have been keyboard exchanges rather than spoken ones. This situation has begun to change, however. Steady progress in speech recognition and natural language processing technologies, supported by dramatic advances in computer hardware, has enabled laboratory prototype systems with which one can conduct simple question-answering dialogues.
Although far from human-level conversation, this initial capability is generating considerable optimism for the future of human-computer interaction using voice. This paper aims to identify applications for which spoken interaction is advantageous, to clarify the role of voice with respect to other modalities of human-computer interaction, and to consider obstacles to the successful development and commercialization of spoken language systems.
Two general sorts of speech input technology are considered. First, we survey a number of existing applications of speech recognition technologies, for which the system identifies the words spoken but need not understand the meaning of what is being said. Second, we concentrate on applications that will require a more complete understanding of the speaker's intended meaning, examining future spoken dialogue systems. Finally, we discuss how such speech understanding will play a role in future human-computer interactions, particularly those involving the coordinated use of multiple communication modalities, such as graphics, handwriting, and gesturing.
It is argued that progress has been impeded by the lack of adequate scientific knowledge about human spoken interactions, especially with computers. Such a knowledge base is essential to the development of well-founded human-interface guidelines that can assist system designers in producing successful applications incorporating spoken interaction. Given recent technological developments, the field is now in a position to systematically expand that knowledge base.
WHEN IS SPEAKING TO COMPUTERS USEFUL?
As yet, there is no theory or categorization of tasks and environments that would predict, all else being equal, when voice would be a preferred modality of human-computer communication.
Still, a number of situations have been identified in which spoken communication with machines may be advantageous:
* When the user's hands or eyes are busy
* When only a limited keyboard and/or screen is available
* When the user is disabled
* When pronunciation is the subject matter of computer use
* When natural language interaction is preferred
We briefly examine the present and future roles of spoken interaction with computers for these environments. Because spoken natural language interaction is the most difficult to implement, we discuss it extensively in the section Natural Language Interaction.
Hands/Eyes-Busy Tasks
The classic situation favoring spoken interaction with machines is one in which the user's hands and/or eyes are busy performing some other task. In such circumstances, by using voice to communicate with the machine, people are free to pay attention to their task rather than breaking away to use a keyboard. For instance, wire installers who spoke a wire's serial number and were then guided verbally by the computer to install that wire achieved a 20-30% speedup in productivity, with improved accuracy and lower training time, over their prior manual method of wire identification and installation (1). Although individual field studies are rarely conclusive, many field studies of highly accurate speech recognition systems with hands/eyes-busy tasks have found that spoken input leads to higher task productivity and accuracy.
Other hands/eyes-busy applications that have benefited from voice interaction include data entry and machine control in factories and field applications (2), access to information for military command-and-control, cockpit management (3, 4), astronauts' information management during extra-vehicular access in space, dictation of medical diagnoses, maintenance and repair