presented at a colloquium entitled "Human-Machine Communication by Voice," organized
Abstract
This paper discusses some of the aspects of task requirements, user expectations, and technological capabilities that influence the design of a voice interface and then identifies several components of user interfaces that are particularly critical in successful voice applications. Examples from several applications are provided to demonstrate how these components are used to produce effective voice interfaces.
As speech synthesis and speech recognition technologies improve, applications requiring more complex and more natural human-machine interactions are becoming feasible. A well-designed user interface is critical to the success of these applications. A carefully crafted user interface can overcome many of the limitations of current technology to produce a successful outcome from the user's point of view, even when the technology works imperfectly. With further technological improvements, the primary role of the user interface will gradually shift from adapting the user's input to fit the limitations of the technology to facilitating interactive dialogue between human and machine by recognizing and providing appropriate conversational cues. Among the factors that must be considered in designing voice interfaces are (i) the task requirements of the application, (ii) the capabilities and limitations of the technology, and (iii) the characteristics of the user population. This paper discusses how these factors influence user interface design and then describes components of user interfaces that can be used to facilitate efficient and effective human-machine voice-based interactions.
Voice interfaces provide an additional input and output modality for human-computer interactions, either as a component of a multimodal, multimedia system or when other input and output modalities are occupied, unavailable, or not usable by the human (e.g., for users with visual or motor disabilities).
One motivation for human-computer interaction by voice is that voice interfaces are considered "more natural" than other types of interfaces (e.g., keyboard, mouse, touch screen). That is, speech interfaces can provide a "look and feel" that is more like communication between humans. The underlying assumption is that by presenting this "more natural" interface to the user the system can take advantage of skills and expectations that the user has developed through everyday communicative experiences to create a more efficient and effective transfer of information between human and machine (1). A successful human-machine interaction, like a successful human-human interaction, is one that accomplishes the task at hand efficiently and easily from the human's perspective. However, current human-computer voice-based interactions do not yet match the richness, complexity, accuracy, or reliability achieved in most human-human interactions either for speech input [i.e., automatic speech recognition (ASR) or speech understanding] or for speech output (digitized or synthetic speech). This deficit is due only in part to imperfect speech technology. Equally important is the fact that, while current automated systems may contain sufficient domain knowledge about an application, they do not sufficiently incorporate other kinds of knowledge that facilitate collaborative interactions. Typically, an automated system is limited both in linguistic and conceptual knowledge. Furthermore, automated systems using voice interfaces also have an impoverished appreciation of conversational dynamics, including the use of prosodic cues to appropriately maintain turn taking and the use of confirmation protocols to establish coherence between the participants. A well-designed voice interface can alleviate the effects of these deficiencies by structuring the interaction to maximize the probability of successfully accomplishing the task. 
Where technological limitations prohibit the use of natural conversational speech, the primary role of the interface is to induce the user to modify his/her behavior to fit the requirements of the technology. As voice technologies become capable of dealing with more natural input, the user interface will still be critical for facilitating the smooth flow of information between the user and the system by providing appropriate conversational cues and feedback. Well-designed user interfaces are essential to successful applications; a poor user interface can render a system unusable. Designing an effective user interface for a voice application involves consideration of (i) the information requirements of the task, (ii) the limitations and capabilities of the voice technology, and (iii) the expectations, expertise, and preferences of the user. By understanding these factors, the user interface designer can anticipate some of the difficulties and incompatibilities that will affect the success of the application and design the interaction to minimize their impact. For optimal results, user interface design must be an integral and early component in the overall design of a system. User interface design and implementation are most successful as an iterative process, with interfaces tested empirically on groups of representative users, then revised as deficiencies are detected and corrected, and then retested, until system performance is stable and satisfactory.
USER INTERFACE CONSIDERATIONS
By considering the interdependencies among the task demands of the application, the needs and expectations of the user population, and the capabilities of the technology, an interface designer can more accurately define a control flow for the application that will optimally guide the user through the interaction and handle the errors in communication in a way that facilitates task completion.
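As an illustration only, one way such a control flow might handle communication errors is to accept a confident recognition result directly, ask for explicit confirmation when the result is uncertain, and reprompt with more guidance when it is rejected. The sketch below is not taken from the paper: the `recognize` callable, the two confidence thresholds, the prompt wording, and the attempt limit are all assumptions chosen for the example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical recognizer signature (an assumption, not a real ASR API):
# it plays a prompt and returns (recognized_text, confidence in [0, 1]).
Recognizer = Callable[[str], Tuple[str, float]]


def collect_item(
    recognize: Recognizer,
    prompt: str,
    accept_at: float = 0.8,     # assumed: accept without confirmation
    reject_below: float = 0.4,  # assumed: treat the input as not understood
    max_attempts: int = 3,      # assumed: give up after this many tries
) -> Optional[str]:
    """Ask for one item, confirming or reprompting when recognition is uncertain."""
    for attempt in range(max_attempts):
        # After a failed attempt, acknowledge the problem before asking again.
        current = prompt if attempt == 0 else f"Sorry, I did not understand. {prompt}"
        text, confidence = recognize(current)

        if confidence >= accept_at:
            return text  # High confidence: accept and move on.

        if confidence >= reject_below:
            # Medium confidence: ask an explicit yes/no confirmation question.
            answer, answer_conf = recognize(
                f"Did you say {text}? Please say yes or no."
            )
            if answer.strip().lower() == "yes" and answer_conf >= reject_below:
                return text

        # Low confidence or a rejected confirmation: loop and reprompt.

    # No usable input: the caller can fall back to an operator or another modality.
    return None
```

A caller would pass its own recognizer, for example `collect_item(my_recognizer, "Which city are you calling?")`, and treat a `None` result as the signal to fall back. In practice the thresholds would be tuned empirically on representative users, in line with the iterative design process described above.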