Using Avatars for Improving Speaker Identification in Captioning
Abstract
Captioning is the main way that people who are deaf or hard of hearing access television and film content. One major difficulty consistently identified by the community is knowing who is speaking, particularly when the speaker is an off-screen narrator. A captioning system was created using a participatory design method to improve speaker identification. The final prototype used avatars and a coloured border to identify specific speakers. Evaluation results were very positive; however, participants also wanted to customize various components, such as caption and avatar location.
Similar Resources
Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs
A novel approach to live captioning through re-speaking is introduced in this paper. We describe our concept of re-speaking using only one re-speaker, with enhanced re-speaker tasks fully integrated into the recognition system and captioning software. New techniques for instant correction of recognition output, punctuation mark introduction or new word addition are presented. Our real-time reco...
Online TV Captioning of Czech Parliamentary Sessions
In this paper we introduce the online captioning system developed by our teams and used by Czech Television (CTV), the public service broadcaster in the Czech Republic. The research project is aimed at incorporating speech technologies into the CTV environment. One of the key missions is the development of a captioning system that supports captioning of a “live” acoustic track. It can be e...
Text Independent Speaker Identification Using Automatic Acoustic Segmentation
This paper describes an acoustic class dependent technique for text independent speaker identification on very short utterances. The technique is based on maximum likelihood estimation of a Gaussian mixture model representation of speaker identity. Gaussian mixtures are noted for their robustness as a parametric model and their ability to form smooth estimates of rather arbitrary underlying den...
Captioning of Live TV Commentaries from the Olympic Games in Sochi: Some Interesting Insights
In this paper, we describe our effort and some interesting insights obtained while captioning more than 70 hours of live TV broadcasts from the Olympic Games in Sochi. The closed captioning was prepared for ČT Sport, the sport channel of the public service broadcaster in the Czech Republic. We will briefly discuss our solution for distributed captioning architecture on live TV programs using r...
Improving speaker identification performance in reverberant conditions using lip information
This paper considers the improvement of speaker identification performance in reverberant conditions using additional lip information. Automatic speaker identification (ASI) using speech characteristics alone can be highly successful; however, problems occur with mismatches between training and testing conditions. In particular, we find that ASI performance drops dramatically when given anechoic...