In 2007, Naturel (Naturel, 2007) developed a method which, given a segmented video stream, associated a label with each segment. However, this method did not automatically check the accuracy of the results obtained. In this paper we propose to control these results, by taking each segment, and associating the corresponding phonetic or textual transcription of the soundtrack with descriptions ex...