Automatic Violence Scenes Detection: A multi-modal approach
Authors
Abstract
In this working note, we propose a set of features and a classification scheme for automatically detecting violent scenes in movies. The features are extracted from the audio, video, and subtitle modalities of the movies. For violent scene classification, we found the following features relevant: short-time audio energy, motion component, and shot words rate. We classified the shots as violent or non-violent using naïve Bayes, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA) classifiers, aiming to maximize the precision of the detection in the first two minutes of retrieved content.
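As a rough illustration of the pipeline the abstract outlines, the sketch below computes short-time audio energy over fixed-length frames and trains a small Gaussian naïve Bayes classifier on per-shot feature vectors. It is not the authors' implementation: the function and class names, the frame and hop sizes, and the toy feature values (audio energy, motion, words rate) are all assumptions made up for this example, and the LDA and QDA variants are not shown.

```python
import math


def short_time_energy(samples, frame_len, hop):
    """Mean squared amplitude of each complete frame of a mono signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


class GaussianNaiveBayes:
    """Hypothetical minimal classifier: one independent Gaussian per feature and class."""

    def fit(self, X, y):
        self.stats = {}
        n = len(y)
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            columns = list(zip(*rows))
            means = [sum(col) / len(rows) for col in columns]
            # A small constant keeps the variance strictly positive.
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (math.log(len(rows) / n), means, variances)
        return self

    def log_posterior(self, x, label):
        """Unnormalized log posterior: log prior plus summed Gaussian log-likelihoods."""
        log_prior, means, variances = self.stats[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return log_prior + log_likelihood

    def predict(self, x):
        return max(self.stats, key=lambda label: self.log_posterior(x, label))


# Invented per-shot features: [audio energy, motion, words rate]; 1 = violent.
X_train = [
    [0.9, 0.8, 0.2], [0.8, 0.9, 0.1], [0.7, 0.7, 0.3],
    [0.2, 0.1, 0.8], [0.1, 0.2, 0.9], [0.3, 0.2, 0.7],
]
y_train = [1, 1, 1, 0, 0, 0]

model = GaussianNaiveBayes().fit(X_train, y_train)
loud_shot = model.predict([0.85, 0.8, 0.2])
quiet_shot = model.predict([0.15, 0.15, 0.85])
```

Ranking shots by `log_posterior` for the violent class, rather than taking the hard `predict` label, would be one way to order retrieved content when precision at the top of the list is the target.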
Related resources
A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images
The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...
Automatic multi-modal dialogue scene indexing
An automatic algorithm for indexing dialogue scenes in multimedia content is proposed. The content is segmented into dialogue scenes using the state transitions of a hidden Markov model (HMM). Each shot is classified using both audio and visual information to determine the state/scene transitions for this model. Face detection and silence/speech/music classification are the basic tools which ar...
Damage detection of multi-girder bridge superstructure based on the modal strain approaches
The research described in this paper focuses on the application of modal strain techniques on a multi-girder bridge superstructure with the objectives of identifying the presence of damage and detecting false damage diagnosis for such structures. The case study is a one-third scale model of a slab-on-girder composite bridge superstructure, comprised of a steel-free concrete deck with FRP rebars...
Automatic annotation of unique locations from video and text
Given a video and associated text, we propose an automatic annotation scheme in which we employ a latent topic model to generate topic distributions from weighted text and then modify these distributions based on visual similarity. We apply this scheme to location annotation of a television series for which transcripts are available. The topic distributions allow us to avoid explicit classifica...
Online multiple people tracking-by-detection in crowded scenes
Multiple people detection and tracking is a challenging task in real-world crowded scenes. In this paper, we have presented an online multiple people tracking-by-detection approach with a single camera. We have detected objects with deformable part models and a visual background extractor. In the tracking phase we have used a combination of support vector machine (SVM) person-specific classifie...