MILC2: A Multi-Layer Multi-Instance Learning Approach to Video Concept Detection
Authors
Abstract
Video is a kind of structured data with multi-layer (ML) information; for example, a shot consists of three layers: shot, keyframe, and region. Moreover, a multi-instance (MI) relation is embedded between consecutive layers. Both the ML structure and the MI relation are essential for video concept detection. The previous work [5] dealt with the ML structure and MI relation by constructing an MLMI kernel in which each layer is assumed to contribute equally. However, such an equal-weighting scheme can neither model the MI relation well nor handle the ambiguity propagation problem, i.e., the propagation of sub-layer label uncertainty through multiple layers, since it has been shown that different layers contribute differently to the kernel. In this paper, we propose a novel algorithm named MILC2 (Multi-Layer Multi-Instance Learning with Inter-layer Consistency Constraint) to tackle the ambiguity propagation problem: an inter-layer consistency constraint is explicitly introduced to measure the disagreement between layers, so that the MI relation is better modeled. The learning task is formulated in a regularization framework with three components: the hyper-bag prediction error, an inter-layer inconsistency measure, and the classifier complexity. We apply the proposed MILC2 to video concept detection on the TRECVID 2005 development corpus and report better performance than both the standard Support Vector Machine and the MLMI kernel methods.
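As a rough illustration only (not the exact formulation from the paper), the three-component objective described above could take the following form, where X_i denotes a hyper-bag (a shot with its keyframe and region layers), y_i its concept label, ℓ a prediction loss, Ω_cons an inter-layer inconsistency penalty, and λ1, λ2 assumed trade-off weights:

```latex
\min_{f \in \mathcal{H}} \;
\underbrace{\sum_{i=1}^{n} \ell\bigl(f(X_i),\, y_i\bigr)}_{\text{hyper-bag prediction error}}
\;+\; \lambda_1 \underbrace{\sum_{i=1}^{n} \Omega_{\mathrm{cons}}\bigl(f;\, X_i\bigr)}_{\text{inter-layer inconsistency}}
\;+\; \lambda_2 \underbrace{\|f\|_{\mathcal{H}}^{2}}_{\text{classifier complexity}}
```

Setting λ1 to zero would recover a formulation that ignores disagreement between layers, which is the behavior the inter-layer consistency constraint is meant to avoid.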
Similar papers
A transductive multi-label learning approach for video concept detection
In this paper, we address two important issues in video concept detection: the insufficiency of labeled videos and the multiple-labeling issue. Most existing solutions handle the two issues only separately. We propose an integrated approach that handles them together by presenting an effective transductive multi-label classification approach that simultaneously models the labeling c...
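As a generic illustration of the transductive idea (not the specific algorithm of the cited paper), the sketch below propagates multi-label annotations from labeled to unlabeled videos over a similarity graph; the Gaussian affinity, the parameters alpha and sigma, and the function name transductive_multilabel are assumptions introduced for this example.

```python
# Minimal sketch of graph-based transductive multi-label propagation over
# labeled + unlabeled video feature vectors (generic illustration only).
import numpy as np

def transductive_multilabel(X, Y, alpha=0.9, sigma=1.0, n_iter=50):
    """X: (n, d) features for all videos; Y: (n, k) 0/1 labels, zero rows for unlabeled."""
    # Gaussian affinity matrix with zeroed diagonal.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(1) + 1e-12)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Iterative propagation: spread label mass while pulling back to the seeds.
    F = Y.astype(float)
    for _ in range(n_iter):
        F = alpha * S @ F + (1 - alpha) * Y
    return F  # per-video, per-concept relevance scores
```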
MSRA-USTC-SJTU at TRECVID 2007: High-Level Feature Extraction and Search
This paper describes the MSRA-USTC-SJTU experiments for TRECVID 2007. We participated in the high-level feature extraction and automatic search tasks. For high-level feature extraction, we investigated the benefit of unlabeled data through semi-supervised learning, the multi-layer (ML) multi-instance (MI) relation embedded in video through the MLMI kernel, as well as the correlations between con...
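The sketch below shows one plausible way to build a nested set kernel over the shot/keyframe/region hierarchy, in the spirit of an MLMI kernel; the averaging scheme, the layer weights w_kf and w_region, and the RBF parameter gamma are assumptions and are not taken from [5] or from the system described above.

```python
# Rough sketch of a nested set kernel over shot -> keyframe -> region layers
# (illustrative only; an actual MLMI kernel may combine layers differently).
import numpy as np

def rbf(x, y, gamma=0.5):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def keyframe_kernel(kf_a, kf_b, gamma=0.5):
    """kf_a, kf_b: lists of region feature vectors; average pairwise region similarity."""
    return np.mean([rbf(r, s, gamma) for r in kf_a for s in kf_b])

def shot_kernel(shot_a, shot_b, w_kf=0.5, w_region=0.5, gamma=0.5):
    """shot_*: list of keyframes, each a list of region vectors.
    Combines a keyframe-level set kernel and a region-level set kernel
    with (assumed) layer weights w_kf and w_region."""
    kf_sim = np.mean([keyframe_kernel(a, b, gamma) for a in shot_a for b in shot_b])
    all_a = [r for kf in shot_a for r in kf]
    all_b = [r for kf in shot_b for r in kf]
    region_sim = np.mean([rbf(r, s, gamma) for r in all_a for s in all_b])
    return w_kf * kf_sim + w_region * region_sim
```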
Multi-Modal Multiple-Instance Learning and Attribute Discovery with the Application to the Web Violent Video Detection
Along with the ever-growing web, violent video sharing on the Internet has interfered with our daily life and affected our health, especially that of children. Therefore, violent video recognition is becoming important for web content filtering. In this paper, we classify videos as violent or nonviolent using a Multi-Modal Multiple-Instance Learning and Attribute Discovery approach by combining ...
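A minimal sketch of the multi-modal multiple-instance idea follows: each modality contributes a bag of instance features, a per-modality scorer is max-pooled over the bag, and the modality scores are fused; the linear scorers, the max-pooling, and the fusion weights are illustrative assumptions rather than the cited method.

```python
# Toy sketch of multi-modal multiple-instance scoring for a video.
import numpy as np

def bag_score(instances, w, b=0.0):
    """Max-pool a linear instance scorer over a bag (MIL assumption:
    a bag is positive if at least one instance is positive)."""
    return max(float(x @ w + b) for x in instances)

def fuse_modalities(bags_by_modality, scorers, fusion_weights):
    """bags_by_modality: {name: list of feature vectors};
    scorers: {name: (w, b)}; fusion_weights: {name: float}."""
    total = 0.0
    for name, bag in bags_by_modality.items():
        w, b = scorers[name]
        total += fusion_weights[name] * bag_score(bag, w, b)
    return total  # threshold this score to decide violent vs. non-violent
```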
Feature-based Malicious URL and Attack Type Detection Using Multi-class Classification
Nowadays, malicious URLs are a common threat to businesses, social networks, net-banking, etc. Existing approaches have focused on binary detection, i.e., whether a URL is malicious or benign. Very little literature addresses the detection of malicious URLs together with their attack types. Hence, it becomes necessary to know the attack type and adopt an effective countermeasure. This pa...
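To make the feature-based, multi-class setup concrete, here is a hedged sketch using simple lexical URL features and an off-the-shelf classifier; the feature list, the class labels, and the choice of a random forest are assumptions, not the cited paper's feature set or model.

```python
# Small sketch of feature-based multi-class URL classification
# (e.g., benign / phishing / malware as example attack-type classes).
from urllib.parse import urlparse
from sklearn.ensemble import RandomForestClassifier

def url_features(url):
    p = urlparse(url)
    return [
        len(url),                                 # overall URL length
        len(p.netloc),                            # host length
        url.count('.'),                           # number of dots
        url.count('-'),                           # number of hyphens
        url.count('/'),                           # rough path depth
        int(any(c.isdigit() for c in p.netloc)),  # digits in host
        int('@' in url),                          # presence of '@'
        int(p.scheme == 'https'),                 # uses HTTPS
    ]

def train_url_classifier(urls, attack_types):
    X = [url_features(u) for u in urls]
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, attack_types)                      # attack_types: list of class names
    return clf
```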
Exploring multi-modality structure for cross domain adaptation in video concept annotation
Domain-adaptive video concept detection and annotation has recently received significant attention, but in existing video adaptation processes all features are treated as one modality, while multi-modality, a unique and important property of video data, is typically ignored. To fill this gap, we propose a novel approach, named multi-modality transfer based on multi-graph optimization ...
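As a loose illustration of the multi-graph idea (not the cited optimization method), the sketch below builds one Gaussian similarity graph per modality and fuses them into a single normalized graph Laplacian; the mixing weights and the graph construction are assumptions for this example.

```python
# Sketch: one similarity graph per modality, fused into a normalized Laplacian.
import numpy as np

def fused_laplacian(modality_features, weights, sigma=1.0):
    """modality_features: list of (n, d_m) arrays, one per modality;
    weights: per-modality mixing weights. Returns L = I - D^{-1/2} W D^{-1/2}."""
    n = modality_features[0].shape[0]
    W = np.zeros((n, n))
    for w, X in zip(weights, modality_features):
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        G = np.exp(-d2 / (2 * sigma ** 2))
        np.fill_diagonal(G, 0.0)
        W += w * G
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(1) + 1e-12)
    return np.eye(n) - W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
```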