Video-Audio Domain Generalization via Confounder Disentanglement
Abstract
Existing video-audio understanding models are trained and evaluated in an intra-domain setting, facing performance degeneration in real-world applications where multiple domains and distribution shifts naturally exist. The key to video-audio domain generalization (VADG) lies in alleviating spurious correlations over multi-modal features. To achieve this goal, we resort to causal theory and attribute such correlation to confounders affecting both video-audio features and labels. We propose a DeVADG framework that conducts uni-modal and cross-modal deconfounding through back-door adjustment. DeVADG performs cross-modal disentanglement and obtains fine-grained confounders at both class-level and domain-level using half-sibling regression and unpaired domain transformation, which essentially identifies domain-variant factors and class-shared factors that cause spurious correlations between features and false labels. To promote VADG research, we collect a VADG-Action dataset for video-audio action recognition with 5,000 video clips across four domains (e.g., cartoon and game) and ten action classes (e.g., cooking and riding). We conduct extensive experiments, i.e., multi-source DG, single-source DG, and qualitative analysis, validating the rationality of our causal analysis and the effectiveness of the DeVADG framework.
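The back-door adjustment mentioned in the abstract follows the standard identity P(y | do(x)) = Σ_z P(y | x, z) P(z), where z is an observed confounder. The sketch below illustrates that identity on a discrete toy problem; it is not the DeVADG implementation. The function name `backdoor_adjust`, the binary variables, and all probability values are hypothetical, chosen so that z (think of it as a domain) drives both the feature pattern x and the label y while x itself has no causal effect on y.

```python
import numpy as np


def backdoor_adjust(p_y_given_xz, p_z):
    """Back-door adjustment: P(y | do(x)) = sum_z P(y | x, z) P(z).

    p_y_given_xz: array of shape (n_x, n_z, n_y)
    p_z:          array of shape (n_z,)
    Returns an array of shape (n_x, n_y).
    """
    p_y_given_xz = np.asarray(p_y_given_xz, dtype=float)
    p_z = np.asarray(p_z, dtype=float)
    return np.einsum("xzy,z->xy", p_y_given_xz, p_z)


# Hypothetical toy numbers (assumptions, not taken from the paper).
p_z = np.array([0.5, 0.5])                      # P(z)
p_x_given_z = np.array([[0.9, 0.1],             # P(x | z=0)
                        [0.1, 0.9]])            # P(x | z=1)
# P(y=1 | x, z) depends only on z, so x has no causal effect on y.
p_y1_given_xz = np.array([[0.2, 0.8],           # x=0: z=0, z=1
                          [0.2, 0.8]])          # x=1: z=0, z=1
p_y_given_xz = np.stack([1 - p_y1_given_xz, p_y1_given_xz], axis=-1)

# Observational conditional P(y | x) = sum_z P(y | x, z) P(z | x).
p_xz = p_x_given_z.T * p_z                      # P(x, z), shape (n_x, n_z)
p_z_given_x = p_xz / p_xz.sum(axis=1, keepdims=True)
p_y_given_x = np.einsum("xzy,xz->xy", p_y_given_xz, p_z_given_x)

# Interventional distribution via back-door adjustment.
p_y_do_x = backdoor_adjust(p_y_given_xz, p_z)

print("P(y=1 | x):    ", p_y_given_x[:, 1])
print("P(y=1 | do(x)):", p_y_do_x[:, 1])
```

In this toy setting the observational P(y=1 | x) differs between x=0 and x=1 purely because of the confounder, whereas the adjusted P(y=1 | do(x)) is identical for both values of x, which is the kind of spurious correlation that deconfounding aims to remove.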
Similar resources
Domain Generalization via Invariant Feature Representation
This paper investigates domain generalization: How to take knowledge acquired from an arbitrary number of related domains and apply it to previously unseen domains? We propose Domain-Invariant Component Analysis (DICA), a kernel-based optimization algorithm that learns an invariant transformation by minimizing the dissimilarity across domains, whilst preserving the functional relationship betwe...
Finite-time disentanglement via spontaneous emission.
We show that under the influence of pure vacuum noise two entangled qubits become completely disentangled in a finite time, and in a specific example we find the time to be given by ln((2 + √2)/2) times the usual spontaneous lifetime.
Confounder selection via penalized credible regions.
When estimating the effect of an exposure or treatment on an outcome it is important to select the proper subset of confounding variables to include in the model. Including too many covariates increases mean square error on the effect of interest while not including confounding variables biases the exposure effect estimate. We propose a decision-theoretic approach to confounder selection and ef...
Video Abstraction in H.264/AVC Compressed Domain
Video abstraction allows searching, browsing and evaluating videos only by accessing the useful contents. Most of the studies are using pixel domain, which requires the decoding process and needs more time and process consuming than compressed domain video abstraction. In this paper, we present a new video abstraction method in H.264/AVC compressed domain, AVAIF. The method is based on the norm...
Temporal Generalization with Domain Generalization Graphs
This paper addresses the problem of using domain generalization graphs to generalize temporal data extracted from relational databases. A domain generalization graph associated with an attribute defines a partial order which represents a set of generalization relations for the attribute. We propose formal specifications for domain generalization graphs associated with calendar (date and time) att...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i12.26787