Memory Warps for Learning Long-Term Online Video Representations
Abstract
This paper proposes a novel memory-based online video representation that is efficient, accurate and predictive. This is in contrast to prior works that often rely on computationally heavy 3D convolutions, ignore actual motion when aligning features over time, or operate offline in order to use future frames. In particular, our memory (i) holds the feature representation, (ii) is spatially warped over time to compensate for observer and scene motion, (iii) can carry long-term information, and (iv) enables predicting feature representations in future frames. By exploring a variant that operates at multiple temporal scales, we efficiently learn across even longer time horizons. We apply our online framework to object detection in videos, obtaining a 2.3x speed-up while losing only 0.9% mAP on the ImageNet-VID dataset compared to prior works that even use future frames. Finally, we demonstrate the predictive property of our representation in two novel detection setups, where features are propagated over time to (i) improve a real-time detector by more than 10% mAP in a multi-threaded online setup and (ii) anticipate objects in future frames.
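The idea of a memory that is spatially warped over time and then refreshed with the current frame's features can be illustrated with a small sketch. This is not the paper's implementation: the function names `warp_memory` and `update_memory`, the single-channel list-of-lists feature map, the bilinear backward warp, and the fixed blending weight `alpha` are all assumptions made for illustration. The abstract does not say how the motion field is obtained or how old and new features are combined.

```python
from math import floor

def warp_memory(memory, flow):
    """Backward-warp a 2D feature map with bilinear sampling.

    Assumed convention: output[y][x] samples memory at (y + dy, x + dx),
    where (dy, dx) = flow[y][x].
    """
    h, w = len(memory), len(memory[0])
    warped = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            # Clamp the source location so samples stay inside the map.
            sy = min(max(y + dy, 0.0), h - 1.0)
            sx = min(max(x + dx, 0.0), w - 1.0)
            y0, x0 = int(floor(sy)), int(floor(sx))
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = sy - y0, sx - x0
            # Bilinear interpolation between the four neighbouring cells.
            top = (1 - wx) * memory[y0][x0] + wx * memory[y0][x1]
            bottom = (1 - wx) * memory[y1][x0] + wx * memory[y1][x1]
            warped[y][x] = (1 - wy) * top + wy * bottom
    return warped

def update_memory(memory, flow, features, alpha=0.5):
    """Align the old memory to the current frame, then blend in new features.

    alpha is a hypothetical fixed blending weight, used here only to keep
    the sketch short.
    """
    warped = warp_memory(memory, flow)
    return [
        [(1 - alpha) * m + alpha * f for m, f in zip(m_row, f_row)]
        for m_row, f_row in zip(warped, features)
    ]
```

With a flow of `(0, -1)` at every cell, each output location reads the memory one cell to its left, so stored content shifts one cell to the right before it is blended with the new features. Applying `warp_memory` alone, without fresh features, is one simple way to picture propagating a representation to a frame that has not been processed yet.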
Similar resources
The effect of intrahippocampal microinjection of Naloxone on short-term and long-term memory in adult male rats
Introduction: The hippocampus is one of the major centers of learning and memory. The role of the opioid system has been investigated, and receptors related to this system, such as mu-opioid receptors (MOR), are widespread in the hippocampus. In this study the effect of Naloxone administration as a mu-opioid receptor antagonist on passive avoidance memory in adult male rats was i...
Unsupervised Learning of Video Representations using LSTMs
We use multilayer Long Short Term Memory (LSTM) networks to learn representations of video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed length representation. This representation is decoded using single or multiple decoder LSTMs to perform different tasks, such as reconstructing the input sequence, or predicting the future sequence. We experiment with two kind...
P3: Mechanisms of TrkB-Mediated Hippocampal Long-Term Potentiation in Learning and Memory
Long-term potentiation (LTP) is a process in which certain types of synaptic stimulation lead to a long-lasting enhancement in the strength of synaptic transmission. Studies in recent years indicate the importance of molecular pathways in the development of memory and learning. Tropomyosin receptor kinase B (TrkB) is a member of the neurotrophin receptor tyrosine kinase family, whose ligand is b...
Word Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction
The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in short-term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...
Rapid Online Learning of Objects in a Biologically Motivated Recognition Architecture
We present an approach for the supervised online learning of object representations based on a biologically motivated architecture of visual processing. We use the output of a recently developed topographical feature hierarchy to provide a view-based representation of three-dimensional objects using a dynamical vector quantization approach. For a simple short-term object memory model we demonstr...