Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles
Abstract
In this paper we study the problem of image representation learning without human annotation. Following the principles of self-supervision, we build a convolutional neural network (CNN) that can be trained to solve Jigsaw puzzles as a pretext task, which requires no manual labeling, and then later repurposed to solve object classification and detection. To maintain compatibility across tasks we introduce the context-free network (CFN), a Siamese-ennead CNN. The CFN takes image tiles as input and explicitly limits the receptive field (or context) of its early processing units to one tile at a time. We show that the CFN is a more compact version of AlexNet, but with the same semantic learning capabilities. By training the CFN to solve Jigsaw puzzles, we learn both a feature mapping of object parts and their correct spatial arrangement. Our experimental evaluations show that the learned features capture semantically relevant content. The object detection performance of features extracted from the CFN is the highest (51.8%) among features trained without supervision, and very close to that of features trained with supervision (56.5%). In object classification the CFN features also achieve the best accuracy (38.1%) among features trained without supervision on the ImageNet 2012 dataset.
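The pretext task described in the abstract can be illustrated with a minimal sketch of how one training sample might be built. It assumes a 3x3 grid (nine tiles, matching the "ennead" in the network name) and a small fixed set of tile permutations whose index serves as the classification label. The function names, the toy image, and the size of the permutation set are hypothetical choices made for illustration; the abstract does not specify them.

```python
import random

GRID = 3  # 3x3 grid gives nine tiles, one per branch of a nine-branch network


def split_into_tiles(image, grid=GRID):
    """Split a 2D image (list of rows) into grid*grid tiles in row-major order."""
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // grid, width // grid
    tiles = []
    for gy in range(grid):
        for gx in range(grid):
            tile = [row[gx * tile_w:(gx + 1) * tile_w]
                    for row in image[gy * tile_h:(gy + 1) * tile_h]]
            tiles.append(tile)
    return tiles


def make_permutation_set(count, n_tiles=GRID * GRID, seed=0):
    """Build a fixed set of distinct tile permutations (hypothetical choice)."""
    rng = random.Random(seed)
    perms = set()
    while len(perms) < count:
        perm = list(range(n_tiles))
        rng.shuffle(perm)
        perms.add(tuple(perm))
    return sorted(perms)


def make_puzzle(image, permutations, rng):
    """Return (shuffled_tiles, label); label is the index of the permutation used."""
    tiles = split_into_tiles(image)
    label = rng.randrange(len(permutations))
    shuffled = [tiles[i] for i in permutations[label]]
    return shuffled, label


# Toy 6x6 "image" whose pixel values encode their own position.
image = [[y * 6 + x for x in range(6)] for y in range(6)]
permutations = make_permutation_set(count=10)
shuffled_tiles, label = make_puzzle(image, permutations, random.Random(1))
```

In this sketch no human annotation is needed: the label comes for free from the shuffling step. A network would receive each shuffled tile separately, so that its early layers see only one tile at a time, and would be trained to predict `label`.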
Similar Resources
Learning Image Representations by Completing Damaged Jigsaw Puzzles
In this paper, we explore methods of complicating self-supervised tasks for representation learning. That is, we do severe damage to data and encourage a network to recover them. First, we complicate each of three powerful self-supervised task candidates: jigsaw puzzle, inpainting, and colorization. In addition, we introduce a novel complicated self-supervised task called “Completing damaged jig...
Jigsaw: the unsupervised construction of spatial representations
A fundamental assumption in machine vision is that the spatial arrangement of pixels is given. In challenging this assumption we have utilised a general relationship that exists between space and behaviour. This relationship presents itself as spatial redundancy, which other researchers have considered problematic. We present a mathematical model and empirical investigations into this relations...
Jigsaw Puzzles As Cognitive Enrichment (PACE) - the effect of solving jigsaw puzzles on global visuospatial cognition in adults 50 years of age and older: study protocol for a randomized controlled trial
BACKGROUND: Neurocognitive disorders are an important societal challenge and the need for early prevention is increasingly recognized. Meta-analyses show beneficial effects of cognitive activities on cognition. However, high financial costs, low intrinsic motivation, logistic challenges of group-based activities, or the need to operate digital devices prevent their widespread application in clin...
No Easy Puzzles: A Hardness Result for Jigsaw Puzzles
We show that solving jigsaw puzzles requires Θ(n^2) edge matching comparisons, making them as hard as their trivial upper bound. This result generalises to puzzles of all shapes, and is applicable to both pictorial and apictorial puzzles.
Robust Sex Differences in Jigsaw Puzzle Solving—Are Boys Really Better in Most Visuospatial Tasks?
Sex differences are consistently reported in different visuospatial tasks, with men usually performing better in mental rotation tests while women are better on tests for memory of object locations. In the present study, we investigated sex differences in solving jigsaw puzzles in children. In total 22 boys and 24 girls were tested using a custom-built tablet application representing a jigsaw puzz...