Functionality-Based Web Image Categorization
Authors
Abstract
The World Wide Web provides an increasingly powerful and popular publication mechanism. Web documents often contain many images that serve a variety of purposes. Identifying the functional categories of these images has important applications, including information extraction, web mining, web page summarization, and mobile access. This paper describes a study on the functional categorization of Web images using data collected from news web sites. We describe the image categories found in such web pages and their distributions, identify the main research issues involved in automatically classifying images into these categories, and present a novel algorithm for automatically identifying two of the most important image categories, namely story and preview images.
Similar Resources
Image flip CAPTCHA
Massive automated access to Web resources by robots has made it essential for Web service providers to determine whether the "user" is a human or a robot. A Human Interaction Proof (HIP) such as the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) offers a way to make such a distinction. CAPTCHA is a reverse Turing test used by Web serv...
Supervised Categorization of JavaScript™ Using Program Analysis Features
Web pages often embed scripts for a variety of purposes, including advertising and dynamic interaction. Understanding embedded scripts and their purpose can often help to interpret or provide crucial information about the web page. We have developed a functionality-based categorization of JavaScript, the most widely used web page scripting language. We then view understanding embedded scripts a...
Exploiting Privileged Information from Web Data for Image Categorization
Relevant and irrelevant web images collected by tag-based image retrieval have been employed as loosely labeled training data for learning SVM classifiers for image categorization by only using the visual features. In this work, we propose a new image categorization method by incorporating the textual features extracted from the surrounding textual descriptions (tags, captions, categories, etc....
Refining Image Categorization by Exploiting Web Images and General Corpus
Studies show that refining real-world categories into semantic subcategories contributes to better image modeling and classification. Previous image sub-categorization work relying on labeled images and WordNet’s hierarchy is not only labor-intensive, but also restricted to classifying images into NOUN subcategories. To tackle these problems, in this work, we exploit general corpus information to a...
Non-photographic Image Categorization
The rapid growth of the IT industry today has undoubtedly boosted the widespread use of computer images in both web pages and modern computer programs. Applications like online image search, automatic webpage summarization and web mining rely heavily on image categorization. This project presents a system that categorizes non-photographic images based on their textual and image features. The cor...