Identifying Informative Web Content Blocks using Web Page Segmentation
Similar resources
Identifying Informative Web Content Blocks using Web Page Segmentation
Information Extraction has become an important task for discovering useful knowledge or information from the Web. A crawler, which gathers information from the Web, is one of the fundamental requirements of Information Extraction. A search engine uses a crawler to crawl and index web pages, and it takes into account only the informative content for indexing. In addition to info...
Recognising Informative Web Page Blocks Using Visual Segmentation for Efficient Information Extraction
As web sites are getting more complicated, the construction of web information extraction systems becomes more troublesome and time-consuming. A common theme is the difficulty in locating the segments of a page in which the target information is contained, which we call the informative blocks. This article reports on the Recognising Informative Page Blocks algorithm (RIPB), which is able to ide...
Identifying Content Blocks from Web Documents
Intelligent information processing systems, such as digital libraries or search engines, index web pages according to their informative content. However, web pages contain several kinds of non-informative content, e.g., navigation sidebars, advertisements, copyright notices, etc. It is very important to separate the informative “primary content blocks” from these non-informative blocks. In this paper, t...
WICE- Web Informative Content Extraction
With accelerated Internet development, a huge amount of data has been accumulated and stored on the Web. Web pages usually contain various contents, which may be relevant or irrelevant to the main topic. Extracting useful or relevant information from this mass of information becomes more complex and time-consuming. Identifying the useful data region is a significant problem for information extra...
A personalized web page content filtering model based on segmentation
In view of the massive content explosion on the World Wide Web through diverse sources, content filtering tools have become mandatory. Filtering the contents of web pages is of greater significance when the pages are accessed by minors. Traditional web page blocking systems follow a Boolean methodology of either displaying the full page or blocking it completely. With the i...
Journal
Journal title: International Journal of Applied Information Systems
Year: 2014
ISSN: 2249-0868
DOI: 10.5120/ijais14-451129