Retrieve Main Content using Vision-base Web Page Segmentation with Gomory-Hu Tree

Related resources

Identifying Informative Web Content Blocks using Web Page Segmentation

Information extraction has become an important task for discovering useful knowledge or information from the Web. A crawler system, which gathers information from the Web, is one of the fundamental necessities of information extraction. A search engine uses a crawler to crawl and index web pages. The search engine takes into account only the informative content for indexing. In addition to info...

Segmenting Webpage with Gomory-Hu Tree Based Clustering

We propose a novel web page segmentation algorithm based on finding the Gomory-Hu tree in a planar graph. The algorithm first distills visual and structural information from a web page to construct a weighted undirected graph, whose vertices are the leaf nodes of the DOM tree and whose edges represent the visible position relationships between vertices. Then it partitions the graph with the Gomor...
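The abstract is cut off before it describes the partitioning step, so the following is only a sketch of one common way to cluster with a cut tree, not necessarily the method of that paper: drop the k - 1 lightest tree edges and treat the remaining connected components as segments. The function name `cut_tree_clusters`, the vertex ids, and the edge weights are hypothetical and chosen purely for illustration.

```python
def cut_tree_clusters(n, tree_edges, k):
    """Split vertices 0..n-1 into k groups by dropping the k-1 lightest tree edges.

    tree_edges is a list of (u, v, weight) triples forming a tree on n vertices.
    """
    # Keep the heaviest edges; the k-1 lightest ones are the cuts we apply.
    by_weight = sorted(tree_edges, key=lambda e: e[2], reverse=True)
    kept = by_weight[: len(tree_edges) - (k - 1)]

    # Union-find over the kept edges yields the connected components.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for u, v, _ in kept:
        root[find(u)] = find(v)

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


# Hypothetical cut tree over four DOM leaf nodes (ids 0..3).
example_edges = [(1, 0, 4), (2, 3, 5), (3, 0, 3)]
print(cut_tree_clusters(4, example_edges, 2))
```

Here the lightest edge (weight 3) is removed, so the leaf nodes on either side of it end up in separate segments.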

Dynamic Gomory-Hu Tree Construction - fast and simple

A cut tree (or Gomory-Hu tree) of an undirected weighted graph G = (V, E) with n = |V| vertices encodes a minimum s-t-cut for each vertex pair {s, t} ⊆ V and can be iteratively constructed by n − 1 maximum flow computations. Cut trees solve the multiterminal network flow problem, which asks for the all-pairs maximum flow values in a network, and at the same time they represent n − 1 non-crossing, linearly independent cuts t...
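To make the "n − 1 maximum flow computations" concrete, here is a minimal static sketch (not the dynamic construction that paper discusses). It assumes a dense symmetric capacity matrix, uses Edmonds-Karp for the flows, and follows Gusfield's variant of the construction, which avoids contracting vertices. The names `min_cut`, `gomory_hu_tree`, and `tree_min_cut`, and the example graph, are illustrative choices.

```python
from collections import deque


def min_cut(capacity, s, t):
    """Edmonds-Karp max flow. Returns (cut value, set of vertices on the s side)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path from s to t.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            # No path left: everything the search reached is the s side of a min cut.
            return flow, {v for v in range(n) if parent[v] != -1}
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def gomory_hu_tree(capacity):
    """Gusfield's construction: n - 1 min-cut calls on the original graph.

    Returns (parent, weight): vertex v > 0 has tree edge (v, parent[v]) of
    weight[v]; vertex 0 is the root and is its own parent.
    """
    n = len(capacity)
    parent = [0] * n
    weight = [0] * n
    for s in range(1, n):
        t = parent[s]
        flow, side = min_cut(capacity, s, t)
        weight[s] = flow
        # Neighbours of t that fell on the s side now hang off s instead.
        for v in range(n):
            if v != s and v in side and parent[v] == t:
                parent[v] = s
        # If t's own parent is on the s side, s takes t's place in the tree.
        if parent[t] in side:
            parent[s] = parent[t]
            parent[t] = s
            weight[s] = weight[t]
            weight[t] = flow
    return parent, weight


def tree_min_cut(parent, weight, u, v):
    """Minimum u-v cut value: the lightest edge on the tree path between u and v."""
    best_to = {}  # ancestor of u -> lightest edge weight between u and that ancestor
    node, best = u, float("inf")
    while True:
        best_to[node] = best
        if parent[node] == node:  # reached the root
            break
        best = min(best, weight[node])
        node = parent[node]
    node, best = v, float("inf")
    while node not in best_to:
        best = min(best, weight[node])
        node = parent[node]
    return min(best, best_to[node])


# Example: a 4-cycle with edges 0-1 (3), 1-2 (1), 2-3 (4), 3-0 (2).
capacity = [
    [0, 3, 0, 2],
    [3, 0, 1, 0],
    [0, 1, 0, 4],
    [2, 0, 4, 0],
]
parent, weight = gomory_hu_tree(capacity)
print(tree_min_cut(parent, weight, 1, 2))  # 3 for this graph
```

Once the tree is built, any pairwise minimum cut value is a path query on n − 1 edges rather than a fresh flow computation.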

A personalized web page content filtering model based on segmentation

In view of the massive content explosion on the World Wide Web through diverse sources, it has become mandatory to have content filtering tools. The filtering of web page contents holds greater significance in cases of access by minors. Traditional web page blocking systems follow the Boolean methodology of either displaying the full page or blocking it completely. With the i...

A language independent web data extraction using vision based page segmentation algorithm

Web usage mining is a process of extracting useful information from server logs, i.e. users' history. It is a process of finding out what users are looking for on the internet. Some users might be looking at only textual data, whereas others might be interested in multimedia data. One would retrieve the data by copying it and pasting it into the relevant document. But this is t...

Journal

Journal title: International Journal of Computer Applications

Year: 2014

ISSN: 0975-8887

DOI: 10.5120/19006-0547