نتایج جستجو برای: web page classification
تعداد نتایج: 749611 فیلتر نتایج به سال:
Authors evaluated supervised automatic classification algorithms for determination of health related web-page compliance with individual HONcode criteria of conduct using varying length character n-gram vectors to represent healthcare web page documents. The training/testing collection comprised web page fragments extracted by HONcode experts during the manual certification process. The authors...
The web is a large repository of information and to facilitate the search and retrieval of pages from it, categorization of web documents is essential. An effective means to handle the complexity of information retrieval from the internet is through automatic classification of web pages. Although lots of automatic classification algorithms and systems have been presented, most of the existing a...
The research reported in this paper is part of a larger project on the automatic classification of Web pages by their genres, using a distance function classification model. In this paper, we investigate the effect of several commonly used data preprocessing steps, explore the use of byte and word n-grams, and test our classification model on three Web page data sets. Our approach is to represe...
We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number of all text characters on a web page. K -means clustering is used to create unique thresholds to differentiate index pages and article pages on individual web sites. Index pages contain mostly links to articles and ot...
Web page classification has attracted increasing research interest. It is intrinsically a multi-view and semi-supervised application, since web pages usually contain two or more types of data, such as text, hyperlinks and images, and unlabeled pages are generally much more than labeled ones. Web page data is commonly high-dimensional. Thus, how to extract useful features from this kind of data ...
We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number of all text characters on a web page. K -means clustering is used to create unique thresholds to differentiate index pages and article pages on individual web sites. Index pages contain mostly links to articles and ot...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید