Search results for: crawler

Number of results: 1856

Journal: Journal of Physics: Conference Series 2020

Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control) 2011

2006
Wang Xuan Kan Min Yen Nguyen Thi Hoang

2014
Priyanka Singla Rakesh Batra

A focused crawler aims to select web pages from the internet that are relevant to some predefined topics. Previous focused crawlers do not keep track of user interests and goals. The topic weight table is calculated only once, statically, which makes it less sensitive to potential changes in the environment. To address this problem we design a focused crawler based o...

2009
Krisztian Balog Ian Soboroff Peter Bailey Arjen P. de Vries

The collection consists of all the *.csiro.au (public) websites as they appeared in March 2007. The resulting data set consists of 370 715 documents, with total size 4.2 gigabytes. The web crawler visited the outward-facing pages of CSIRO in a fashion similar to the crawl used in CSIRO’s own search engine. In fact, the same crawler technology that CSIRO uses was used to gather the CSIRO documen...

2003
Wang Lam Hector Garcia-Molina

Web crawlers generate significant loads on Web servers, and are difficult to operate. Instead of repeatedly running crawlers at many “client” sites, we propose a central crawler and Web repository that multicasts appropriate subsets of the central repository, and their subsequent changes, to subscribing clients. Loads at Web servers are reduced because a single crawler visits the servers, as op...

2010
Zhixing GAO Kunhui LIN

Distributed Web crawlers have recently received more and more attention from researchers. Centralized solutions are known to have problems such as link congestion and being a single point of failure, while fully distributed crawlers have become an interesting architectural paradigm for their scalability and the increased autonomy of nodes. This paper provides a distributed crawler system which consists of mult...

2008
Shervin Daneshpajouh Mojtaba Mohammadi Nasiri Mohammad Ghodsi

In this paper, we present a new and fast algorithm for generating the seed set for web crawlers. A typical crawler normally starts from a fixed set like DMOZ links, and then continues crawling from URLs which are found in these web pages. Crawlers are supposed to download more good pages in fewer iterations. Crawled pages are good if they have high PageRanks and are from different communities. I...
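
The seed-then-follow behaviour this abstract describes (start from a fixed seed set, then continue from URLs found on fetched pages) can be sketched roughly as a breadth-first traversal. This is only an illustrative sketch and not the paper's seed-selection algorithm: the names `crawl`, `fetch_links`, `toy_fetch_links` and the in-memory `TOY_WEB` graph are hypothetical, and no real network access is performed.

```python
from collections import deque

def crawl(seeds, fetch_links, max_pages=100):
    """Breadth-first crawl: visit the seeds, then the URLs found on visited pages.

    `fetch_links` is a caller-supplied function (an assumption of this sketch)
    that returns the list of URLs linked from a given URL.
    """
    frontier = deque(dict.fromkeys(seeds))  # de-duplicated seeds, order kept
    seen = set(frontier)                    # URLs already queued or visited
    visited = []                            # crawl order
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

# Hypothetical toy link graph standing in for the web.
TOY_WEB = {
    "a.example/": ["a.example/1", "b.example/"],
    "a.example/1": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
}

def toy_fetch_links(url):
    """Look up outgoing links in the toy graph instead of fetching a page."""
    return TOY_WEB.get(url, [])

print(crawl(["a.example/"], toy_fetch_links))
```

With a single seed, every page reachable from it is visited once; which pages are reached within a given `max_pages` budget depends entirely on the seed set, which is why seed selection matters.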

2006
Daniel Gomes Mário J. Silva

This report discusses architectural aspects of web crawlers and details the design, implementation and evaluation of the Viuva Negra (VN) crawler. VN has been used for 4 years, feeding a search engine and an archive of the Portuguese web. In our experiments it crawled over 2 million documents per day, corresponding to 63 GB of data. We describe situations hazardous to crawling found on the web ...

Chart: number of search results per year
