Search results for: crawler

Number of results: 1856  

Journal: CoRR 2014
Prashant Dahiwale Mukesh M. Raghuwanshi Latesh G. Malik

The majority of computer or mobile phone enthusiasts use the web for searching. Web search engines are used for this searching; the results that the search engines return are provided to them by a software module known as the Web Crawler. The size of the web is increasing round the clock. The principal problem is to search this huge database for specific information. To state whethe...

Journal: ISPRS Int. J. Geo-Information 2016
Dongyang Hou Jun Chen Hao Wu

Automatic discovery of isolated land cover web map services (LCWMSs) can potentially help in sharing land cover data. Currently, various search engine-based and crawler-based approaches have been developed for finding services dispersed throughout the surface web. In fact, with the prevalence of geospatial web applications, a considerable number of LCWMSs are hidden in JavaScript code, which be...

2014
David C. Wyld Prashant Dahiwale M M Raghuwanshi Latesh Malik

The majority of computer or mobile phone enthusiasts use the web for searching. Web search engines are used for this searching; the results that the search engines return are provided to them by a software module known as the Web Crawler. The size of the web is increasing round the clock. The principal problem is to search this huge database for specific information. To state whethe...

2014
Raphael do Vale Amaral Gomes Marco A. Casanova Giseli Rabello Lopes Luiz André P. Paes Leme

The Linked Data best practices recommend publishers of triplesets to use well-known ontologies in the triplication process and to link their triplesets with other triplesets. However, despite the fact that extensive lists of open ontologies and triplesets are available, most publishers typically do not adopt those ontologies and link their triplesets only with popular ones, such as DBpedia and ...

2003
Gautam Pant Shannon Bradshaw Filippo Menczer

Web crawlers have been used for nearly a decade as a search engine component to create and update large collections of documents. Typically the crawler and the rest of the search engine are not closely integrated. If the purpose of a search engine is to have as large a collection as possible to serve the general Web community, a close integration may not be necessary. However, if the search eng...

2006
Shisanu Tongchim Canasai Kruengkrai Virach Sornlertlamvanich Hitoshi Isahara

This paper proposes an idea for constructing a distributed web crawler by utilizing existing high-speed research networks. This is an initial effort of the Web Language Engineering (WLE) project which investigates techniques in processing the languages found in published web documents. In this paper, we focus on designing a geographically distributed web crawler. Multiple crawlers work collabor...

2004
Odysseas Papapetrou George Samaras

Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the distributed nature of current distributed crawlers is not fully utilized. The optimal benefits of this approach are usually limited to the sites hosting the crawler. In this work we describe IPMicra, a distributed location-aware web crawler that utilizes an IP a...

Journal: Softw., Pract. Exper. 2008
Daniel Gomes Mário J. Silva

This paper documents hazardous situations on the Web that crawlers must address. This knowledge was accumulated while developing and operating the Viúva Negra (VN) crawler to feed a search engine and a Web archive for the Portuguese Web for four years. The design, implementation and evaluation of the VN crawler are also presented as a case study of a Web crawler design. The case study tested pr...

2008
Lefteris Kozanidis

In this paper we present a novel approach for building a focused crawler. The goal of our crawler is to effectively identify web pages that relate to a set of predefined topics and download them regardless of their web topology or connectivity with other popular pages on the web. The main challenges that we address in our study concern the following. First we need to be able to effectively iden...
