Search results for: crawler

Number of results: 1856

2007
Krishnan Suresh Sivaramakrishnan Kaveri

Focused crawling is an efficient mechanism for discovering resources of interest on the web. Link structure is an important property of the web that defines its content. This thesis describes FOCUS, a novel focused crawler that primarily uses the link structure of the web in its crawling strategy. It uses currently available search engine APIs, provided by Google, to construct a layered...

2015
Swati Agarwal Ashish Sureka

Online video sharing platforms such as YouTube contain several videos and users promoting hate and extremism. Due to the low barrier to publication and the anonymity it affords, YouTube is misused as a platform by some users and communities to post negative videos disseminating hatred against a particular religion, country, or person. We formulate the problem of identifying such malicious videos as a search...

2013
Tomasz Kusmierczyk Marcin Sydow

This paper concerns predicting the content of textual web documents based on features extracted from the web pages that link to them. It may be applied in an intelligent, keyword-focused web crawler. Experiments on publicly available real data obtained from the Open Directory Project, using several classification models, are promising and indicate the potential usefulness of the studied ap...

2008
Ioannis Partalas Georgios Paliouras Ioannis P. Vlahavas

Focused crawlers are programs that wander the Web, using its graph structure, and gather pages that belong to a specific topic. The most critical task in focused crawling is the scoring of URLs, as it determines the path that the crawler will follow, and thus its effectiveness. In this paper we propose a novel scheme for assigning scores to URLs, based on Reinforcement Learning (R...

1998
Soumen Chakrabarti Martin van den Berg Byron Dom

The rapid growth of the world-wide web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext information management system called a Focused Crawler. The goal of a focused crawler is to selectively seek out pages that are relevant to a pre-defined set of topics. The topics are specified not using keywords, but using exem...
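The selective page gathering that these focused-crawling abstracts describe is, at its core, a best-first traversal: a priority frontier of URLs ordered by estimated topic relevance. The following is a toy sketch of that idea, not the algorithm of any particular paper; the `fetch` and `score` callables are hypothetical stand-ins for a real page fetcher and relevance classifier:

```python
import heapq

def crawl_focused(seed_urls, fetch, score, max_pages=100):
    """Best-first focused crawl: always expand the frontier URL with the
    highest relevance score (negated, since heapq is a min-heap)."""
    frontier = [(-1.0, url) for url in seed_urls]  # seeds assumed relevant
    heapq.heapify(frontier)
    visited, harvested = set(), []
    while frontier and len(harvested) < max_pages:
        neg_score, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        text, links = fetch(url)              # hypothetical page fetcher
        harvested.append((url, -neg_score))   # record page with its score
        for link in links:
            if link not in visited:
                heapq.heappush(frontier, (-score(text, link), link))
    return harvested
```

Because the frontier is a priority queue rather than the FIFO of a breadth-first crawler, high-scoring branches are explored before low-scoring ones, which is what keeps the harvest concentrated on the target topic.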

2013
Aidin Parsakhoo Seyed Ataollah Hosseini Majid Lotfalian Hamid Jalilvand

The hourly operating cost of a machine is a suitable factor for analyzing the cost fluctuation of certain machinery in a changing environment and for finding economically feasible work concepts for the studied machine system. This paper, which is based on studies carried out in the northern forests of IR-Iran, analyzed and compared the costs of four skidding and excavation machines used in timber harvesting...

2010
Debashis Hati Amritesh Kumar A. Pal D. S. Tomar

The rapid growth of the World Wide Web (WWW) poses unprecedented scaling challenges for general-purpose crawlers. Crawlers are software programs that traverse the internet and retrieve web pages by following hyperlinks. The focused crawler of a special-purpose search engine aims to selectively seek out pages that are relevant to a pre-defined set of topics, rather than to explore all regions of the Web. Focu...

2010
Sybille Peters Claus-Peter Rückemann Wolfgang Sander-Beuermann

Search engines typically consist of a crawler which traverses the web retrieving documents and a search frontend which provides the user interface to the acquired information. Focused crawlers refine the crawler by intelligently directing it to predefined topic areas. The evolution of search engines today is expedited by supplying more search capabilities such as a search for metadata as well a...

Journal: EURASIP J. Information Security, 2017
Christos Iliou George Kalpakis Theodora Tsikrika Stefanos Vrochidis Yiannis Kompatsiaris

Focused crawlers enable the automatic discovery of Web resources about a given topic by automatically navigating through the Web link structure and selecting the hyperlinks to follow by estimating their relevance to the topic of interest. This work proposes a generic focused crawling framework for discovering resources on any given topic that reside on the Surface or the Dark Web. The proposed ...

2013
Gang Lu Shumei Liu Kevin Lü

Large amounts of microblog data need to be crawled for research, business analysis, and so on. However, many dynamic Web techniques are used in microblog Web pages, which makes it hard for traditional Web page crawlers to extract data by parsing page contents. Fortunately, microblogs provide APIs. Well-structured data can be returned to users simply by accessing those APIs in form ...
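The API-based approach this abstract contrasts with HTML parsing typically amounts to following a cursor-paginated endpoint and accumulating the structured items it returns. A minimal sketch of that pagination loop, assuming a hypothetical response shape of `{"items": [...], "next_cursor": ...}` (real microblog APIs differ in field names and add authentication and rate limits):

```python
def crawl_api(get_page, max_items=1000):
    """Collect structured items from a cursor-paginated API.

    `get_page(cursor)` is a user-supplied callable that performs one
    request and returns the decoded JSON of the response, assumed here
    to look like {"items": [...], "next_cursor": <cursor or None>}.
    """
    items, cursor = [], None          # cursor None = first page
    while len(items) < max_items:
        page = get_page(cursor)
        items.extend(page["items"])
        cursor = page.get("next_cursor")
        if not cursor:                # no further pages
            break
    return items[:max_items]
```

Keeping the HTTP call behind the `get_page` callable separates the pagination logic from transport details, so the same loop works whether the requests are made with `urllib`, `requests`, or a platform SDK.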

Chart of the number of search results per year

Click on the chart to filter the results by publication year