Web Crawler: A Review

Authors

  • Md. Abu Kausar
  • V. S. Dhaka
  • Sanjeev Kumar Singh
Abstract

Information Retrieval deals with searching for and retrieving information within documents; it also covers searching online databases and the Internet. A web crawler is a program that traverses the Web and downloads web documents in a methodical, automated manner. Based on the type of knowledge, web crawlers are usually divided into three types of crawling techniques: general-purpose crawling, focused crawling, and distributed crawling. This paper discusses the applicability of web crawlers in the field of web search and reviews their application to different problem domains in web search.
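
The definition above (a program that traverses the Web and downloads documents methodically) can be illustrated with a minimal sketch. It is not the method of the reviewed paper: it assumes a breadth-first visiting order, and it replaces real HTTP downloads with a hypothetical in-memory link graph, `TOY_WEB`, so that it runs without network access. The names `crawl`, `fetch_links`, and `max_pages` are illustrative assumptions.

```python
from collections import deque


def crawl(seed, fetch_links, max_pages=100):
    """Visit pages breadth-first starting at `seed`.

    `fetch_links(url)` is an assumed callback returning the outgoing links
    of a page; a real crawler would download and parse the page here.
    Returns the list of visited URLs in visit order.
    """
    frontier = deque([seed])  # URLs waiting to be visited (FIFO queue)
    seen = {seed}             # URLs already queued, to avoid revisiting
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)   # stand-in for "download and store the document"
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical toy link graph standing in for the Web.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

order = crawl("a", lambda url: TOY_WEB.get(url, []))
print(order)
```

A focused crawler would, under the same assumptions, replace the FIFO queue with a priority queue ordered by an estimate of topical relevance, and a distributed crawler would partition the frontier across several workers.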

Similar resources

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; a key technique among them is focused crawling, which is able to crawl particular topical...

Review Paper on Web Crawler

A web crawler is software, or a computer program, used to browse the World Wide Web in an ordered manner. This procedure is known as Web crawling or spidering. The different search engines that use spidering give you current information. Web crawlers create a copy of all the visited web pages, which is used by the search engine as a refe...

URL ordering policies for distributed crawlers: a review

As the web grows, information is also spreading on a large scale. Search engines are the medium for accessing this information. The crawler is the search engine module responsible for downloading web pages. To download fresh information and enrich the database, the crawler should crawl the web in some order. This is called URL ordering. URL ordering shou...

Focused Web Crawling Algorithms

Nowadays the web is rich in every kind of information, and this information is freely available thanks to hypermedia information systems and the Internet. This information has greatly influenced our lives, our lifestyle, and our way of thinking. A web search engine is a complex multi-level system that helps us search the information that is available on the Internet. A web crawler is one of the most i...

A Study of Focused Web Crawlers for Semantic Web

Finding useful information on the web, which has a large and distributed structure, requires efficient search strategies. Focused crawlers selectively retrieve Web documents that are relevant to a predefined set of topics. To make intelligent decisions about relevant URLs and web pages, different authors have proposed different strategies. In this paper we review and compare focused crawling s...

Publication date: 2013