Curvature of co-links uncovers hidden thematic layers in the World Wide Web.
Authors
Abstract
Beyond the information stored in pages of the World Wide Web, novel types of "meta-information" are created when pages connect to each other. Such meta-information is a collective effect of independent agents writing and linking pages, hidden from the casual user. Accessing it and understanding the interrelation between connectivity and content in the World Wide Web is a challenging problem [Botafogo, R. A. & Shneiderman, B. (1991) in Proceedings of Hypertext (Assoc. Comput. Mach., New York), pp. 63-77 and Albert, R. & Barabási, A.-L. (2002) Rev. Mod. Phys. 74, 47-97]. We demonstrate here how thematic relationships can be located precisely by looking only at the graph of hyperlinks, gleaning content and context from the Web without having to read what is in the pages. We begin by noting that reciprocal links (co-links) between pages signal a mutual recognition of authors and then focus on triangles containing such links, because triangles indicate a transitive relation. The importance of triangles is quantified by the clustering coefficient [Watts, D. J. & Strogatz, S. H. (1998) Nature (London) 393, 440-442], which we interpret as a curvature [Bridson, M. R. & Haefliger, A. (1999) Metric Spaces of Non-Positive Curvature (Springer, Berlin)]. This curvature defines a World Wide Web landscape whose connected regions of high curvature characterize a common topic. We show experimentally that reciprocity and curvature, when combined, accurately capture this meta-information for a wide variety of topics. As an example of future directions we analyze the neural network of Caenorhabditis elegans, using the same methods.
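The pipeline the abstract outlines (keep reciprocal links, count triangles, read the clustering coefficient as a curvature, then look for connected high-curvature regions) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes the standard local clustering coefficient stands in for the curvature, and the function names, the toy `links` list, and the `threshold` value are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def colink_graph(links):
    """Keep only reciprocal links (u -> v and v -> u both present)."""
    link_set = set(links)
    nbrs = defaultdict(set)
    for u, v in link_set:
        if u != v and (v, u) in link_set:
            nbrs[u].add(v)
            nbrs[v].add(u)
    return dict(nbrs)


def curvature(nbrs):
    """Local clustering coefficient: closed neighbor pairs / all neighbor pairs.

    Assumption: this standard coefficient is used as the 'curvature' here.
    """
    result = {}
    for node, ns in nbrs.items():
        k = len(ns)
        if k < 2:
            result[node] = 0.0  # no pair of neighbors, so no triangle is possible
            continue
        triangles = sum(1 for a, b in combinations(ns, 2) if b in nbrs[a])
        result[node] = 2.0 * triangles / (k * (k - 1))
    return result


def high_curvature_regions(nbrs, curv, threshold):
    """Connected components of the co-link graph restricted to nodes
    whose curvature is at least `threshold` (a hypothetical cutoff)."""
    keep = {n for n, c in curv.items() if c >= threshold}
    seen, regions = set(), []
    for start in sorted(keep):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            n = stack.pop()
            component.add(n)
            for m in nbrs[n]:
                if m in keep and m not in seen:
                    seen.add(m)
                    stack.append(m)
        regions.append(component)
    return regions


# Toy hyperlink graph: a, b, c link to each other reciprocally,
# c and d are co-linked, and d -> e is a one-way link.
links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"),
         ("a", "c"), ("c", "a"), ("c", "d"), ("d", "c"), ("d", "e")]
nbrs = colink_graph(links)
curv = curvature(nbrs)
regions = high_curvature_regions(nbrs, curv, threshold=0.5)
```

In this toy graph the one-way link to `e` is discarded, pages `a` and `b` sit in a fully closed triangle, and `c` has lower curvature because its co-link to `d` closes no triangle. On a real crawl the threshold would have to be chosen empirically.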
Similar resources
Studying of Research Related to COVID-19 Vaccine in Iran and the World: A Thematic Analysis and Scientific Collaborations
Background and Objective: The purpose of the present study is a thematic analysis of research related to the COVID-19 vaccine in Iran and the world, and of the scientific collaborations behind it, based on scientific products indexed in Web of Science (WoS). Materials and Methods: The present study is a descriptive-analytical study with a scientometric approach, using the methods of content analysis and techniques ...
Visualizing the Clusters and Dynamics of HPV Research Area
Purpose: The purpose of the present study is to visualize the relationships among HPV clusters and the thematic trends of HPV research worldwide. Methodology: This is applied research with an analytical approach, carried out using co-word analysis. The population of this study consists of the keywords of articles in the HPV subject area indexed in the Web of Science (WoS) during 2014-2018. The total numbers of th...
Coronavirus: Discover the Structure of Global Knowledge, Hidden Patterns & Emerging Events
Background & Objective: The present study aimed to explore the structure of global knowledge, hidden patterns, and emerging events in Coronavirus research using co-word techniques. Co-word analysis is one of the most efficient scientific methods for analyzing the structure and dynamics of knowledge and the general state of research. Materials & Methods: This applied research was performed using co-word anal...
Hypertext Summary Extraction for Fast Document Browsing
This article describes an application of Natural Language Processing (NLP) techniques to enable fast browsing of on-line documents by automatically generating Hypertext summaries of one or more documents. Unlike previous work on summarization, the system described here, HyperGen, does not produce plain-text summaries and does not throw away parts of the document that weren't included in the sum...
A Technique for Improving Web Mining using Enhanced Genetic Algorithm
The World Wide Web is growing at a very fast pace and makes a lot of information available to the public. Search engines use conventional methods to retrieve information on the Web; however, their search results can still be refined and their accuracy is not high enough. One of the methods for web mining is evolutionary algorithms, which search according to the user interests...
Journal:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 99, Issue 9
Pages: -
Publication date: 2002