Getting lost in space: Large sample analysis of the commute distance

Authors

  • Ulrike von Luxburg
  • Agnes Radl
  • Matthias Hein
Abstract

The commute distance between two vertices in a graph is the expected time it takes a random walk to travel from the first to the second vertex and back. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an expression that does not take into account the structure of the graph at all and that is completely meaningless as a distance function on the graph. Consequently, the use of the raw commute distance for machine learning purposes is strongly discouraged for large graphs and in high dimensions. As an alternative we introduce the amplified commute distance that corrects for the undesired large sample effects.
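The definition above can be illustrated with a minimal sketch, assuming a small connected, undirected graph given as a symmetric weight matrix. It relies on the standard identity that the commute distance equals the graph volume times the effective resistance between the two vertices. The function names (`effective_resistance`, `commute_distance`, `degree_only_approximation`) are hypothetical and do not come from the paper. The last helper reflects the degree-based limit 1/d_u + 1/d_v (scaled by the volume) as it is usually reported for the full paper; the abstract itself does not state this formula, so treat it as an assumption.

```python
def effective_resistance(weights, u, v):
    """Effective resistance between u and v in a connected undirected graph.

    `weights` is a symmetric matrix (list of lists) of non-negative edge weights.
    """
    if u == v:
        return 0.0
    n = len(weights)
    degrees = [sum(row) for row in weights]
    # Ground vertex v: drop its row and column from the Laplacian L = D - W.
    keep = [i for i in range(n) if i != v]
    a = [[(degrees[i] if i == j else 0.0) - weights[i][j] for j in keep]
         for i in keep]
    # Inject one unit of current at u.
    b = [1.0 if i == u else 0.0 for i in keep]
    m = len(keep)
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        rest = sum(a[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (b[r] - rest) / a[r][r]
    # The potential at u (with v grounded) is the effective resistance.
    return x[keep.index(u)]


def commute_distance(weights, u, v):
    """Expected round-trip time u -> v -> u of the random walk: vol(G) * R_uv."""
    volume = sum(sum(row) for row in weights)
    return volume * effective_resistance(weights, u, v)


def degree_only_approximation(weights, u, v):
    """Assumed large-graph limit: vol(G) * (1/d_u + 1/d_v), degrees only."""
    degrees = [sum(row) for row in weights]
    volume = sum(degrees)
    return volume * (1.0 / degrees[u] + 1.0 / degrees[v])


# Path graph 0 - 1 - 2 with unit weights.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
print(commute_distance(path, 0, 2))
print(degree_only_approximation(path, 0, 2))
```

On a graph this small the two quantities coincide only by accident; on the complete graph with four vertices, for example, they differ. The abstract's claim concerns large graphs, where the raw commute distance is said to lose its dependence on graph structure.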

Similar resources

Getting lost in space: Large sample analysis of the resistance distance

The commute distance between two vertices in a graph is the expected time it takes a random walk to travel from the first to the second vertex and back. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an expression that does not take into account the structure of the graph at all and that is completely ...

Evaluating The Creation of Dwelling Space in relation to Place

Primary man in trying to find food went everywhere. But by forming ranching arranged a chain of places and became emigrant. By happening industrial revolution, human life was centralized on one place.  Places that base on its advantages make different biologic and behavioral types. Forming cities in seaboard, river shore, boundary of mountains and Champaign cause to make different cultures that...

Robust Outlier Detection Using Commute Time and Eigenspace Embedding

We present a method to find outliers using ‘commute distance’ computed from a random walk on graph. Unlike Euclidean distance, commute distance between two nodes captures both the distance between them and their local neighborhood densities. Indeed commute distance is the Euclidean distance in the space spanned by eigenvectors of the graph Laplacian matrix. We show by analysis and experiments t...

Large Scale Spectral Clustering Using Resistance Distance and Spielman-Teng Solvers

Spectral clustering is a novel clustering method which can detect complex shapes of data clusters. However, it requires the eigen decomposition of the graph Laplacian matrix, which is proportion to O(n) and thus is not suitable for large scale systems. Recently, many methods have been proposed to accelerate the computational time of spectral clustering. These approximate methods usually involve...

Some criteria of regeneration density in young beech populations

Some criteria of density in beech saplings were studied in various forest associations (mainly Galio odorati-Fagetum typicum) growing in the submontane region near Zurich (Swiss Central Plateau). The sample plots were established in regeneration gaps resulting from Swiss irregular shelter wood system (Femelschlag). Five sample plots, each 2x2m in 3 transects (a total of 15 plots in each gap, x...

Publication date: 2010