Tree matching techniques have been investigated in many fields, including web data mining and extraction, as a key component for analyzing the content of web pages. However, when applied to existing pages, traditional tree matching approaches, represented by algorithms like Tree-Edit Distance (TED) or XyDiff, either fail to scale beyond a few hundred nodes or exhibit relatively low accuracy. In this article, we therefore pro...