This study addresses an image-matching problem in challenging cases, such as large scene variations or textureless scenes. To gain robustness to situations, most previous studies have attempted encode the global contexts of a via graph neural networks transformers. However, these do not explicitly represent high-level contextual information, structural shapes semantic instances; therefore, enco...