Image-text matching plays a critical role in bridging vision and language, and great progress has been made by exploiting the global alignment between image and sentence, or local alignments between regions and words. However, how to make the most of these alignments to infer more accurate matching scores is still underexplored. In this paper, we propose a novel Similarity Graph Reasoning and Attention Filtration (SGRAF) network for image-text ma...