Abstract Although deep learning-based methods have dominated stereo matching leaderboards by yielding unprecedented disparity accuracy, their inference time is typically slow, i.e., less than 4 FPS for a pair of 540p images. The main reason is that the leading methods employ time-consuming 3D convolutions applied to a 4D feature volume. A common way to speed up the computation is to downsample the feature volume, but this loses hi...