In the past few years, the Transformer has been widely adopted in many domains and applications because of its impressive performance. Vision Transformer (ViT), a successful and well-known variant, has attracted considerable attention from both industry and academia thanks to its record-breaking performance in various vision tasks. However, like other classical neural networks, ViT is highly nonlinear and could be easily fooled b...