We presented a novel hardware architecture that uses dual Benes networks to accelerate Convolutional Neural Network (CNN) algorithms. The architecture can reduce the need for high-speed buses while maintaining the connections between the execution units and memories. Because the routes of a non-blocking network can be rearranged, the design also works with multiple neural network models by changing only its configuration. The proposed save time r...