Recently, the speech community has seen a significant trend of moving from deep neural network based hybrid modeling to end-to-end (E2E) modeling for automatic speech recognition (ASR). While E2E models achieve state-of-the-art results on most benchmarks in terms of ASR accuracy, hybrid models are still used in a large proportion of commercial systems at the current time. There are many practical factors that affect production model deployment ...