The use of Transformer-based pre-trained language models has become prevalent in enhancing the performance of task-oriented dialogue systems. These models, which are pre-trained on large text corpora to grasp syntax and semantics, fine-tune their entire parameter set for a specific task. However, as model scale increases, several challenges arise during the fine-tuning process. For example, training time escalates g...