Abstract Delay-causing text data contain valuable information, such as the specific reasons for a delay and the location and time of the disturbance, which can provide efficient support for predicting train delays and improving guidance and control efficiency. Based on operation data from the Wuhan–Guangzhou high-speed railway, relevant algorithms from the field of natural language processing are used to process the data. It also integrates ...