Automatic speech recognition (ASR) systems require large amounts of transcribed speech data for training state-of-the-art deep neural network (DNN) acoustic models. Transcribed speech is a scarce and expensive resource, and ASR systems are prone to underperform in domains where little training data is available. In this work, we open up a vast and previously unused resource of trans...