This study proposes a cross-domain multi-objective speech assessment model, called MOSA-Net, which can simultaneously estimate the quality, intelligibility, and distortion scores of an input speech signal. MOSA-Net comprises a convolutional neural network and bidirectional long short-term memory (CNN-BLSTM) architecture for representation extraction, as well as a multiplicative attention layer and a fully connected layer for each metric prediction...
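
To make the per-metric prediction branch more concrete, the following is a minimal sketch in plain Python of one multiplicative attention layer and one linear output layer per metric. It is an illustration under stated assumptions, not the authors' implementation: the CNN-BLSTM extractor is replaced by random frame-level features, the attention is a Luong-style self-attention with score h_t^T W h_s, frame-level scores are averaged into a single value per metric, and all names, dimensions, and weights (`multiplicative_attention`, `predict_scores`, `DIM`, and so on) are hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, x) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multiplicative_attention(frames, W):
    """Self-attention with multiplicative scores: score(t, s) = h_t^T W h_s."""
    projected = [matvec(W, h) for h in frames]  # W h_s for every frame s
    dim = len(frames[0])
    contexts = []
    for h_t in frames:
        weights = softmax([dot(h_t, p) for p in projected])
        # Context vector: attention-weighted sum of all frame features.
        contexts.append(
            [sum(a * h_s[i] for a, h_s in zip(weights, frames)) for i in range(dim)]
        )
    return contexts


def predict_scores(frames, attn, heads):
    """Apply one attention layer and one linear head per metric.

    Averaging the frame-level scores is an assumption of this sketch.
    """
    out = {}
    for name, (w, b) in heads.items():
        contexts = multiplicative_attention(frames, attn[name])
        frame_scores = [dot(w, c) + b for c in contexts]
        out[name] = sum(frame_scores) / len(frame_scores)
    return out


def random_matrix(rng, rows, cols):
    """Small random matrix standing in for learned weights (hypothetical)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


METRICS = ("quality", "intelligibility", "distortion")
DIM = 4  # hypothetical feature size

rng = random.Random(0)
frames = random_matrix(rng, 6, DIM)  # stand-in for 6 frames of extracted features
attn = {m: random_matrix(rng, DIM, DIM) for m in METRICS}
heads = {m: ([rng.uniform(-0.5, 0.5) for _ in range(DIM)], 0.0) for m in METRICS}

scores = predict_scores(frames, attn, heads)
print(scores)
```

Keeping a separate attention matrix and output head per metric lets each score weight the shared frame features differently, which is the point of a multi-objective design; with untrained random weights the printed values carry no meaning.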