Post-processing speech recordings during MRI
Abstract
We discuss post-processing of speech recorded during Magnetic Resonance Imaging (MRI) of the vocal tract area. These speech recordings are contaminated by high levels of acoustic noise from the MRI scanner. In addition, the frequency response of the sound signal path is not flat, because MRI technology restricts the recording instrumentation and arrangements. The post-processing algorithm for noise reduction is based on adaptive spectral filtering, and it has been designed with the requirements of subsequent formant extraction in mind. The algorithm was validated with speech material consisting of samples of prolonged vowel productions during MRI. The comparison data was recorded in an anechoic chamber from the same test subject. Spectral envelopes and formants were computed for the post-processed speech and the comparison data. Artificially noise-contaminated vowel samples (with a known formant structure) were used in validation experiments to determine the performance of the algorithm where using true data would be difficult. Resonances computed by an acoustic model and, similarly, those measured from 3D-printed physical models of the vocal tract were also used as comparison data. Neither the properties of the recording instrumentation nor the post-processing algorithm explains the observed frequency-dependent discrepancy between formant data from experiments during MRI and in the anechoic chamber. The discrepancy is shown to be statistically significant, in particular where it is largest, at 1 kHz and 2 kHz. There is evidence that the reflecting surfaces of the MRI head and neck coil change the speech acoustics, which results in “exterior formants” at these frequencies. However, the role of test subject adaptation to noise and constrained space acoustics during an MRI examination cannot be ruled out.
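The abstract does not say how the formants were computed. As a rough illustration only, the sketch below estimates resonance frequencies with autocorrelation-method linear prediction (LPC), a common textbook approach that is assumed here and may differ from the authors' method. The function names `lpc_formants` and `synth_vowel`, the sampling rate, and the test resonances are all hypothetical.

```python
import numpy as np

def lpc_formants(signal, fs, order):
    """Estimate resonance frequencies (Hz) with autocorrelation-method LPC.

    Illustrative sketch only; not the algorithm described in the abstract.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    # Solve the Yule-Walker normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    # Roots of the prediction polynomial z^p - a1 z^(p-1) - ... - ap
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[np.imag(roots) > 0]  # keep one root per conjugate pair
    return np.sort(np.angle(roots) * fs / (2 * np.pi))

def synth_vowel(freqs, bandwidths, fs, n, seed=0):
    """Hypothetical test signal: white noise through an all-pole filter."""
    poles = []
    for f, b in zip(freqs, bandwidths):
        p = np.exp(-np.pi * b / fs) * np.exp(2j * np.pi * f / fs)
        poles += [p, np.conj(p)]
    denom = np.real(np.poly(poles))  # 1 + d1 z^-1 + ... + dp z^-p
    e = np.random.default_rng(seed).standard_normal(n)
    y = np.zeros(n)
    for i in range(n):
        acc = e[i]
        for k in range(1, len(denom)):
            if i - k >= 0:
                acc -= denom[k] * y[i - k]
        y[i] = acc
    return y

# Usage: recover two known resonances from a synthetic signal
y = synth_vowel([700.0, 1200.0], [80.0, 100.0], fs=8000, n=8000)
print(lpc_formants(y, fs=8000, order=4))
```

With a synthetic signal of known resonance structure, the estimates can be compared directly against the ground truth, which mirrors in spirit the validation on artificially contaminated vowel samples mentioned in the abstract.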
Similar resources
Psychoacoustically-motivated Dereverberation for Recordings Taken in the German Parliament
In this paper, we discuss the application of speech dereverberation techniques for post-processing of recordings taken in the German parliament. Based on a novel psychoacoustically-motivated dereverberation concept, a significant improvement in terms of the perceived quality is obtained in comparison to a conventional dereverberation approach. Since time-varying changes of the acoustical enviro...
Recording high quality speech during tagged cine-MRI studies using a fiber optic microphone.
PURPOSE To investigate the feasibility of obtaining high quality speech recordings during cine imaging of tongue movement using a fiber optic microphone. MATERIALS AND METHODS A Complementary Spatial Modulation of Magnetization (C-SPAMM) tagged cine sequence triggered by an electrocardiogram (ECG) simulator was used to image a volunteer while speaking the syllable pairs /a/-/u/, /i/-/u/, and ...
A Novel Fuzzy-C Means Image Segmentation Model for MRI Brain Tumor Diagnosis
Accurate segmentation of brain tumor plays a key role in the diagnosis of brain tumor. Preset and precise diagnosis of Magnetic Resonance Imaging (MRI) brain tumor is enormously significant for medical analysis. During the last years many methods have been proposed. In this research, a novel fuzzy approach has been proposed to classify a given MRI brain image as normal or cancer label and the i...
Automated post-hoc noise cancellation tool for audio recordings acquired in an MRI scanner.
There are several types of experiment in which it is useful to have subjects speak overtly in a magnetic resonance imaging (MRI) scanner, including those studying the articulatory apparatus and the neural basis of speech production, and fMRI experiments in which speech is used as a response modality. Although it is relatively easy to record sound from the bore, it can be difficult to hear the s...
Statistical multi-stream modeling of real-time MRI articulatory speech data
This paper investigates different statistical modeling frameworks for articulatory speech data obtained using real-time (RT) magnetic resonance imaging (MRI). To quantitatively capture the spatio-temporal shaping process of the human vocal tract during speech production a multi-dimensional stream of direct image features is extracted automatically from the MRI recordings. The features are close...
Journal title:
- Biomed. Signal Proc. and Control
Volume 39
Publication date: 2018