Features for factored language models for code-Switching speech

Authors

  • Heike Adel
  • Katrin Kirchhoff
  • Dominic Telaar
  • Ngoc Thang Vu
  • Tim Schlippe
  • Tanja Schultz
Abstract

This paper investigates features that can be used to predict Code-Switching speech. For this task, factored language models are applied and integrated into a state-of-the-art decoder. Several possible factors are explored: words, part-of-speech tags, Brown word clusters, open-class words and open-class word clusters. We find that Brown word clusters, part-of-speech tags and open-class words are most effective at reducing the perplexity of factored language models on the Mandarin-English Code-Switching corpus SEAME. In decoding experiments, the model containing Brown word clusters and part-of-speech tags, and the model that also includes open-class word clusters, yield the best mixed error rate results. In summary, the factored language models reduce the perplexity on the SEAME evaluation set by up to 10.8% relative and the mixed error rate by up to 3.4% relative.
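To make the idea of a factored language model concrete, the following is a minimal sketch, not the authors' implementation. It assumes each token is a bundle of only two factors (word and part-of-speech tag), uses a single fixed backoff path with linear interpolation, and trains on three invented Mandarin-English sentences with invented tags. The class `ToyFactoredLM`, the weight `lam` and the helper `relative_reduction` are hypothetical names introduced for illustration only.

```python
import math
from collections import Counter


class ToyFactoredLM:
    """Toy model of P(word | previous word, previous POS tag).

    Backoff path (an assumption of this sketch): full context ->
    previous POS tag only -> add-one smoothed unigram, mixed by
    linear interpolation with a fixed weight `lam`.
    """

    def __init__(self, sentences, lam=0.7):
        self.lam = lam
        self.full, self.full_ctx = Counter(), Counter()
        self.pos, self.pos_ctx = Counter(), Counter()
        self.uni = Counter()
        for sent in sentences:
            prev_word, prev_tag = "<s>", "<s>"
            for word, tag in sent:
                self.full[(prev_word, prev_tag, word)] += 1
                self.full_ctx[(prev_word, prev_tag)] += 1
                self.pos[(prev_tag, word)] += 1
                self.pos_ctx[prev_tag] += 1
                self.uni[word] += 1
                prev_word, prev_tag = word, tag
        self.n_tokens = sum(self.uni.values())
        # One extra vocabulary slot shared by all unseen words.
        self.vocab_size = len(self.uni) + 1

    def p_unigram(self, word):
        return (self.uni[word] + 1) / (self.n_tokens + self.vocab_size)

    def p_pos(self, word, prev_tag):
        ctx = self.pos_ctx[prev_tag]
        if ctx == 0:  # unseen tag context: fall back entirely
            return self.p_unigram(word)
        ml = self.pos[(prev_tag, word)] / ctx
        return self.lam * ml + (1 - self.lam) * self.p_unigram(word)

    def p_full(self, word, prev_word, prev_tag):
        ctx = self.full_ctx[(prev_word, prev_tag)]
        if ctx == 0:  # unseen (word, tag) context: use the tag alone
            return self.p_pos(word, prev_tag)
        ml = self.full[(prev_word, prev_tag, word)] / ctx
        return self.lam * ml + (1 - self.lam) * self.p_pos(word, prev_tag)

    def perplexity(self, sentences, use_factors=True):
        log_sum, n = 0.0, 0
        for sent in sentences:
            prev_word, prev_tag = "<s>", "<s>"
            for word, tag in sent:
                if use_factors:
                    p = self.p_full(word, prev_word, prev_tag)
                else:
                    p = self.p_unigram(word)
                log_sum += math.log(p)
                n += 1
                prev_word, prev_tag = word, tag
        return math.exp(-log_sum / n)


def relative_reduction(baseline, improved):
    """Relative reduction in percent, e.g. for perplexity or error rate."""
    return 100.0 * (baseline - improved) / baseline


# Invented toy data: (word, POS tag) pairs; not taken from SEAME.
corpus = [
    [("我", "PN"), ("想", "VV"), ("go", "VB"), ("shopping", "NN")],
    [("我", "PN"), ("要", "VV"), ("go", "VB"), ("home", "NN")],
    [("他", "PN"), ("想", "VV"), ("eat", "VB"), ("lunch", "NN")],
]
held_out = [[("他", "PN"), ("要", "VV"), ("go", "VB"), ("home", "NN")]]

lm = ToyFactoredLM(corpus)
baseline_ppl = lm.perplexity(held_out, use_factors=False)
factored_ppl = lm.perplexity(held_out, use_factors=True)
print(f"unigram perplexity:  {baseline_ppl:.2f}")
print(f"factored perplexity: {factored_ppl:.2f}")
print(f"relative reduction:  {relative_reduction(baseline_ppl, factored_ppl):.1f}%")
```

In the held-out sentence, the context (他, PN) was never followed by 要 in training, so the model leans on the tag-only estimate for that word. This illustrates why additional factors can help when Code-Switching text is scarce. The `relative_reduction` helper shows how figures such as "10.8% relative" are computed from a baseline and an improved value.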

Similar resources

Speech Recognition on English-Mandarin Code-Switching Data using Factored Language Models - with Part-of-Speech Tags, Language ID and Code-Switch Point Probability as Factors

Code-switching (CS) is defined as "the alternate use of two or more languages in the same utterance or conversation" [1]. CS is a widespread phenomenon in multilingual communities, where multiple languages are used concurrently in a conversation. For automatic speech recognition (ASR), intra-sentential code-switching in particular poses an interesting challenge due to the multilingual context for la...

Combining recurrent neural networks and factored language models during decoding of code-Switching speech

In this paper, we present our latest investigations of language modeling for Code-Switching. Since little text material is available for Code-Switching speech, we integrate syntactic and semantic features into the language modeling process. In particular, we use part-of-speech tags, language identifiers, Brown word clusters and clusters of open class words. We develop factored langua...

Combination of Recurrent Neural Networks and Factored Language Models for Code-Switching Language Modeling

In this paper, we investigate the application of recurrent neural network language models (RNNLM) and factored language models (FLM) to the task of language modeling for Code-Switching speech. We present a way to integrate part-of-speech tags (POS) and language information (LID) into these models, which leads to significant improvements in terms of perplexity. Furthermore, a comparison between RN...

Functions of Code-Switching Strategies among Iranian EFL Learners and Their Speaking Ability Improvement through Code-Switching

This study investigated the impact of code-switching on the speaking ability of low-proficiency Iranian EFL learners. It also attempted to show what functions lay behind the code-switching strategies used by the EFL learners. To this end, 60 male and female Iranian EFL learners aged between 20 and 30 participated in the study. Data collection instruments which were used were the Int...

Publication date: 2014