Cross-lingual Text Selection and Simplification for Bilingual Education

Author

  • Sarah Schwarm
Abstract

Reading proficiency is a fundamental component of language competency. However, finding topical texts at an appropriate reading level for foreign and second language learners is a challenge for teachers. This task can be addressed with natural language processing technology to assess reading level and simplify text. This proposal presents a research plan for the development of tools for reading level assessment based on statistical language models. In addition, we propose adapting paraphrasing and summarization techniques for the task of text simplification. Coupled with an information retrieval system, these tools will be used to select and simplify reading material in multiple languages for use by language learners. A pilot study of the use of statistical language models for reading level assessment confirms that this approach is promising. Based on the large number of Spanish-speaking students learning English in the U.S., we selected English and Spanish as the specific languages for this work, but we expect the techniques developed will generalize to other languages as well.

Related resources

Natural Language Processing Tools for Reading Level Assessment and Text Simplification for Bilingual Education


Cross-lingual Information Retrieval Model based on Bilingual Topic Correlation

Constructing relationships between bilingual texts is important for effectively processing multilingual text data and crossing language barriers. Corpus-based cross-lingual latent semantic indexing (CL-LSI) does not fully take bilingual semantic relationships into account. The paper proposes a new model that builds semantic relationships between bilingual parallel documents via partial least squares (PLS)....

Mining bilingual topic hierarchies from unaligned text

Recent years have seen an exponential growth in the amount of multilingual text available on the web. This situation raises the need for novel applications for organizing and accessing multilingual content. Common examples of such applications include Multilingual Topic Tracking, Cross-Language Information retrieval systems etc. Most of these applications rely on the availability of multilingua...

Cross-lingual Predicate Cluster Acquisition to Improve Bilingual Event Extraction by Inductive Learning

In this paper we present two approaches to automatically extract cross-lingual predicate clusters, based on bilingual parallel corpora and cross-lingual information extraction. We demonstrate how these clusters can be used to improve the NIST Automatic Content Extraction (ACE) event extraction task. We propose a new inductive learning framework to automatically augment background data for lowco...

Weakly-Supervised Cross-lingual Predicate Cluster Acquisition to Improve Bilingual Event Extraction

In this paper we present two approaches to automatically extract cross-lingual predicate clusters, based on bilingual parallel corpora and cross-lingual information extraction. We demonstrate how these clusters can be used to improve the NIST Automatic Content Extraction (ACE) event extraction task. We propose a new inductive learning framework to automatically augment background data for lowco...

Publication date: 2004