Cross-lingual Text Selection and Simplification for Bilingual Education
Author
Abstract
Reading proficiency is a fundamental component of language competency. However, finding topical texts at an appropriate reading level for foreign and second language learners is a challenge for teachers. This task can be addressed with natural language processing technology to assess reading level and simplify text. This proposal presents a research plan for the development of tools for reading level assessment based on statistical language models. In addition, we propose adapting paraphrasing and summarization techniques for the task of text simplification. Coupled with an information retrieval system, these tools will be used to select and simplify reading material in multiple languages for use by language learners. A pilot study of the use of statistical language models for reading level assessment confirms that this approach is promising. Based on the large number of Spanish-speaking students learning English in the U.S., we selected English and Spanish as the specific languages for this work, but we expect the techniques developed will generalize to other languages as well.
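The language-model approach to reading level assessment mentioned in the abstract can be illustrated with a minimal sketch. It assumes one unigram model per reading level with add-one smoothing, and it assigns a text to the level whose model gives the lowest perplexity. The function names, the toy corpora, and the choice of unigram models are all illustrative assumptions; the abstract does not specify the models or features actually used in the pilot study.

```python
import math
from collections import Counter


def train_unigram(texts):
    """Count word frequencies over the training texts for one reading level."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def perplexity(counts, vocab_size, text):
    """Perplexity of `text` under an add-one-smoothed unigram model."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one token")
    total = sum(counts.values())
    log_prob = 0.0
    for tok in tokens:
        # Add-one smoothing keeps unseen words from getting zero probability.
        log_prob += math.log((counts[tok] + 1) / (total + vocab_size))
    return math.exp(-log_prob / len(tokens))


def assess_level(models, vocab_size, text):
    """Return the level whose model assigns the lowest perplexity to `text`."""
    return min(models, key=lambda level: perplexity(models[level], vocab_size, text))


# Toy training data (hypothetical); real use would need graded corpora.
corpora = {
    "easy": ["the cat sat on the mat", "the dog ran to the park"],
    "hard": [
        "statistical models estimate the probability of word sequences",
        "information retrieval systems rank documents by relevance",
    ],
}
models = {level: train_unigram(texts) for level, texts in corpora.items()}

# Shared vocabulary across levels, plus one slot for unknown words.
vocab = set()
for level_counts in models.values():
    vocab.update(level_counts)
vocab_size = len(vocab) + 1

print(assess_level(models, vocab_size, "the cat ran to the park"))
```

In a retrieval setting like the one the abstract describes, a function such as `assess_level` could act as a filter over retrieved documents, keeping only those whose predicted level matches the learner's.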
Similar resources
Natural Language Processing Tools for Reading Level Assessment and Text Simplification for Bilingual Education
Cross-lingual Information Retrieval Model based on Bilingual Topic Correlation
Constructing relationships between bilingual texts is important for effectively processing multilingual text data and crossing language barriers. Corpus-based cross-lingual latent semantic indexing (CL-LSI) does not fully take into account bilingual semantic relationships. The paper proposes a new model that builds semantic relationships between bilingual parallel documents via partial least squares (PLS)....
Mining bilingual topic hierarchies from unaligned text
Recent years have seen an exponential growth in the amount of multilingual text available on the web. This situation raises the need for novel applications for organizing and accessing multilingual content. Common examples of such applications include Multilingual Topic Tracking, Cross-Language Information retrieval systems etc. Most of these applications rely on the availability of multilingua...
Cross-lingual Predicate Cluster Acquisition to Improve Bilingual Event Extraction by Inductive Learning
In this paper we present two approaches to automatically extract cross-lingual predicate clusters, based on bilingual parallel corpora and cross-lingual information extraction. We demonstrate how these clusters can be used to improve the NIST Automatic Content Extraction (ACE) event extraction task. We propose a new inductive learning framework to automatically augment background data for lowco...
Weakly-Supervised Cross-lingual Predicate Cluster Acquisition to Improve Bilingual Event Extraction
In this paper we present two approaches to automatically extract cross-lingual predicate clusters, based on bilingual parallel corpora and cross-lingual information extraction. We demonstrate how these clusters can be used to improve the NIST Automatic Content Extraction (ACE) event extraction task. We propose a new inductive learning framework to automatically augment background data for lowco...