Language Model Adaptation with a Word List and a Raw Corpus
Similar resources
Language model adaptation with a word list and a raw corpus
In this paper, we discuss language model adaptation methods given a word list and a raw corpus. In this situation, the general method is to segment the raw corpus automatically using a word list, correct the output sentences by hand, and build a model from the segmented corpus. In this sentence-by-sentence error correction method, however, the annotator encounters grammatically complicated posi...
A Corpus-driven Food Science and Technology Academic Word List
The overarching goal of this study was to create a list of the most frequently occurring academic words in Food Science and Technology (FST). To this end, a 4,652,444-word corpus called Food Science and Technology Research Articles (FSTRA), which included 1,421 research articles (RAs) randomly selected from 38 journals across five sub-disciplines in FST, was developed. Frequency and range-based...
Developing a Corpus-Based Word List in Pharmacy Research Articles: A Focus on Academic Culture
The present corpus-based lexical study reports the development of a Pharmacy Academic Word List (PAWL), a list of the most frequent words from a corpus of 3,458,445 tokens made up of the 800 most recent pharmacy texts, including research articles, review articles, and short communications in four sub-disciplines of pharmacy. WordSmith (Scott, 2017) and AntWordProfiler (Anthony, 2014) were used to sc...
Validation of a Revised Logical-Mathematical Intelligence Scale and Exploring Its Relationship with English Language Proficiency
Multiple intelligences theory examines the different components of human intelligence; by recognizing them, a person gains a better understanding of their own abilities and, as a result, tries to use them to learn more effectively. Likewise, recognizing students' aptitudes improves the learning process. The aim of this study is to examine the relationship between mathematical intelligence and aptitude for learning English. To conduct this study, the mathematical intelligence questionnaire by Shearer in ...
Raw Corpus Word Sense Disambiguation
A wide range of approaches have been applied to word sense disambiguation. However, most require manually crafted knowledge such as annotated text, machine-readable dictionaries or thesauri, semantic networks, or aligned bilingual corpora. The reliance on these knowledge sources limits portability since they generally exist only for selected domains and languages. This poster presents a corpus-b...
Journal
Journal title: Journal of Natural Language Processing
Year: 2006
ISSN: 1340-7619, 2185-8314
DOI: 10.5715/jnlp.13.4_33