Frequency Effects in Morpheme Segmentation

Author

  • Sara Finley
Abstract

The present study explores the effects of frequency on learning to parse novel morphological patterns. In two experiments, suffixes were divided into three frequency classes (high, medium, and low) based on the proportion of stems in the input that each suffix attached to (high frequency = 12/12, medium frequency = 6/12, and low frequency = 2/12). In Experiment 1, learners were better at segmenting words containing high frequency suffixes than words containing low frequency suffixes, even when the stems were novel. In Experiment 2, token frequency was controlled across all three suffix frequency classes, but learners were still better at segmenting high frequency suffixes, even when words containing high frequency suffixes were less frequent. These results suggest that learners are sensitive to the frequency distributions of the morphemes in their language, supporting work suggesting that a Zipfian distribution may be ideal for language learning.
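The distinction the abstract draws between the proportion of stems a suffix attaches to (type frequency) and how often suffixed words occur (token frequency) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the placeholder stems, the suffix labels "-A", "-B", and "-C", the function names, and the per-word repetition counts are hypothetical and are not the study's stimuli or procedure. Only the 12/12, 6/12, and 2/12 proportions come from the abstract.

```python
from collections import Counter

# Hypothetical placeholders, NOT the study's stimuli.
STEMS = [f"stem{i:02d}" for i in range(1, 13)]    # 12 placeholder stems
STEMS_PER_SUFFIX = {"-A": 12, "-B": 6, "-C": 2}   # high, medium, low (from the abstract)


def build_lexicon(stems, stems_per_suffix):
    """Return (stem, suffix) pairs; each suffix attaches to the first n stems."""
    return [(stem, suffix)
            for suffix, n in stems_per_suffix.items()
            for stem in stems[:n]]


def type_frequency(lexicon, n_stems):
    """Proportion of distinct stems that each suffix attaches to."""
    stems_by_suffix = {}
    for stem, suffix in lexicon:
        stems_by_suffix.setdefault(suffix, set()).add(stem)
    return {suffix: len(found) / n_stems
            for suffix, found in stems_by_suffix.items()}


def token_frequency(lexicon, repetitions):
    """Total word tokens per suffix, given how often each word is repeated."""
    counts = Counter()
    for _stem, suffix in lexicon:
        counts[suffix] += repetitions[suffix]
    return dict(counts)


lexicon = build_lexicon(STEMS, STEMS_PER_SUFFIX)
type_freq = type_frequency(lexicon, len(STEMS))

# Assumed repetition scheme: words with rarer suffixes are repeated more often,
# so every suffix class ends up with the same number of tokens.
equal_tokens = token_frequency(lexicon, {"-A": 1, "-B": 2, "-C": 6})

print(type_freq)
print(equal_tokens)
```

Under the assumed repetition scheme, each suffix class totals 12 tokens while the type frequencies stay at 12/12, 6/12, and 2/12, and each individual word carrying the high frequency suffix is heard less often than each word carrying the low frequency suffix. This is one way the two kinds of frequency can be pulled apart; the repetition counts the study actually used are not stated in the abstract.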

Similar resources

Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling

This paper presents our segmentation system developed for the MLP 2017 shared tasks on cross-lingual word segmentation and morpheme segmentation. We model both word and morpheme segmentation as character-level sequence labelling tasks. The prevalent bidirectional recurrent neural network with conditional random fields as the output interface is adapted as the baseline system, which is further i...

Derivational morphology and base morpheme frequency

doi:10.1016/j.jml.2009.01.003. Corresponding author: M.A. Ford. Morpheme frequency effects for derived words (e.g. an influence of the frequency of the base "dark" on responses to "darkness") have been interpreted as evidence of morphemic representation. However, it has been s...

High-Performance, Language-Independent Morphological Segmentation

This paper introduces an unsupervised morphological segmentation algorithm that shows robust performance for four languages with different levels of morphological complexity. In particular, our algorithm outperforms Goldsmith's Linguistica and Creutz and Lagus's Morfessor for English and Bengali, and achieves performance that is comparable to the best results for all three PASCAL evaluation da...

Bound Morpheme Frequencies in the Performance of Iranian English Language Undergraduates and English Language Materials Developers in Written Descriptive Tasks

This mini-corpus, cross-linguistic, comparative, and norm-referenced study aims to identify the affixes used most frequently in written descriptive tasks by English language materials developers (ELMDs) and Iranian English language undergraduates (IELUs). Writing samples from both groups were analyzed according to affixation principles. The frequency of ...

Morpheme Segmentation and Concatenation Approaches for Uyghur LVCSR

In this paper, various kinds of sub-word lexica are thoroughly investigated within the framework of a Uyghur LVCSR system. Experimental results show that it is inefficient to model directly with word units or with small units such as morphemes or even syllables. An optimal sub-word unit set between the word and morpheme levels is observed to fit the ASR system better. In order to select best u...

Publication date: 2015