Automatic Acquisition for low frequency lexical items
Authors
Abstract
This paper addresses a specific case of lexical acquisition, understood as the induction of information about the linguistic characteristics of lexical items from their occurrences in texts. Most recent work in lexical acquisition has used methods that take as much textual data as possible as a source of evidence, but their performance decreases notably when only a few occurrences of a word are available. Covering such low-frequency items is important because a large proportion of the words in any particular collection of texts occur only a few times, if not just once. Our work proposes to compensate for this lack of information by resorting to linguistic knowledge about the characteristics of lexical classes. This knowledge, obtained from a lexical typology, is formulated probabilistically and used in a Bayesian method to maximize the information gathered from single occurrences in order to predict the full set of characteristics of the word. Our results show that our method outperforms others in the treatment of low-frequency items.
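The general idea of combining class-level prior knowledge with a single observed occurrence can be illustrated with a minimal Bayesian sketch. The abstract does not specify the lexical classes, the contextual cues, or any probabilities, so the names `PRIOR`, `LIKELIHOOD`, `posterior`, and `predict` and every number below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
# Illustrative only: classes, cues, and probabilities are assumptions,
# not values taken from the paper.

# Prior probability of each hypothetical lexical class, standing in for
# knowledge that would come from a lexical typology.
PRIOR = {"mass_noun": 0.3, "count_noun": 0.7}

# P(cue seen in a single occurrence | class), again hypothetical.
LIKELIHOOD = {
    "mass_noun": {"bare_singular": 0.8, "indefinite_article": 0.05, "plural": 0.15},
    "count_noun": {"bare_singular": 0.1, "indefinite_article": 0.5, "plural": 0.4},
}


def posterior(cue):
    """Apply Bayes' rule: P(class | cue) is proportional to P(cue | class) * P(class)."""
    unnormalized = {c: PRIOR[c] * LIKELIHOOD[c][cue] for c in PRIOR}
    total = sum(unnormalized.values())
    return {c: p / total for c, p in unnormalized.items()}


def predict(cue):
    """Return the class with the highest posterior given one observed cue."""
    post = posterior(cue)
    return max(post, key=post.get)
```

Under these made-up numbers, a single occurrence as a bare singular is enough to favor the less frequent class, because the class-conditional likelihood outweighs the prior. Once a class is chosen, its remaining characteristics could be read off the class definition, which is one way to interpret "predict the full set of characteristics" from a single occurrence.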
Similar resources
A Young EFL Learner’s Lexical Development through Different Input and Output Frequency Patterns
The present study was undertaken to investigate the effects of varying frequency patterns (FPs) of words on the productive acquisition of a young EFL learner in a home setting. Target words were presented to the learner using games and role plays. They were subsequently traced for their frequencies in input and output. Eighteen immediate tests and delayed tests were administered to measure the ...
On Situating the Stance of Estrogen in the Acquisition and Recall of L2 Lexical Items: A Biological Look
The present study examined whether the advantage of females in L2 vocabulary recall and acquisition is partly a result of estrogen secretion. In this regard, through volunteer and convenience sampling, 15 intermediate EFL female participants aged 23-31 were selected from a subject pool of 55 participants. The participants were studying at Iranian Language Center located in Ba...
Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity
Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...
Non-negative Matrix Factorization for Word Acquisition from Multimodal Information Including Speech
The current generation of automatic speech recognizers incorporates a lot of hard coded knowledge about how speech is structured. Yet children seem to discover the structure of speech and language from examples. A new computational method to discover lexical items with little or no supervision, based on non-negative matrix factorization (NMF) of cooccurrence counts of low-level acoustic events ...
The role of frequency in the acquisition of English word order
Akhtar [Akhtar, N. (1999). Acquiring basic word order: Evidence for data-driven learning of syntactic structure. Journal of Child Language, 26, 339–356] taught children novel verbs in ungrammatical word orders. Her results suggested that the acquisition of canonical word order is a gradual, data-driven process. The current study adapted this methodology, using English verbs of different frequen...