A Statistical Approach to Persian Light Verb Constructions
Abstract
This article presents the linguistic bases of Persian light verb constructions and shows the corpus-based construction of lists of collocates for some common Persian verbs. The proposed methods of corpus construction are language-independent, and the good results on a relatively small corpus of 20 million words confirm the power of association measures based on the hypergeometric distribution. The resulting lists show a gradation of lexicalization and the semantic homogeneity of some light verb subcategorization schemes, which could be the reason for their wide usage.
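As a rough illustration of what an association measure based on the hypergeometric distribution can look like, the sketch below scores a candidate collocate by the probability of seeing at least the observed number of co-occurrences with a verb if the two were independent (the upper tail used in Fisher's exact test). This is not taken from the article: the function names `hypergeom_tail` and `rank_collocates`, the choice of the upper-tail p-value, and the way counts are laid out are all assumptions made for the example.

```python
from math import comb


def hypergeom_tail(k: int, K: int, n: int, N: int) -> float:
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    Assumed interpretation (not from the article):
      N: total number of (word, verb) pairs extracted from the corpus
      K: pairs that contain the verb of interest
      n: pairs that contain the candidate collocate
      k: pairs that contain both
    """
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    # Exact integer arithmetic, converted to float only at the end.
    return numerator / comb(N, n)


def rank_collocates(counts, verb_total, pair_total):
    """Sort candidates by ascending tail probability (strongest association first).

    counts maps each candidate to (joint_count, candidate_total).
    """
    scored = [
        (word, hypergeom_tail(joint, verb_total, word_total, pair_total))
        for word, (joint, word_total) in counts.items()
    ]
    return sorted(scored, key=lambda item: item[1])
```

A smaller tail probability means the co-occurrence is harder to explain by chance, so candidates at the top of the ranking are the more plausible collocates. The exact binomial coefficients keep the sketch dependency-free, but for counts on the scale of a 20-million-word corpus a log-space or library implementation would likely be preferable.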
Similar resources
Using Noun Similarity to Adapt an Acceptability Measure for Persian Light Verb Constructions
Light verb constructions (LVCs), such as take a walk and make a decision, are a common subclass of multiword expressions (MWEs), whose distinct syntactic and semantic properties call for a special treatment within a computational system. In particular, LVCs are formed semi-productively: often a semantically-general verb (such as take) combines with a number of semantically-similar nouns to form...
Split complex predicates in Persian
Complex predicates, or compound verbs, constitute a major portion of verbal forms in the Persian language. They are normally formed using a noun, adjective, preposition, or prepositional phrase, followed by a light verb. Unlike many other languages that employ such constructions, Persian allows the two components to become separated. This paper will investigate where complex predicates syntacti...
Extending the coverage of a MWE database for Persian CPs exploiting valency alternations
PersPred is a manually elaborated multilingual syntactic and semantic Lexicon for Persian Complex Predicates (CPs), also referred to as “Light Verb Constructions” (LVCs) or “Compound Verbs”. CPs constitute the regular and most common way of expressing verbal concepts in Persian, which has only around 200 simplex verbs. CPs can be defined as multi-word sequences formed by a verb and a non-v...
Statistical Measures Of The Semi-Productivity Of Light Verb Constructions
We propose a statistical measure for the degree of acceptability of light verb constructions, such as take a walk, based on their linguistic properties. Our measure shows good correlations with human ratings on unseen test data. Moreover, we find that our measure correlates more strongly when the potential complements of the construction (such as walk, stroll, or run) are separated into semanti...
How to Account for Idiomatic German Support Verb Constructions in Statistical Machine Translation
Support-verb constructions (i.e., multiword expressions combining a semantically light verb with a predicative noun) are problematic for standard statistical machine translation systems, because SMT systems cannot distinguish between literal and idiomatic uses of the verb. We work on the German to English translation direction, for which the identification of support-verb constructions is chall...