Urdu/Hindi Motion Verbs and Their Implementation in a Lexical Resource

Authors

  • Annette Hautli-Janisz
  • Miriam Butt
Abstract

A central task of natural language processing is to find a way of answering the question "Who did what to whom, how, when and where?" by automatic means. This requires insight into how a language realizes events and the participants that take part in them, and how this information can be encoded in a human- as well as machine-readable way. In this thesis, I investigate how the spatial notions of figure, ground, path and manner of motion are realized in Urdu/Hindi, and I implement these insights in a computationally usable lexical resource, namely Urdu/Hindi VerbNet. I show that the encoding of complex predicates in particular can serve as a guiding principle for the encoding of similar constructions in other VerbNets. This enterprise involves a detailed investigation of the syntax-semantics interface of motion verb constructions in Urdu/Hindi, in particular the different syntactic alternation patterns that realize motion events. As it turns out, Urdu/Hindi employs complex predicates of motion that denote the manner of motion along a path with two verbal heads. This construction exhibits syntactic properties similar to those of aspectual complex predicates in the language (Butt 1995). The thesis shows that the combinatorial possibilities between main verb and light verb are driven by the manner/result complementarity established by Levin and Rappaport Hovav (2008, 2013), according to which verbs either lexicalize non-scalar manner of motion or denote a scalar result event. An analysis of the construction in Lexical-Functional Grammar (Bresnan and Kaplan 1982, Dalrymple 2001) shows that the two predicates merge their arguments at the level of argument structure, which in turn can be mapped onto the functional representation along the lines of Bresnan and Zaenen (1990).
From a typological point of view, the combination of two verbal heads denoting manner of motion along a path in a monoclausal construction shows that Urdu/Hindi belongs to the group of equipollently-framed languages (Slobin 2004, 2005).

Related resources

A House United: Bridging the Script and Lexical Barrier between Hindi and Urdu

In Computational Linguistics, Hindi and Urdu are not viewed as a monolithic entity and have received separate attention with respect to their text processing. From part-of-speech tagging to machine translation, models are separately trained for both Hindi and Urdu despite the fact that they represent the same language. The reasons mainly are their divergent literary vocabularies and separate or...

Towards Identifying Hindi/Urdu Noun Templates in Support of a Large-Scale LFG Grammar

Complex predicates (CPs) are a highly productive predicational phenomenon in Hindi and Urdu and present a challenge for deep syntactic parsing. In a CP, the combination of a noun and a light verb expresses a single event. The combinatorial preferences of nouns for one (or more) light verbs are useful for predicting an instance of a CP. In this paper, we present a semi-automatic method to obtain noun g...

A First Approach Towards an Urdu WordNet

This paper reports on a first experiment in developing a lexical knowledge resource for Urdu on the basis of Hindi WordNet. Due to the structural similarity of Urdu and Hindi, we can focus on overcoming the differences between the script systems of the two languages by using transliterators. Various natural language processing tools, among them a computational semantics based on the Urdu ParGra...

Automatic Classification of Hindi Verbs in Syntactic Perspective

We report on a rule-based, knowledge-base-driven tool that automatically classifies Hindi verbs from a syntactic perspective. We also report on the development of the largest lexical resource for Hindi verbs, which records each verb's class based on valency and several syntactic diagnostic tests, as well as its morphological/inflectional type. We use this resource to develop the tool to automatically cl...

Developing English-Urdu Machine Translation Via Hindi

The paper presents a strategy for deriving English-to-Urdu translation using an English-to-Hindi MT system. The English-Hindi lexical database is used to collect all possible Hindi words and phrases. These are further augmented by including their morphological variations and attaching all possible postpositions. This list is used to provide a mapping from Hindi to Urdu. There may be change in gender...

Publication date: 2014