Using Lexical and Dependency Features to Disambiguate Discourse Connectives in Hindi

Authors

  • Rohit Jain
  • Himanshu Sharma
  • Dipti Misra Sharma
Abstract

Discourse parsing is a challenging task in NLP and plays a crucial role in discourse analysis. To enable discourse analysis for Hindi, the Hindi Discourse Relations Bank was created on a subset of the Hindi TreeBank. The benefits of a discourse analyzer in the automated discourse analysis, question summarization and question answering domains have motivated us to begin work on a discourse analyzer for Hindi. In this paper, we focus on discourse connective identification for Hindi. We explore the available syntactic features for this task. We also explore the use of the dependency tree parses present in the Hindi TreeBank and study their impact on the performance of the system. We report that the novel dependency features we introduce have a higher impact on precision than the syntactic features previously used for this task. In addition, we report a high accuracy of 96% for this task.

Similar resources

Prosody cues for classification of the discourse particle "hã" in Hindi

In Hindi, affirmative particle "hã" carries out a variety of discourse functions. Preliminary investigation has shown that though it is difficult to disambiguate these different functions from prosody alone, there seems to be a distinct prosodic pattern associated with each of these. In this paper, we present a corpus study of spoken utterances of the Hindi word "hã". We identify these prosodic...

Towards an Annotated Corpus of Discourse Relations in Hindi

We describe our initial efforts towards developing a large-scale corpus of Hindi texts annotated with discourse relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we present a preliminary analysis of discourse connectives in a small corpus. We describe how discourse connectives are represented in the sentence-level dependency annotation in Hindi, and disc...

Using Syntax to Disambiguate Explicit Discourse Connectives in Text

Discourse connectives are words or phrases such as once, since, and on the contrary that explicitly signal the presence of a discourse relation. There are two types of ambiguity that need to be resolved during discourse processing. First, a word can be ambiguous between discourse or non-discourse usage. For example, once can be either a temporal discourse connective or simply a word meaning “...

Acquiring a Disambiguation Model For Discourse Connectives

Discourse connectives can show sense ambiguities, in that they can signal more than one possible rhetorical relation. The aim of this study is to discover how to disambiguate such discourse connectives using a statistical model. Six discourse connectives (after, as soon as, before, once, since and while) which show ambiguities in the SDRT (Segmented Discourse Representation Theory (Asher & Lascari...

Disambiguating Explicit Discourse Connectives without Oracles

Deciding whether a word serves a discourse function in context is a prerequisite for discourse processing, and the performance of this subtask bounds performance on subsequent tasks. Pitler and Nenkova (2009) report 96.29% accuracy (F1 94.19%) relying on features extracted from gold-standard parse trees. This figure is an average over several connectives, some of which are extremely hard to cla...

Publication date: 2016