Verb Sense and Verb Subcategorization Probabilities

Authors

  • Douglas William Roland
  • Lise Menn
  • Daniel S. Jurafsky
Abstract

Roland, Douglas William (Ph.D., Linguistics). Verb Sense and Verb Subcategorization Probabilities. Thesis directed by Associate Professor Daniel S. Jurafsky.

This dissertation investigates a variety of problems in psycholinguistics and computational linguistics caused by the differences in verb subcategorization probabilities found between various corpora and experimental data sets. For psycholinguistics, these problems include the practical problem of which frequencies to use for norming psychological experiments, as well as the more theoretical issue of which frequencies are represented in the mental lexicon and how those frequencies are learned. In computational linguistics, these problems include the decreases in the accuracy of probabilistic applications such as parsers when they are used on corpora other than the one on which they were trained.

Evidence is presented showing that different senses of verbs and their corresponding differences in subcategorization, as well as inherent differences between the production of sentences in psychological norming protocols and language use in context, are important causes of the subcategorization frequency differences found between corpora. This suggests that verb subcategorization probabilities should be based on individual senses of verbs rather than the whole verb lexeme, and that “test tube” sentences are not the same as “wild” sentences. Hence, the influences of experimental design on verb subcategorization probabilities should be given careful consideration.

This dissertation will demonstrate a model of how the relationship between verb sense and verb subcategorization can be employed to predict verb subcategorization based on the semantic context preceding the verb in corpus data. The predictions made by the model are shown to be the same as predictions made by human subjects given the same contexts.
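
The contrast between probabilities for the whole verb lexeme and probabilities for individual senses can be illustrated with a minimal Python sketch. The verb, sense labels, frame labels, and counts below are invented for illustration and are not taken from the dissertation; the function name frame_probabilities is likewise hypothetical, and the estimate is a plain relative frequency rather than the model the dissertation describes.

```python
from collections import Counter, defaultdict

# Hypothetical toy observations: (verb, sense, subcategorization frame).
# Invented for illustration only; not data from the dissertation.
observations = [
    ("charge", "accuse", "NP PP"),
    ("charge", "accuse", "NP PP"),
    ("charge", "accuse", "NP"),
    ("charge", "bill", "NP NP"),
    ("charge", "bill", "NP"),
    ("charge", "bill", "NP"),
]


def frame_probabilities(rows, key):
    """Relative-frequency estimate of P(frame | key(verb, sense))."""
    counts = defaultdict(Counter)
    for verb, sense, frame in rows:
        counts[key(verb, sense)][frame] += 1
    return {
        k: {frame: n / sum(c.values()) for frame, n in c.items()}
        for k, c in counts.items()
    }


# Condition on the whole lexeme: P(frame | verb).
by_lexeme = frame_probabilities(observations, lambda verb, sense: verb)

# Condition on the individual sense: P(frame | verb, sense).
by_sense = frame_probabilities(observations, lambda verb, sense: (verb, sense))

for condition, dist in list(by_lexeme.items()) + list(by_sense.items()):
    print(condition, dist)
```

In this toy data the lexeme-level distribution blends two senses whose frame preferences differ, so a corpus with a different mix of senses would yield different lexeme-level probabilities even if each sense behaved identically.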

Similar resources

Verb Subcategorization Frequency Differences Between Business-News And Balanced Corpora: The Role Of Verb Sense

We explore the differences in verb subcategorization frequencies across several corpora in an effort to obtain stable cross corpus subcategorization probabilities for use in norming psychological experiments. For the 64 single sense verbs we looked at, subcategorization preferences were remarkably stable between British and American corpora, and between balanced corpora and financial news corpo...

How Verb Subcategorization Frequencies Are Affected By Corpus Choice

The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language processing. But these probabilities are computed in very different ways by the two sets of researchers. Computational linguists compute verb subcategorization probabilities from large corpora while psycholinguists compute ...

Verb Sense And Subcategorization: Using Joint Inference To Improve Performance On Complementary Tasks

We propose a general model for joint inference in correlated natural language processing tasks when fully annotated training data is not available, and apply this model to the dual tasks of word sense disambiguation and verb subcategorization frame determination. The model uses the EM algorithm to simultaneously complete partially annotated training sets and learn a generative probabilistic mod...

Using Verb Subcategorization for Word Sense Disambiguation

We develop a model for predicting verb sense from subcategorization information and integrate it into SSI-Dijkstra, a wide-coverage knowledge-based WSD algorithm. Adding syntactic knowledge in this way should correct the current poor performance of WSD systems on verbs. This paper also presents, for the first time, an evaluation of SSI-Dijkstra on a standard data set which enables a comparison ...

Publication date: 1998