Word order in a grammarless language: A 'small-data' information-theoretic approach

Authors

  • Nicholas Lester
  • Fermín Moscoso del Prado Martín
Abstract

David Gil has argued that Riau Indonesian (Sumatra, Indonesia) has no syntax, or at least not much. This controversial analysis undermines all current models of grammar, especially those describing acquisition and on-line processing. To test the strength of this analysis, we computed the information gain holding between unigram and bigram models of regular and randomized samples of English and Riau Indonesian. English samples were included as a relatively syntax-heavy baseline. We then correlated information gain values with language (English vs. Riau Indonesian), text type (original vs. randomized), and their interaction within a linear mixed-effects regression. The results suggest (a) that English and Riau Indonesian have the same amount of bigram informativity and (b) that randomization eliminates this effect in both languages. These findings do not support Gil’s syntax-free analysis; rather, they point to some kind of productive constraints on Riau Indonesian word order.
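
The abstract does not spell out how the information gain was estimated, so the following is only a minimal sketch under one possible reading: information gain taken as the drop in per-word entropy when moving from a unigram model to a bigram model, H(W_i) - H(W_i | W_{i-1}), estimated from raw counts. The function names, the toy sentence, and the shuffling baseline are illustrative assumptions, not the authors' implementation.

```python
import math
import random
from collections import Counter


def unigram_entropy(tokens):
    """Plug-in (maximum-likelihood) entropy of the word distribution, in bits."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def conditional_bigram_entropy(tokens):
    """Plug-in estimate of H(W_i | W_{i-1}) from adjacent word pairs, in bits."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])  # how often each word occurs as a left context
    n = len(tokens) - 1              # number of bigram tokens
    return -sum(
        c / n * math.log2(c / contexts[w1]) for (w1, _), c in bigrams.items()
    )


def information_gain(tokens):
    """Assumed definition: H(W_i) - H(W_i | W_{i-1}).

    The unigram entropy is taken over the second word of every bigram so that
    both terms come from the same joint distribution; the difference is then
    the mutual information between adjacent words and cannot be negative.
    """
    return unigram_entropy(tokens[1:]) - conditional_bigram_entropy(tokens)


if __name__ == "__main__":
    # Toy sample (hypothetical data, not from the study).
    original = "the dog saw the cat and the cat saw the dog".split()

    # Randomized counterpart: same words, order destroyed.
    rng = random.Random(0)
    shuffled = original[:]
    rng.shuffle(shuffled)

    print("original:", information_gain(original))
    print("shuffled:", information_gain(shuffled))
```

With samples this small, count-based estimates are biased upward, so even a shuffled text can show a positive value. Comparing each original sample against randomized versions of the same sample, as the abstract describes, is one way to gauge how much of the estimate reflects word order rather than sample size.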

Similar resources

Iranian Advanced EFL Learners’ Awareness and the Use of Marked Word Order: Discourse-pragmatically Motivated Variations

The present investigation was designed to study the production and comprehension of specific means for information highlighted by advanced Iranian learners of English as a Foreign Language. The study focused on the discourse-pragmatically motivated variations of the basic word order such as inversion, pre-posing, it- and Wh-clefts. After taking the Nelson test, a homogeneous group was settled. ...

Relational Indexing Using a Grammarless Parser

This article proposes an alternate view on natural language parsing. Instead of looking for some predefined (phrase) structure, it takes inter-word relations as its starting point. The reason for this is twofold: firstly, it circumvents traditional parsing and linguistic problems, and secondly, it offers a possibility to extract information specifically needed by IR applications. The close relationship ...

Welfare Impacts of Imposing a Tariff on Rice in Iran vs an Export Tax in Thailand: A Game Theoretic Approach

In this study, the social welfare impacts of the interaction of Iranian rice import policies and Thai export policies are analyzed using a game theoretic approach in conjunction with econometric supply and demand models. The joint impacts of increasing the world price of rice, resulting from the export policies in Thailand along with changes in tariff rates in Iran, on social welfare are analyz...

An information theoretic approach for using word cluster information in natural language call routing

In this paper, an information theoretic approach for using word clusters in natural language call routing (NLCR) is proposed. This approach utilizes an automatic word class clustering algorithm to generate word classes from the word based training corpus. In our approach, the information gain (IG) based term selection is used to combine both word term and word class information in NLCR. A joint...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Publication date: 2015