Characterizing linguistic structure with mutual information.

Authors

  • Emmanuel M Pothos
  • Patrick Juola
Abstract

We explore mutual information (MI) as a means of characterizing linguistic statistical structure. The MI between two linguistic tokens x and y is the degree to which seeing x helps us anticipate the occurrence of y. We computed MI between words in 595 samples of written text in 25 languages. Our analyses indicate that MI dependencies do not extend beyond a range of five words. Moreover, the similarity between MI profiles of different languages was used to cluster the languages. These results are discussed in terms of a putative link between short-term memory and linguistic structure and the further utility of MI in terms of characterizing the latter.
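To make the quantity concrete, here is a minimal sketch of how MI between words a fixed distance apart might be estimated from a token sequence. It is not the authors' procedure: the plug-in (relative-frequency) estimator, the function names `mutual_information` and `mi_profile`, and the default maximum distance of five are assumptions made for illustration. For words $x$ and $y$ separated by $d$ positions, it computes $I(X;Y) = \sum_{x,y} p(x,y)\,\log_2 \frac{p(x,y)}{p(x)\,p(y)}$.

```python
from collections import Counter
from math import log2


def mutual_information(tokens, distance=1):
    """Plug-in estimate (in bits) of the MI between tokens `distance` apart.

    Illustrative assumption: probabilities are relative frequencies of the
    (tokens[i], tokens[i + distance]) pairs found in the sequence.
    """
    pairs = list(zip(tokens, tokens[distance:]))
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)               # counts of (x, y) pairs
    left = Counter(x for x, _ in pairs)  # counts of x in first position
    right = Counter(y for _, y in pairs) # counts of y in second position
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (left[x] * right[y])
        mi += (c / n) * log2(c * n / (left[x] * right[y]))
    return mi


def mi_profile(tokens, max_distance=5):
    """MI at distances 1..max_distance (hypothetical helper)."""
    return [mutual_information(tokens, d) for d in range(1, max_distance + 1)]
```

A call such as `mi_profile(text.split())` would give one MI value per distance; profiles of this kind could then be compared across text samples. Note that a plug-in estimate tends to overstate MI on small samples, so the values are best read comparatively rather than as exact quantities.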


Similar resources

On the practical implication of mutual information for statistical decisionmaking

A theorem characterizing fractional Brownian motion by the covariance structure of its wavelet transform is established. Index Terms: Wavelet transform, fractional Brownian motion.


Correctness of Mutual Exclusion Algorithms

One way of characterizing non-token-based mutual exclusion (m.e.) algorithms is in terms of the underlying information structure. The information structure for a given m.e. algorithm specifies which particular processes interact with which other processes before entering their critical sections, and which processes they interact with when leaving their critical sections. By focusing on the info...


Mutual Visibility and Information Structure Enhance Synchrony between Speech and Co-Speech Movements

Our study aims at gaining a better understanding of how speech-gesture synchronization is affected by the factors (1) mutual visibility and (2) linguistic information structure. To this end, we analyzed spontaneous dyadic interactions where interlocutors are engaged in a verbalized version of the game TicTacToe, both with and without mutual visibility. The setting allows for a straightforward d...


A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...


A Hybrid Classifier for Characterizing Motor Unit Action Potentials in Diagnosing Neuromuscular Disorders

Background: The time and frequency features of motor unit action potentials (MUAPs) extracted from the electromyographic (EMG) signal provide discriminative information for the diagnosis and treatment of neuromuscular disorders. However, the results of conventional automatic diagnosis methods using MUAP features are not yet convincing. Objective: The main goal in designing a MUAP characterization system ...



Journal title:
  • British Journal of Psychology

Volume 98 Pt 2  Issue 

Pages  -

Publication date 2007