Measuring dialect differences

Author

  • Wilbert Heeringa
Abstract

We measure varietal differences in general, and differences with respect to standard languages in particular (dialectality, in Herrgen/Schmidt's sense) in order to systematize observations about dialect differences, to make sense of exceptions, and to enable measurements based on randomly selected material, thus obviating issues of potential bias. Finally, measurements allow the characterization of abstract relations among language varieties. We illustrate some issues with simple techniques for categorical data introduced by Séguy and refined by Goebl, viz., issues concerning frequency, irrelevant variation, and competing forms. We proceed to measuring pronunciation differences, focusing on differences in the pronunciation of the same words in different varieties. Caution is needed to isolate pronunciation differences from differences in inflectional morphology, sandhi, and intonation. We characterize the difference between sound segments and develop a measure of the difference between the sequences of those segments in words, including insertions, deletions, and swaps (epenthesis, elision and metathesis). Automating measurement techniques exposes the issue of validation, which lay largely unexamined in earlier dialectology. We propose to validate measurements based on the degree to which they correlate with dialect speakers' judgments of difference, justified by the presumed function of linguistic variation, that of signaling provenance.
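As a rough illustration of the sequence measure the abstract describes, the sketch below computes an edit distance over sound segments that counts insertions, deletions, substitutions, and swaps of adjacent segments. It is a minimal sketch under stated assumptions, not the paper's implementation: the names `segment_distance` and `word_distance` are hypothetical, every operation is given a unit cost, and the segment comparison is a plain identity check, whereas the abstract indicates that the paper characterizes graded differences between segments. The transcriptions are simplified examples chosen only to show each operation.

```python
from typing import Callable, Sequence


def segment_distance(a: str, b: str) -> float:
    """Assumed placeholder: 0 for identical segments, 1 otherwise."""
    return 0.0 if a == b else 1.0


def word_distance(
    src: Sequence[str],
    tgt: Sequence[str],
    seg_cost: Callable[[str, str], float] = segment_distance,
) -> float:
    """Edit distance over segments with adjacent swaps (unit costs assumed)."""
    n, m = len(src), len(tgt)
    # d[i][j] = cost of turning the first i source segments
    # into the first j target segments.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete every source segment
    for j in range(1, m + 1):
        d[0][j] = float(j)  # insert every target segment
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(
                d[i - 1][j] + 1.0,  # deletion (elision)
                d[i][j - 1] + 1.0,  # insertion (epenthesis)
                d[i - 1][j - 1] + seg_cost(src[i - 1], tgt[j - 1]),  # substitution
            )
            # Swap of two adjacent segments (metathesis).
            if (
                i > 1
                and j > 1
                and src[i - 1] == tgt[j - 2]
                and src[i - 2] == tgt[j - 1]
            ):
                best = min(best, d[i - 2][j - 2] + 1.0)
            d[i][j] = best
    return d[n][m]


# Simplified, illustrative transcriptions (one string per segment).
epenthesis = word_distance(["m", "ɛ", "l", "k"], ["m", "ɛ", "l", "ə", "k"])
elision = word_distance(["p", "ɔ", "s", "t"], ["p", "ɔ", "s"])
metathesis = word_distance(["w", "ɛ", "s", "p"], ["w", "ɛ", "p", "s"])
```

Passing a different `seg_cost` function is one way to replace the identity check with a graded segment difference; the unit costs for insertion, deletion, and swap would need the same kind of tuning before the numbers could be compared with speakers' judgments.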


Similar resources

A Quantitative Analysis of Bulgarian Dialect Pronunciation

We apply a computational measure of pronunciation difference to a database of 36 word pronunciations from 490 sites throughout Stoykov's Bulgarian Dialect Atlases. The result is a comprehensive view of the aggregate pronunciation differences among the 490 sites. This study aims to contribute therefore to Bulgarian dialectology, as well as to the development and testing of the computational techni...


Dialect loss and dialect vitality in Flanders

Dialect loss is a relatively new but by now quite general phenomenon in Flanders (i.e., Dutch-speaking Belgium). Although the processes of dialect change and dialect loss have proceeded with great regional differences in speed and intensity in the past decades, there is a general tendency toward replacing primary dialect features of a relatively local scope by secondary dialect features that hav...


The use of shibboleth words for automatically classifying speakers by dialect

Real-world applications using speech recognition must perform well over a range of dialects. Differences in dialect between the speakers in the training database and the target users often lead to degraded recognition performance. For the BBN Hark Hidden Markov Model (HMM) based system, we have already developed a reasonably effective technique [1] for dealing with multiple US dialects. The solu...


Dialects in western Europe: a balanced picture of language death, innovation, and change

This thematic issue of the International Journal of the Sociology of Language addresses the question of whether dialects in western Europe are dying. Can dialects still be a medium of communication in our industrialized and increasingly urbanized societies? Is there a place for dialects in a globalizing world? And what kind of dialect do we speak right now and shall we be speaking in the near f...


Estimation of protein-production levels for ORFs found by E. coli and yeast genome projects, basing on levels of "optimal codon" usage, in connection with feasibility of their protein coding ability and with assignment to foreign-type genes

function; "codon dialect" found for individual unicellular organisms [1]. Taxonomically related organisms have similar dialects but those distantly related have distinct ones. For example, characteristics of E. coli codon-choice (E. coli dialect) differ considerably from those of yeast S. cerevisiae, but are similar to those of Salmonella. By measuring cellular tRNA contents of these three speci...



Publication date: 2008