Gossip is More than Just Story Telling: Topic Modelling and Quantitative Analysis on a Spontaneous Speech Corpus

Authors

  • Boróka Pápay
  • Bálint Kubik
  • Júlia Galántai
Abstract

Gossip is one of the most widespread human activities and serves multiple functions, such as enhancing human cooperation, establishing social order, sharing information, reinforcing norms, and reducing stress. Gossip has been analyzed mostly by qualitative or survey methods. In this paper, we describe a quantitative approach that uses LDA topic modeling and quantitative analysis to identify gossip in a large corpus of spontaneous talk. We aim to identify gossip and its characteristics in order to analyze its topics, the verbal and non-verbal emotions expressed during gossiping, and other non-textual data such as the number of speakers and the number of persons present during gossiping events. We also analyze the topics to distinguish gossiping from storytelling by separating gossip and non-gossip texts in our large spontaneous speech corpus.

Similar resources

Investigating the Effectiveness of Cued Speech on the Language Skills of Topic Maintenance, Basic Information, and Sequence of Story Events in Prelingually Hard-of-Hearing Students with Late Cochlear Implantation

Objective: Cochlear implants have a very positive impact on the expressive language growth of children with severely impaired hearing, and the effectiveness of Cued Speech has been studied in several investigations. The purpose of this study was to assess the effectiveness of using Cued Speech on topic maintenance, basic information and sequence events of the story in the late cochlear implanted prelingua...

The Edinburgh Speech Production Facility DoubleTalk Corpus

The DoubleTalk articulatory corpus was collected at the Edinburgh Speech Production Facility (ESPF) using two synchronized Carstens AG500 electromagnetic articulometers. The first release of the corpus comprises orthographic transcriptions aligned at phrasal level to EMA and audio data for each of 6 mixed-dialect speaker pairs. It is available from the ESPF online archive. A variety of tasks we...

Phonological Mean Length of Utterance in 48-60-Month-old Persian-speaking Children with Isfahani Accent: Comparison of Story Generation and Conversation Samples

Objective: Phonological Mean Length of Utterance (PMLU), a quantitative measure for assessment of phonological skills, has been considered in developmental studies as a diagnostic and clinical criterion in phonological development. Moreover, it is an indicator of the efficacy of the intervention. The PMLU is a word-level measure that can be calculated on the child’s transcribed speech sampl...

Just-in-time language modelling

Traditional approaches to language modelling have relied on a fixed corpus of text to inform the parameters of a probability distribution over word sequences. Increasing the corpus size often leads to better-performing language models, but no matter how large, the corpus is a static entity, unable to reflect information about events which postdate it. In these pages we introduce an online parad...

Quantitative prosodic analysis of spontaneous speech

This paper presents a study analyzing the prosody of spontaneous speech exemplified by the OSU Buckeye corpus of American English employing the Fujisaki model and syllable duration analysis. Results show, inter alia, that model parameters are much more strongly influenced by the higher level discourse structure than by their syntactic function. The study shows that modeling strategies can be em...

Publication date: 2018