Discovering and Summarizing Email Conversations

نویسنده

  • Xiaodong Zhou
چکیده

With the ever increasing popularity of emails, it is very common nowadays that people discuss specific issues, events or tasks among a group of people by emails. Those discussions can be viewed as conversations via emails and are valuable for the user as a personal information repository. For instance, in 10 minutes before a meeting, a user may want to quickly go through a previous discussion via emails that is going to be discussed in the meeting soon. In this case, rather than reading each individual email one by one, it is preferable to read a concise summary of the previous discussion with major information summarized. In this thesis, we study the problem of discovering and summarizing email conversations. We believe that our work can greatly support users with their email folders. However, the characteristics of email conversations, e.g., lack of synchronization, conversational structure and informal writing style, make this task particularly challenging. In this thesis, we tackle this task by considering the following aspects: discovering emails in one conversation, capturing the conversation structure and summarizing the email conversation. We first study how to discover all emails belonging to one conversation. Specifically, we study the hidden email problem, which is important for email summarization and other applications but has not been studied before. We propose a framework to discover and regenerate hidden emails. The empirical evaluation shows that this framework is accurate and scalable to large folders. Second, we build a fragment quotation graph to capture email conversations. The hidden emails belonging to each conversation are also included into the corresponding graph. Based on the quotation graph, we develop a novel email conversation summarizer, ClueWordSummarizer. The comparison with a state-of-the-art email summarizer as well as with a popular multi-document summarizer shows that ClueWordSummarizer obtains a higher accuracy in most cases. Furthermore, to address the characteristics of email conversations, we study several ways to improve the ClueWordSummarizer by considering more lexical features. The experiments show that many of those improvements can significantly increase the accuracy especially the subjective words and phrases.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Summarizing Spoken and Written Conversations

In this paper we describe research on summarizing conversations in the meetings and emails domains. We introduce a conversation summarization system that works in multiple domains utilizing general conversational features, and compare our results with domain-dependent systems for meeting and email data. We find that by treating meetings and emails as conversations with general conversational fe...

متن کامل

Summarizing Emails with Conversational Cohesion and Subjectivity

In this paper, we study the problem of summarizing email conversations. We first build a sentence quotation graph that captures the conversation structure among emails. We adopt three cohesion measures: clue words, semantic similarity and cosine similarity as the weight of the edges. Second, we use two graph-based summarization approaches, Generalized ClueWordSummarizer and PageRank, to extract...

متن کامل

Extractive Summarization and Dialogue Act Modeling on Email Threads: An Integrated Probabilistic Approach

In this paper, we present a novel supervised approach to the problem of summarizing email conversations and modeling dialogue acts. We assume that there is a relationship between dialogue acts and important sentences. Based on this assumption, we introduce a sequential graphical model approach which simultaneously summarizes email conversation and models dialogue acts. We compare our model with...

متن کامل

An Ontology-based Visual Interface for Browsing and Summarizing Conversations

In this paper we present a visual interactive interface to create focused summaries of human conversations via mapping to the concepts within an ontology. The ontology includes nodes for the conversation participants, for Dialog Act (DA) properties such as decision, action-item or subjectivity, as well as for entities mentioned in the conversation. The classifiers used to annotate conversation ...

متن کامل

Summarizing Multi-Party Argumentative Conversations in Reader Comment on News

Existing approaches to summarizing multi-party argumentative conversations in reader comment are extractive and fail to capture the argumentative nature of these conversations. Work on argument mining proposes schemes for identifying argument elements and relations in text but has not yet addressed how summaries might be generated from a global analysis of a conversation based on these schemes....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008