E-mail signature block analysis

Authors

  • Hao Chen
  • Jianying Hu
  • Richard Sproat
Abstract

The signature block is a common structured component of e-mail messages. Accurate identification and analysis of signature blocks is important in many multimedia messaging and information retrieval applications, such as e-mail text-to-speech rendering. It is also a very challenging task, because signature blocks often appear in complex two-dimensional layouts that are guided only by loose conventions. Traditional text analysis methods designed for sequential text cannot handle two-dimensional structures, while the highly unconstrained nature of signature blocks makes two-dimensional grammars very difficult to apply. In this paper we describe an algorithm for signature block analysis that combines two-dimensional structural segmentation with one-dimensional grammatical constraints. The information obtained from geometrical and linguistic analysis is integrated in the form of weighted finite-state transducers (WFSTs), and the final solution is the optimal interpretation under both constraints.
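The idea of picking "the optimal interpretation under both constraints" can be illustrated with a small sketch. This is not the authors' implementation and does not use a WFST toolkit. It assumes, purely for illustration, that a segmentation step has already produced a sequence of text segments, that each segment carries a cost for every candidate label, and that a simple grammar assigns a cost to each pair of adjacent labels. Under those assumptions, finding the cheapest labeling is a shortest-path search, which is the same kind of computation a shortest-path query over composed weighted transducers performs. The function name `best_labeling`, the label set, and every cost value are hypothetical.

```python
from typing import Dict, List, Tuple

INF = float("inf")


def best_labeling(
    emission: List[Dict[str, float]],
    transition: Dict[Tuple[str, str], float],
) -> Tuple[float, List[str]]:
    """Return (total cost, labels) of the cheapest label sequence.

    emission[i][label] : assumed cost of giving segment i that label
                         (stands in for the geometric/lexical evidence)
    transition[(a, b)] : assumed cost of label b directly following label a
                         (stands in for the one-dimensional grammar);
                         pairs that are missing are treated as forbidden
    """
    # best[label] = (cost of the cheapest path ending in label, that path)
    best = {lab: (cost, [lab]) for lab, cost in emission[0].items()}
    for costs in emission[1:]:
        nxt = {}
        for lab, emit in costs.items():
            candidates = [
                (prev_cost + transition.get((prev, lab), INF) + emit, path + [lab])
                for prev, (prev_cost, path) in best.items()
            ]
            nxt[lab] = min(candidates, key=lambda c: c[0])
        best = nxt
    return min(best.values(), key=lambda c: c[0])


# Hypothetical three-segment signature block: a name, an organization, a phone.
emission = [
    {"name": 0.2, "org": 1.5},
    {"name": 1.0, "org": 0.4},
    {"phone": 0.1, "org": 2.0},
]
transition = {
    ("name", "name"): 1.0,
    ("name", "org"): 0.1,
    ("name", "phone"): 0.6,
    ("org", "name"): 2.0,
    ("org", "org"): 0.5,
    ("org", "phone"): 0.2,
}

cost, labels = best_labeling(emission, transition)
print(cost, labels)
```

In this toy setting the two cost tables play the roles of the geometric and linguistic evidence, and the search selects the single labeling that is cheapest under their sum. A real system would have to handle many competing segmentations rather than one fixed sequence.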





Publication date: 1998