Representing and Accessing Multilevel Linguistic Annotation using the MEANING Format

Authors

  • Emanuele Pianta
  • Luisa Bentivogli
  • Christian Girardi
  • Bernardo Magnini
Abstract

We present an XML annotation format, the MEANING Annotation Format (MAF), specifically designed to represent and integrate different levels of linguistic annotation, and a tool, the MEANING Browser, that provides flexible access to those annotations. We describe our experience in integrating linguistic annotations coming from different sources, and the solutions we adopted to implement efficient access to corpora annotated with the MEANING Format.

Similar resources

Discontinuous Constituents: a Problematic Case for Parallel Corpora Annotation and Querying

In this paper, we discuss some linguistic phenomena that pose potential problems for multilevel linguistic annotation of parallel corpora in general and specifically for data encoding with state-of-the-art multilevel corpus querying tools such as CQP. We describe the strategy we use for integrating the standard hierarchical XML representation used to annotate such phenomena in our aligned bilingual...

AnCoraPipe: A tool for multilevel annotation

AnCoraPipe is a corpus annotation tool which allows different linguistic levels to be annotated simultaneously and efficiently, since it uses a single format for all stages. In this way, the required annotation time is reduced and the integration of the work of all annotators is made easier.

Accessing Heterogeneous Linguistic Data — Generic XML-based Representation and Flexible Visualization

Annotation of linguistic data increasingly focuses on information beyond the (morpho-)syntactic level. Moreover, annotated data of less-studied languages is growing in importance. To maximally profit from this data, straightforward and user-friendly access has to be provided. In this paper, we describe a linguistic database that is accessed via a web browser and offers flexible visualization of...

GrAF: A Graph-based Format for Linguistic Annotations

In this paper we describe the Graph Annotation Format (GrAF) and show how it is used to represent not only independent linguistic annotations, but also sets of merged annotations as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF and show how the annotations can then be merged, analyzed, and visualized ...

The NITE Object Model Library for Handling Structured Linguistic Annotation on Multimodal Data Sets

The NITE Object Model Library is an implemented set of routines for loading, accessing, manipulating, and serializing linguistic data. It is similar in spirit to the data handling provided by the Annotation Graph Toolkit, but is aimed at data that is heavily cross-annotated with structured information, and thus chooses higher expressivity at the cost of processing speed. We describe our open-so...

Publication date: 2006