Classical Art Semantics Information Extraction: CASIE Pilot Project

Authors

Andreas Vlachidis

Douglas Tudhope

Abstract

The paper discusses the application of Natural Language Processing (NLP) techniques to the semantic annotation of classical art texts, using rule-based Information Extraction (IE) combined with ontological and domain vocabulary input. CASIE (Classical Art Semantics Information Extraction) was a pilot collaborative project between the Hypermedia Research Unit (University of South Wales) and the Beazley Archive (Oxford University). It aimed to automatically extract information about cultural objects from classical art scholarly texts and to represent this information in terms of the ISO metadata standard for cultural heritage, the International Council of Museums' CIDOC Conceptual Reference Model (CRM). In total, 12 documents (fascicules, i.e. high-quality catalogues) were processed, originating from the Corpus Vasorum Antiquorum (CVA) collection, which contains over 350 high-quality catalogues of mostly ancient Greek painted pottery, illustrating more than 100,000 vases. The extracted information was expressed as interoperable RDF graphs consistent with the CLAROS project format. CIDOC-CRM is central to enabling semantic interoperability across the range of datasets that contribute to CLAROS. The CASIE pilot enabled a complementary exploitation of terminological and ontological resources via rule-based information extraction techniques, delivering semantic annotation with respect to the CRM in the broader field of digital humanities.
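As a rough illustration of the general idea described in the abstract (vocabulary-driven, rule-based matching whose output is typed against the CRM and serialised as RDF), the following minimal sketch may help. It is not the CASIE pipeline. The vocabulary entries, the `annotate` and `to_ntriples` helpers, the `example.org` base URI, and the choice of CRM classes and namespace are all assumptions made for illustration only.

```python
import re

# Hypothetical mini-vocabulary (NOT from the paper): surface term -> CRM class name.
VOCAB = {
    "amphora": "E22_Man-Made_Object",
    "kylix": "E22_Man-Made_Object",
    "terracotta": "E57_Material",
}

# Assumed namespaces; the actual CLAROS/CASIE URIs may differ.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# One simple "rule": match any vocabulary term as a whole word, case-insensitively.
TERM_RULE = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in VOCAB) + r")\b",
    re.IGNORECASE,
)


def annotate(text):
    """Return (term, start, end, crm_class) for each vocabulary hit in text."""
    hits = []
    for match in TERM_RULE.finditer(text):
        term = match.group(1).lower()
        hits.append((term, match.start(), match.end(), VOCAB[term]))
    return hits


def to_ntriples(annotations, base="http://example.org/casie/"):
    """Serialise annotations as N-Triples lines typed with CRM classes."""
    lines = []
    for index, (term, _start, _end, crm_class) in enumerate(annotations):
        subject = f"<{base}mention/{index}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{CRM}{crm_class}> .")
        lines.append(f'{subject} <{RDFS_LABEL}> "{term}" .')
    return lines


if __name__ == "__main__":
    sample = "Red-figure amphora of terracotta."
    for line in to_ntriples(annotate(sample)):
        print(line)
```

A real system would rely on richer linguistic rules and a full domain thesaurus rather than a single regular expression, and would link extracted entities to each other through CRM properties rather than emitting isolated typed mentions.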

Similar resources

Entity Extraction via Ensemble Semantics

Combining information extraction systems yields significantly higher quality resources than each system in isolation. In this paper, we generalize such a mixing of sources and features in a framework called Ensemble Semantics. We show very large gains in entity extraction by combining state-of-the-art distributional and pattern-based systems with a large set of features from a webcrawl, query lo...

WikiPhiloSofia and PanAnthropon: Extraction and Visualization of Facts, Relations, and Networks for a Digital Humanities Knowledge Portal

Wikipedia, with its unique structural features and a vast amount of user-generated content, is being increasingly recognized as a valuable knowledge source for various applications. Nevertheless, the mode of information search and retrieval on Wikipedia remains that of conventional keyword-based search and retrieval. The objective of my (soon-to-be-proposed) thesis project, entitled PanAnthropo...

Reverse Engineering of Network Software Binary Codes for Identification of Syntax and Semantics of Protocol Messages

Reverse engineering of network applications, especially from the security point of view, is of high importance and interest. Many network applications use proprietary protocols whose specifications are not publicly available. Reverse engineering of such applications could provide us with vital information to understand their embedded unknown protocols. This could facilitate many tasks including d...

BOEMIE: Bootstrapping Ontology Evolution with Multimedia Information Extraction

The BOEMIE project proposes a bootstrapping approach to knowledge acquisition, which uses multimedia ontologies for fused extraction of semantics from multiple modalities, and feeds back the extracted information, aiming to automate the ontology evolution process.

Deliverable D2.2: Semantics

This deliverable presents the state of the art on “Semantics Extraction from visual content”. We first give an overview of OCR methodologies dealing with visual content, namely image and video, that provide textual information. Although this kind of analysis does not directly provide any semantics, it is a critical step involving visual content, which feeds BOEMIE’s module d...

Publication date: 2013
