Clustering and semantics preservation in cultural heritage information spaces

Authors

  • Javier Pereira
  • Felipe Schmidt
  • Pedro Contreras
  • Fionn Murtagh
  • Hernán Astudillo

Abstract

In this paper, we analyze whether the original semantic similarity among objects is preserved when dimensionality reduction is applied to the original data source and clustering is then performed on the reduced data. An experiment is designed to test the Baire, or longest common prefix, ultrametric and K-Means when random projection is applied beforehand. A data matrix extracted from a cultural heritage database was prepared for the experiment. Because the random projection produces a vector with components in the interval [0, 1], clusters are obtained at different levels of digit precision. Next, the mean semantic similarity of the clusters is calculated using a modified version of the Jaccard index. Our findings show that semantics is difficult to preserve with these methods. However, a Student's t-test on mean similarity indicates that Baire clusters are semantically better than K-Means clusters as the digit precision increases, at an increasing cost in orphan objects. Despite this cost, it is argued that the ultrametric technique provides an efficient process for detecting semantic homogeneity in the original data space.
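
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single-axis projection with rescaling, and the toy data are assumptions made for illustration, and the plain Jaccard index stands in for the paper's modified version, which the abstract does not specify. K-Means is omitted to keep the sketch dependency-free.

```python
import random
from collections import defaultdict
from itertools import combinations


def random_projection(rows, seed=0):
    """Project each binary row onto one random axis, rescaled into [0, 1).

    Assumption: a single random axis and max-based rescaling; the paper's
    exact projection is not given in the abstract.
    """
    rng = random.Random(seed)
    axis = [rng.random() for _ in range(len(rows[0]))]
    raw = [sum(x * w for x, w in zip(row, axis)) for row in rows]
    top = max(raw) or 1.0
    # Divide by slightly more than the maximum so every value stays below 1.
    return [v / (top * (1 + 1e-9)) for v in raw]


def baire_clusters(values, precision):
    """Group indices whose values share the first `precision` decimal digits.

    Sharing a longer common prefix means a smaller Baire distance, so a
    higher precision yields finer clusters.
    """
    buckets = defaultdict(list)
    for i, v in enumerate(values):
        buckets[int(v * 10 ** precision)].append(i)
    return list(buckets.values())


def jaccard(a, b):
    """Plain Jaccard index of two sets (stand-in for the modified index)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_similarity(cluster, rows):
    """Mean pairwise Jaccard similarity of a cluster; None for an orphan."""
    if len(cluster) < 2:
        return None
    attrs = {i: {j for j, x in enumerate(rows[i]) if x} for i in cluster}
    pairs = list(combinations(cluster, 2))
    return sum(jaccard(attrs[i], attrs[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical object-by-attribute matrix (1 = object has the attribute).
    data = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
    projected = random_projection(data)
    for precision in (1, 2, 3):
        clusters = baire_clusters(projected, precision)
        orphans = sum(1 for c in clusters if len(c) == 1)
        scores = [mean_similarity(c, data) for c in clusters]
        print(precision, clusters, orphans, scores)
```

Running the loop at increasing precision shows the trade-off the abstract reports in general terms: finer prefixes split the data into smaller groups, so the number of single-object (orphan) clusters tends to rise.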

Similar resources

Geomatics and Architectural Heritage: a Multi-layer Interactive Map of Tuscia-Italy

The main aims of this research are the design and implementation of a multilayered and interactive geomatic map of the cultural heritage of Tuscia, one of the richest and most complex cultural areas of Italy, thanks to the presence of different civilizations, from the Etruscans and Romans to the Middle Ages. Its cultural heritage is very rich, valuable and above all diversified because including tan...

Personalizing Access to Cultural Heritage Collections using Pathways

This paper discusses mechanisms for personalizing access to cultural heritage collections and suggests that paths or trails are a flexible and powerful model for this and could link with existing models of cognitive information behaviour. We also describe a European project called PATHS (Personalized Access To cultural Heritage Spaces) that aims to support information exploration and discovery ...

Management of Cultural Heritage: Bologna Gates

A growing demand is felt today for realistic 3D models enabling the cognition and popularization of historical-artistic heritage. Evaluation and preservation of Cultural Heritage is inextricably connected with the innovative processes of gaining, managing, and using knowledge. The development and perfecting of techniques for acquiring and elaborating photorealistic 3D models, made them pivotal ...

Addressing Privacy and Trust Issues in Cultural Heritage Modelling

The management of cultural heritage information is an important aspect of human society since it enables us to document and understand our past and learn from it. Recent developments in ICT have significantly boosted research and development activities aimed at the creation and management of cultural heritage resources. As a result, information systems play an increasingly important role on sto...

The Traditional Malay Textile (TMT) Knowledge Model: Transformation towards Automated Mapping

The growing interest on national heritage preservation has led to intensive efforts on digital documentation of cultural heritage knowledge. Encapsulated within this effort is the focus on ontology development that will help facilitate the organization and retrieval of the knowledge. Ontologies surrounding cultural heritage domain are related to archives, museum and library information such as ...

Publication date: 2010