Modeling the Hebrew Bible: Potential of Topic Modeling Techniques for Semantic Annotation and Historical Analysis

نویسندگان

  • Mathias Coeckelbergs
  • Seth van Hooland
چکیده

Providing useful and efficient semantic annotations is a major challenge for knowledge design of any body of text, especially historical documents. In this article, we propose Topic Modeling as an important first step to gather semantic information beyond the lexicon which can be added as annotations in the SHEBANQ. By laying out a case study, we discuss both noise and structure found in comparing topics extracted within different distributions, and show the value of such approach, which we label a topic hierarchy. We also show a first result in applying such approach to study diachronic variety in the Bible, and show how this overall Topic Modeling approach can result in more query options for users of the database.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...

متن کامل

Annotation as a New Paradigm in Research

We outline a paradigm to preserve results of digital scholarship, whether they are query results, feature values, or topic assignments. This paradigm is characterized by using annotations as multifunctional carriers and making them portable. The testing grounds we have chosen are two significant enterprises, one in the history of science, and one in Hebrew scholarship. The first one (CKCC) focu...

متن کامل

Annotation as a New Paradigm in Research Archiving

We outline a paradigm to preserve results of digital scholarship, whether they are query results, feature values, or topic assignments. This paradigm is characterized by using annotations as multifunctional carriers and making them portable. The testing grounds we have chosen are two significant enterprises, one in the history of science, and one in Hebrew scholarship. The first one (CKCC) focu...

متن کامل

Reliability analysis of repairable systems using system dynamics modeling and simulation

Repairable standby system’s study and analysis is an important topic in reliability. Analytical techniques become very complicated and unrealistic especially for modern complex systems. There have been attempts in the literature to evolve more realistic techniques using simulation approach for reliability analysis of systems. This paper proposes a hybrid approach called as Markov system ...

متن کامل

LAF-Fabric: a data analysis tool for Linguistic Annotation Framework with an application to the Hebrew Bible

The Linguistic Annotation Framework (LAF) provides a general, extensible stand-off markup system for corpora. This paper discusses LAF-Fabric, a new tool to analyse LAF resources in general with an extension to process the Hebrew Bible in particular. We first walk through the history of the Hebrew Bible as text database in decennium-wide steps. Then we describe how LAF-Fabric may serve as an an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016