Learning a Domain Ontology from Hierarchically Structured Texts

Authors

  • Pavel Makagonov
  • Alejandro Ruiz Figueroa
  • Konstantin Sboychakov
  • Alexander Gelbukh
Abstract

Any scientific or technical document is organized hierarchically: some sections of the text (such as the abstract or conclusions) summarize the contents of the main text; section titles describe their contents in general terms; chapter titles describe the contents of a set of sections; the book title describes the contents of all chapters; and so on. Moreover, whole collections of scientific documents are usually organized hierarchically as well: for example, papers are grouped into journals and conference proceedings, which in turn have their own titles. We exploit this hierarchical structure to learn a lexical ontology in which subordination relationships roughly mirror those between the texts and the titles in which the words occur: words occurring in more general titles subordinate the words occurring in the texts described by those titles.
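The abstract does not spell out the learning procedure, so the following is only a minimal sketch of the stated idea, not the authors' algorithm. It assumes a document collection is given as a tree of titled units, and it records a candidate subordination pair whenever a word in a title co-occurs with a word in the material that title describes. The names `Node`, `STOPWORDS`, `described_words`, and `subordination_counts` are hypothetical, as are the tiny stopword list and the sample titles.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "from", "to", "on"}


@dataclass
class Node:
    """A titled unit of text: a collection, book, chapter, or section."""
    title: str
    body: str = ""
    children: list["Node"] = field(default_factory=list)


def words(text: str) -> set[str]:
    """Lowercased alphabetic tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def described_words(node: Node) -> set[str]:
    """Words in everything the node's title describes:
    its own body plus the titles and bodies of all descendants."""
    found = words(node.body)
    for child in node.children:
        found |= words(child.title) | described_words(child)
    return found


def subordination_counts(root: Node) -> Counter:
    """Count candidate (general_word, specific_word) pairs over the tree."""
    counts: Counter = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        below = described_words(node)
        for general in words(node.title):
            for specific in below - {general}:
                counts[(general, specific)] += 1
        stack.extend(node.children)
    return counts


# Hypothetical two-chapter book used only to exercise the sketch.
book = Node("Ontology Learning", children=[
    Node("Term Extraction", body="Candidate terms are ranked by frequency."),
    Node("Relation Discovery", body="Patterns link terms into a hierarchy."),
])
counts = subordination_counts(book)
```

In this sketch the book-title word "ontology" ends up above both the chapter-title word "extraction" and the body word "frequency", while "extraction" is never placed above "ontology". A real system would need better tokenization, term weighting, and a threshold on the counts; those choices are left open here.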

Similar resources

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Extracting domain knowledge from tables of contents

Knowledge in textual form is always presented as visually and hierarchically structured units of text, which is particularly true in the case of academic texts. One research hypothesis of the ongoing project Knowledge ordering in texts— text structure and structure visualisations as sources of natural ontologies is that the textual structure of academic texts effectively mirrors essential part...

Ontology Learning and Semantic Annotation: a Necessary Symbiosis

Semantic annotation of text requires the dynamic merging of linguistically structured information and a “world model”, usually represented as a domain-specific ontology. On the other hand, the process of engineering a domain ontology through semi-automatic ontology learning system requires the availability of a considerable amount of semantically annotated documents. Facing this bootstrapping p...

Generic Ontology Learners on Application Domains

In ontology learning from texts, we have ontology-rich domains where we have large structured domain knowledge repositories or we have large general corpora with large general structured knowledge repositories such as WordNet (Miller, 1995). Ontology learning methods are more useful in ontology-poor domains. Yet, in these conditions, these methods have not a particularly high performance as tra...

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the increasing web, there are many challenges to establish a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...


Publication date: 2005