Ontology Engineering and Knowledge Extraction for Cross-Lingual Retrieval
نویسندگان
چکیده
In this paper, we show that by integrating existing NLP techniques and Semantic Web tools in a novel way, we can provide a valuable contribution to the solution of the knowledge acquisition bottleneck problem. NLP techniques to create a domain ontology on the basis of an open domain corpus have been combined with Semantic Web tools. More specifically, Watson and Prompt have been employed to enhance the kick-off ontology while Cornetto, a lexical database for Dutch, has been adopted to establish a link between the concepts and their Dutch lexicalization. The lexicalized ontology constitutes the basis for the cross-language retrieval of learning objects within the LT4eL eLearning project.
منابع مشابه
Public Transport Ontology for Passenger Information Retrieval
Passenger information aims at improving the user-friendliness of public transport systems while influencing passenger route choices to satisfy transit user’s travel requirements. The integration of transit information from multiple agencies is a major challenge in implementation of multi-modal passenger information systems. The problem of information sharing is further compounded by the multi-l...
متن کاملEnglish-Persian Plagiarism Detection based on a Semantic Approach
Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...
متن کاملLinguistic Annotation for the Semantic Web
Establishing the semantic web on a large scale implies the widespread annotation of web documents with ontology-based knowledge markup. For this purpose, tools have been developed that allow for semi-automatic annotation of web documents with ontology-based metadata. However, given that a large number of web documents consist either fully or at least partially of free text, language technology ...
متن کاملSemantic Hadith: Leveraging Linked Data Opportunities for Islamic Knowledge
While the linked data paradigm has gathered much attention over the recent years, the domain of Islamic knowledge has yet to cache upon its full potential. The web-scale integration of Islamic texts and knowledge sources at large is currently not well facilitated. The two primary sources of the Islamic legislation are the Qur’an and the Hadith (collections of Prophetic Narrations) and form the ...
متن کاملExtending, Trimming and Fusing WordNet for Technical Documents
This paper describes a tool for the automatic extension and trimming of a multilingual WordNet database for cross-lingual retrieval and multilingual ontology building in intranets and domain-specific document collections. Hierarchies, built from automatically extracted terms and combined with the WordNet relations, are trimmed with a disambiguation method based on the document salience of the w...
متن کامل