Large-Scale Linguistic Ontology as a Basis for Text Categorization of Legislative Documents
Abstract
The paper describes the structure and properties of a large linguistic ontology, the Thesaurus on Sociopolitical Life for Conceptual Indexing, which is a new kind of information-retrieval thesaurus. The thesaurus is used in various real-scale information-retrieval applications in the legal domain. At present, one of its main applications is knowledge-based text categorization. Categories are connected with the Thesaurus by flexible relationships. The categorization system can process text collections containing texts of different sizes and types.
Similar resources
Linguistic Annotation for the Semantic Web
Establishing the semantic web on a large scale implies the widespread annotation of web documents with ontology-based knowledge markup. For this purpose, tools have been developed that allow for semi-automatic annotation of web documents with ontology-based metadata. However, given that a large number of web documents consist either fully or at least partially of free text, language technology ...
Ontology-Based Document Clustering with a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...
Development of Bilingual Domain-Specific Ontology for Automatic Conceptual Indexing
In the paper we describe the development, means of evaluation, and applications of the Russian–English Sociopolitical Thesaurus, developed as a linguistic resource for automatic text processing applications. The Sociopolitical domain is not a domain of social research but a broad domain of social relations including economic, political, military, cultural, sports and other subdomains. The knowl...
Task Description for PASCAL Challenge Evaluating Ontology Learning and Population from Text
Ontologies are formal, explicit specifications of shared conceptualizations, representing concepts and their relations that are relevant to a given domain of discourse. Currently, ontologies are mostly developed as well as used through a manual process, which is very ineffective and may cause major barriers to their large-scale use in such areas as Knowledge Discovery and Semantic Web. As human...
Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems
Ontology is the main infrastructure of the Semantic Web, providing facilities for integrating, searching and sharing information on the web. The development of ontologies as the basis of the Semantic Web, together with their heterogeneity, has led to the emergence of ontology matching. With the emergence of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...
Publication date: 2005