Creating Digital Language Resources
Abstract
We discuss building digital language resources (such as annotated corpora, lexicons, ontologies, terminologies, and tools), which are the main prerequisite for successful communication and information management in the e-society of the 21st century. We give an overview of the main requirements and best practices, and point to the steps necessary for creating and maintaining standards-based, reusable language resources for written language. The notions of basic and extended language resource kits are discussed, along with other international initiatives, including the Declaration on open access to language resources. We also analyse the challenges and responsibilities involved in creating digital language resources, and identify the need for wider national and international coordination and cooperation.
Similar resources
A Supervised Method for Constructing Sentiment Lexicon in Persian Language
With the rapid growth of digital content on the internet and social media, sentiment analysis has become an emerging field. The problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...
Persian in MULTEXT-East Framework
Farsi, also known as Persian, is the official language of Iran and Tajikistan and one of the two main languages spoken in Afghanistan. It is an Indo-European agglutinating language, written in Arabic script. This paper presents the first step in creating a Farsi basic language resources kit. This step comprises the specifications for morphosyntactic encoding, which is based on the EAGLES/MULTEXT mod...
Using Interactive Search Elements in Digital Libraries
Background and Aim: Interaction in a digital library helps users locate and access information, and also assists them in creating knowledge, improving perception, solving problems, and recognizing the dimensions of resources. This paper tries to identify and introduce the components and elements that are used in interaction between user and system in search and retrieval of information in digital li...
Iran National Digital Library of Medicine (INMDL): Dos and Don'ts
The Iran National Digital Library of Medicine was launched in 2008 by Shahid Beheshti University of Medical Sciences in order to supply English-language scientific resources to the Universities of Medical Sciences throughout the country. The Library could be accessed via www.inlm.org. Given the academic definitions of national and digital libraries, it seems that the services and resources offered...
Proceedings of the workshop on Semantic resources and semantic annotation for Natural Language Processing and the Digital Humanities
Lexical-semantic knowledge sources are a stock item in the language technologist’s toolbox, having proved their practical worth in many and diverse natural language processing (NLP) applications. In linguistics, lexical semantics comes in many flavors, but in the NLP world, wordnets reign more or less supreme. There has been some promising work utilizing Roget-style thesauruses instead, but wi...