Corpus-based Lexicographic Pragmatics: On 'transforming' dirty corpora
Author
Abstract
Just over one decade ago, corpus-based pragmatics labelling made its way into a major dictionary for the first time. In the Collins COBUILD English Dictionary (COBUILD 2, Sinclair 1995), the compilers started to insert the novel 'PRAGMATICS' sign into the Extra Column, whenever the 'statement of meaning' for certain senses of words had to be supplemented by an 'added meaning'. In the latest edition, COBUILD 5, this single label is split into seven different labels, as can be seen from Figures 1 and 2.
Similar resources
Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners
Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...
Hooking up to the corpus: the Viennese Lexicographic Editor’s corpus interface
The paper addresses the issue of interfacing between digital corpora and a new dictionary writing application being developed at the ICLTT (Institute of Corpus Linguistics and Text Technology of the Austrian Academy of Sciences). It deals with issues of dictionary creation, software design, usability and interoperability in relation to the example of this fairly new piece of software, the Vienn...
The Role of Pragmatics in Solving the Winograd Schema Challenge
Different aspects and approaches to commonsense reasoning have been investigated in order to provide solutions for the Winograd Schema Challenge (WSC). The vast complexities of natural language processing (parsing, assigning word sense, integrating context, pragmatics and world-knowledge, ...) give broad appeal to systems based on statistical analysis of corpora. However, solutions based purely...
Exploiting large corpora: A circular process of partial syntactic analysis, corpus query and extraction of lexicographic information
Our approach follows the work of Eckle-Kohler (1999), who used a regular grammar to extract lexicographic information from text corpora. We employ a system that allows us to improve her query-based grammar, especially with respect to recall and speed, without reducing accuracy. In contrast to Eckle-Kohler (1999), we do not attempt to parse a whole sentence or phrase at once during the extraction proce...
A Perspective on the Lexicographic Value of Mega Newspaper Corpora: The Case of Afrikaans in South Africa
The aim of this article is to assess the potential use of a mega newspaper corpus, the Media24 archive, in the absence of large balanced and representative corpora, for the compilation of major general dictionaries for Afrikaans. Firstly, an evaluation of Media24 against the lemmalists of both a major single-volume and a multi-volume monolingual dictionary for Afrikaans is undertaken to determi...