WASP: Evaluation of Different Strategies for the Automatic Generation of Spanish Verse

Author

  • Pablo Gervás
Abstract

WASP is a forward-reasoning rule-based system that takes as input a set of words and a set of verse patterns and returns a set of verses. Using a generate-and-test method, guided by a set of construction heuristics obtained from the formal literature on Spanish poetry, the system can operate in two modes: either generating an unrestricted set of verses, or generating a poem according to one of three predefined structures (romance, cuarteto, or terceto). Five different construction heuristics are tested over different combinations of two sets of initial data, one obtained from a classic poem and one obtained from a paragraph of a doctoral thesis in linguistics. A set of numerical parameters is extracted from each test and evaluated in search of significant correlations. The aim is to ascertain the relative importance of size of initial vocabulary, choice of words, choice of verse patterns and construction heuristics with respect to the general acceptability of the resulting verse.

1 Programs that Write Poetry Automatically

The creation of programs that write poetry automatically has been a recurring dream within the AI community, but it has always been assigned a very low priority. Practical applications in the area of natural language processing, such as natural language database interfaces, information retrieval and extraction, automatic translation, and dialogue systems, provide more immediate rewards. On the one hand, the automatic generation of poetry involves advanced linguistic skills and common sense, two of the major challenges that face AI in general. On the other hand, it involves a considerable amount of creativity and sensibility. These ingredients are very difficult to characterise formally, and very little is known about how they might be treated algorithmically. On the positive side, poetry has the advantage of not requiring exaggerated precision.
If one accepts that the main aim of a poem is to be pleasing rather than to convey a meaningful message, the general problem becomes tractable. The present paper considers how the different parameters that can be controlled by the generating program affect the acceptability of the result. The parameters to be monitored are: size of initial vocabulary, choice of words, choice of verse patterns, and construction heuristics. The elusive concept of acceptability of a verse is determined by resorting to hand evaluation by a team of volunteers. By searching for correlations between the strategy and initial data used to generate a verse and the positive or negative evaluation of the resulting verse, information is obtained about the relative relevance of these parameters to the end result.

1.1 Guiding Heuristically the Random Generation of a Verse

Poems written by randomly combining a given set of words rate very poorly with discerning readers. For the words to make sense together, they must be organised according to particular patterns. A possible course of action would be to provide the system with an adequately rich lexicon, syntax and semantics for the language involved. Results obtained with inadequate formalisms are too rigid and tend to have a mechanical ring to them. The system presented in this paper resorts to a radical simplification of the underlying linguistic skills. The exhaustive knowledge approach is abandoned in favour of a heuristic engineering solution. Only the barest grammatical outline is provided (in the form of a verse pattern) to ensure syntactic correctness. Semantic correctness is not enforced, on the understanding that creativity in poetry relies to a certain extent on daring transgressions (such as imaginative metaphors). The aim of the paper is to establish whether acceptable verse may be obtained by controlling other parameters within these initial restrictions.
The hope is to identify whether the elementary ingredients considered can be manipulated smartly enough to produce a pleasing phrase.

1.2 The Effect of the Selection of Initial Data

Under these restrictions, Spanish has been chosen as a test language. The phonetics of Spanish are quite straightforward to obtain from the written word. Most letters in Spanish sound the same wherever they appear in a piece of text, so the metrics, or the syllabic division, of a verse can be worked out algorithmically (RAE, 1986). Spanish scholars have a love for rules, and there is a good set of formal rules (Quilis, 1985) describing the conditions that a poem must fulfil in order to be acceptable. Given such a set of rules, the challenge becomes a simple problem of transforming the given evaluation rules (designed to be applied to an existing poem in order to ascertain its acceptability) into the corresponding construction rules.

These rules have to be applied to an initial set of data consisting of: a given vocabulary (given a set of words, the poet will choose only some of them; this process of selection must surely play a role in the quality of the final result), and a particular choice of ways of combining the chosen words (word order, frequency of adjectives, length of verse...) represented as a set of verse patterns. The selected vocabulary is a set of words that includes extra information about part-of-speech roles, number of syllables of each word, position of stressed syllables, and rhyme. The system cannot handle morphological variations, so it treats the singular and plural, and the masculine and feminine, forms of a word as totally distinct (and likewise the different tenses of a verb). This decision reduces the complexity of the generation process to pattern matching between word categories and verse patterns, but it has consequences for the quality of the resulting verses. The set of valid words is stored as facts of the form:
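
The excerpt breaks off before the fact format is shown, so the original representation is not reproduced here. Purely as an illustration of the pattern matching between word categories and verse patterns described above, the following Python sketch may help. Everything in it is an assumption made for the example: the `Word` fields and their encoding, the category tags, the sample vocabulary, and the functions `metric_length` and `generate_verses`. It counts syllables naively, applies only the standard adjustment for the stress of the final word, and ignores synalepha between adjacent vowels, so it should not be read as the system's actual rules.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Word:
    """Hypothetical vocabulary entry; field names are illustrative only."""
    text: str
    pos: str              # part-of-speech role, e.g. "det", "noun", "adj"
    syllables: int        # number of syllables in the word
    stress_from_end: int  # 1 = last syllable stressed, 2 = penultimate, 3 = antepenultimate
    rhyme: str            # rhyme ending (assumed encoding)


# Each inflected form is a separate entry, since no morphology is handled.
VOCABULARY = [
    Word("la", "det", 1, 1, "a"),
    Word("luna", "noun", 2, 2, "una"),
    Word("noche", "noun", 2, 2, "oche"),
    Word("blanca", "adj", 2, 2, "anca"),
    Word("oscura", "adj", 3, 2, "ura"),
]


def metric_length(words):
    """Sum syllables, then adjust for the stress of the final word.

    Simplification: synalepha is ignored.
    """
    total = sum(w.syllables for w in words)
    last = words[-1].stress_from_end
    if last == 1:      # final stress on last syllable: count one more
        total += 1
    elif last == 3:    # final stress on antepenultimate: count one less
        total -= 1
    return total


def generate_verses(vocabulary, pattern, target_length):
    """Generate and test: fill each slot of the pattern with words of the
    matching category, keeping only candidates of the target length."""
    if not pattern:
        return []
    slots = [[w for w in vocabulary if w.pos == tag] for tag in pattern]
    verses = []
    for combo in product(*slots):
        if metric_length(combo) == target_length:
            verses.append(" ".join(w.text for w in combo))
    return verses


if __name__ == "__main__":
    for verse in generate_verses(VOCABULARY, ["det", "noun", "adj"], 5):
        print(verse)
```

Enumerating every combination is only workable for a toy vocabulary. The construction heuristics mentioned in the abstract presumably exist to guide this search, but they are not described in the excerpt and are not modelled here.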




Publication date: 2000