Detecting Rhetorical Figures Based on Repetition of Words: Chiasmus, Epanaphora, Epiphora
Author: M. Dubremetz
Abstract
Dubremetz, M. 2017. Detecting Rhetorical Figures Based on Repetition of Words: Chiasmus, Epanaphora, Epiphora. Studia Linguistica Upsaliensia 18. 49 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-0165-5.
This thesis deals with the detection of three rhetorical figures based on repetition of words: chiasmus (“Fair is foul, and foul is fair.”), epanaphora (“Poor old European Commission! Poor old European Council.”) and epiphora (“This house is mine. This car is mine. You are mine.”). For a computer, locating all repetitions of words is trivial, but locating just those repetitions that achieve a rhetorical effect is not. How can we make this distinction automatically? First, we propose a new definition of the problem. We observe that rhetorical figures are a graded phenomenon, with universally accepted prototypical cases, equally clear non-cases, and a broad range of borderline cases in between. This makes it natural to view the problem as a ranking task rather than a binary detection task. We therefore design a model for ranking candidate repetitions in terms of decreasing likelihood of having a rhetorical effect, which allows potential users to decide for themselves where to draw the line with respect to borderline cases. Second, we address the problem of collecting annotated data to train the ranking model. Thanks to a selective method of annotation, we can reduce by three orders of magnitude the annotation work for chiasmus, and by one order of magnitude the work for epanaphora and epiphora. In this way, we prove that it is feasible to develop a system for detecting the three figures without an insurmountable amount of human work. Finally, we propose an evaluation scheme and apply it to our models. The evaluation reveals that, even with a very incompletely annotated corpus, a system for repetitive figure detection can be trained to achieve reasonable accuracy.
We investigate the impact of different linguistic features, including length, n-grams, part-of-speech tags, and syntactic roles, and find that different features are useful for different figures. We also apply the system to four different types of text: political discourse, fiction, titles of articles and novels, and quotations. Here the evaluation shows that the system is robust to shifts in genre and that the frequencies of the three rhetorical figures vary with genre.
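The two-step view in the abstract (enumerating every repetition is trivial, so the real work is ranking the candidates) can be illustrated with a minimal Python sketch for chiasmus. Everything named here is a hypothetical stand-in: the function names, the `window` parameter, the small stopword list, and the hand-set score are assumptions made for illustration. The thesis trains its ranking model on annotated data, which this sketch does not attempt to reproduce.

```python
import re
from typing import List, Tuple

# Token positions (i, j, k, l) of a criss-cross pattern A ... B ... B ... A.
Candidate = Tuple[int, int, int, int]

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"is", "and", "the", "a", "of", "to", "for"}


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z']+", text.lower())


def chiasmus_candidates(tokens: List[str], window: int = 30) -> List[Candidate]:
    """Enumerate every A ... B ... B ... A pattern spanning fewer than `window` tokens."""
    found: List[Candidate] = []
    n = len(tokens)
    for i in range(n):
        for l in range(i + 3, min(n, i + window)):
            if tokens[l] != tokens[i]:
                continue  # outer pair must repeat the same word
            for j in range(i + 1, l - 1):
                if tokens[j] == tokens[i]:
                    continue  # inner word must differ from the outer word
                for k in range(j + 1, l):
                    if tokens[k] == tokens[j]:
                        found.append((i, j, k, l))
    return found


def toy_score(tokens: List[str], cand: Candidate) -> float:
    """Hand-set score (an assumption, not a trained model):
    prefer short spans and penalise candidates built on stopwords."""
    i, j, _, l = cand
    penalty = 10.0 if tokens[i] in STOPWORDS or tokens[j] in STOPWORDS else 0.0
    return -float(l - i) - penalty


def rank(text: str) -> List[Candidate]:
    """Return all candidates, best-scoring first, so a user can pick a cut-off."""
    tokens = tokenize(text)
    cands = chiasmus_candidates(tokens)
    return sorted(cands, key=lambda c: toy_score(tokens, c), reverse=True)


ranked = rank("Fair is foul, and foul is fair.")
```

Returning a ranked list instead of a yes/no label mirrors the formulation in the abstract: borderline candidates stay in the output, lower down, and the reader decides where to draw the line.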
Similar resources
Syntax Matters for Rhetorical Structure: The Case of Chiasmus
The chiasmus is a rhetorical figure involving the repetition of a pair of words in reverse order, as in “all for one, one for all”. Previous work on detecting chiasmus in running text has only considered superficial features like words and punctuation. In this paper, we explore the use of syntactic features as a means to improve the quality of chiasmus detection. Our results show that taking sy...
Rhetorical Figure Detection: the Case of Chiasmus
We propose an approach to detecting the rhetorical figure called chiasmus, which involves the repetition of a pair of words in reverse order, as in “all for one, one for all”. Although repetitions of words are common in natural language, true instances of chiasmus are rare, and the question is therefore whether a computer can effectively distinguish a chiasmus from a random criss-cross pattern....
The RhetFig Project: Computational Rhetorics and Models of Persuasion
We argue, reason, cajole, and persuade — we deploy rhetoric — because we are social animals endowed with a symbolic mode of thought and communication who seek to shape our social environment, to compete, and to cooperate. As rhetoricians, philosophers, and semiologists have regularly noticed, some patterns of argumentation and cajolery are more successful than others. These patterns of usage — ...
Harnessing rhetorical figures for argument mining
The generalised, automated reconstruction of the reasoning structures underlying persuasive communication is an enormously challenging task. While this work in argument mining is increasingly informed by the rich tradition of argumentation studies outside the computational field, the rhetorical perspective on argumentation is thus far largely ignored. To explore the application of rhetorical in...
Machine Learning for Rhetorical Figure Detection: More Chiasmus with Less Annotation
Figurative language identification is a hard problem for computers. In this paper we handle a subproblem: chiasmus detection. By chiasmus we understand a rhetorical figure that consists in repeating two elements in reverse order: “First shall be last, last shall be first”. Chiasmus detection is a needle-in-the-haystack problem with a couple of true positives for millions of false positives. Due...