Maite Taboada is Professor in the Department of Linguistics and Associate Member of the Cognitive Science Program and the School of Computing Science at Simon Fraser University. She is a linguist working at the intersection of discourse analysis and computational linguistics. In discourse analysis, her research addresses the mechanisms for coherence in discourse, focusing on how links across sentences produce the impression of coherence in text and speech. In computational linguistics, she develops methods and algorithms to process and exploit discourse structure in different applications, especially for sentiment analysis. Current research projects involve analyses of online comments, with the goal of building a moderation platform to feature constructive comments more prominently, and a study of the language of misinformation, using text classification techniques to distinguish ‘fake’ and fact-based news stories. Her lab, the Discourse Processing Lab at SFU, is also collaborating with Informed Opinions. Together, they have built the Gender Gap Tracker, an online tool to track the number of men and women quoted in Canadian mainstream news media.
Finding discourse relations
In this talk, I will do two things. First, I will discuss the space that discourse or rhetorical relations occupy in language, and the issue of consensus on a common taxonomy of relations. Second, I will review the issue of signals for coherence relations and describe our corpus annotation of a broad set of signals.
First, in terms of the space that rhetorical relations occupy and their classification, I propose a top-down approach, that is, one that views relations between propositions in discourse as relations that help create coherence. I will review different approaches to rhetorical, coherence and conjunctive relations, and explain where Rhetorical Structure Theory (Mann and Thompson, 1988) fits in with other proposals. Coherence is part of texture, and thus related to entity-based coherence or cohesion (Halliday and Hasan, 1976) and to general properties of discourse. I will argue that there is a cline of grammaticalization of rhetorical relations, from discourse to syntax, and that differences across theories are sometimes rooted in where in that cline the theory positions itself. For instance, RST is at the end of the cline closer to discourse, and does not make strong claims about the syntactic realization of rhetorical relations. The conjunctive relations of Halliday and Hasan (1976) and Martin (1992), on the other hand, are more clearly syntactic, and have lexical elements as signals of the relation. My optimistic view is that, in this broad space of rhetorical relations, we can map relations across different theories if we bear in mind that they may be more or less abstract versions of each other.
In the second part of the talk, I will discuss signalling. If rhetorical relations are understood as relations of coherence, they are present whether or not a particular device signals them. This is the long-held view within Rhetorical Structure Theory. The concern in RST has been to explain how coherence, and the impression of coherence, is achieved when relations are apparently not signalled. Signalling has traditionally been taken to refer to conjunctions or discourse markers which link propositions. I will propose that signalling is actually quite prevalent if we broaden our definition of signalling devices. I will report on the results of our annotation (Das and Taboada, 2018a, 2018b) of the RST Discourse Treebank (Carlson et al., 2002), which shows that the vast majority of relations are signalled by at least one device. The annotated corpus is available through the Linguistic Data Consortium (Das et al., 2015). I will describe the annotation process and the taxonomy of signalling devices, and will provide detail on the types of signalling devices found for various relations.
Carlson, Lynn, Daniel Marcu & Mary Ellen Okurowski. 2002. RST Discourse Treebank, LDC2002T07 [Corpus]. Philadelphia, PA: Linguistic Data Consortium.
Das, Debopam & Maite Taboada. 2018a. RST Signalling Corpus: A corpus of signals of coherence relations. Language Resources and Evaluation 52(1). 149-184.
Das, Debopam & Maite Taboada. 2018b. Signalling of coherence relations in discourse, beyond discourse markers. Discourse Processes 55(8). 743-770.
Das, Debopam, Maite Taboada & Paul McFetridge. 2015. RST Signalling Corpus. Philadelphia, PA: Linguistic Data Consortium.
Halliday, Michael A. K. & Ruqaiya Hasan. 1976. Cohesion in English. London: Longman.
Mann, William C. & Sandra A. Thompson. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text 8(3). 243-281.
Martin, James R. 1992. English Text: System and Structure. Amsterdam and Philadelphia: John Benjamins.