Description:
Discourse markers such as German aber, wohl or obwohl can be regarded as valuable information for a wide range of text-linguistic applications, since they provide important cues for the interpretation of texts or text segments. Unfortunately, many of them are highly ambiguous. Thus, for their use in applications like automatic text summarizations a reliable disambiguation of discourse markers is needed. This should be done automatically, since manual disambiguation is feasible only for small amounts of data. The aim of this pilot study, therefore, was to investigate methodological requirements of automatic disambiguation of German discourse markers. Two different methods known from word-sense disambiguation, Naive-Bayes and decisionlists, were used for the highly ambiguous marker wenn. A statistical approach was taken to compare the two approaches and different feature combinations.