Automatic summarization of spoken dialogues is a relatively new field
in computational linguistics. Existing methods rely heavily on work
done on written texts. But spoken dialogues, and especially spontaneous
and/or colloquial speech, have special characteristics that are not
covered by those approaches. One of these characteristics is the
considerably higher number of pronouns.
Within the DIANA-SUMM project, we want to examine how much anaphora resolution can contribute to a summarization method.
Therefore, our aim is to build a tool that provides a first step
towards automatic generation of meeting protocols enhanced by
anaphora resolution.
If you are interested in our part-of-speech (POS) annotation of the
ICSI Meeting Corpus data, please contact Margot Mieskes.
For copyright reasons we cannot provide the ICSI data itself, only the
POS annotation. If you already use the ICSI data, we can readily send
you the POS annotation. The manual annotation consists of 12
meetings annotated by three annotators in parallel and 25 additional
meetings, which were annotated individually. For details, please see our
LREC 2006 paper.
Page last modified: Friday, 13.07.2007
Project Manager
Dr. Michael Strube
Email:
Phone: +49 (0)6221 - 533 - 243
Fax: +49 (0)6221 - 533 - 298