Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
  
 

Publications

Online Temporal Summarization of News Events

Tobias Schubotz and Ralf Krestel
In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), pages 679-684, 12 2015 IEEE Computer Society.

Abstract:

Nowadays, an ever increasing number of news articles is published on a daily basis. Especially after notable national and international events or disasters, news coverage rises tremendously. Temporal summarization is an approach to automatically summarize such information in a timely manner. Summaries are created incrementally with progressing time, as soon as new information is available. Given a user-defined query, we designed a temporal summarizer based on probabilistic language models and entity recognition. First, all relevant documents and sentences are extracted from a stream of news documents using BM25 scoring. Second, a general query language model is created which is used to detect typical sentences respective to the query with Kullback-Leibler divergence. Based on the retrieval result, this query model is extended over time by terms appearing frequently during the particular event. Our system is evaluated with a document corpus including test data provided by the Text Retrieval Conference (TREC).

BibTeX file

@inproceedings{wi15,
author = { Tobias Schubotz and Ralf Krestel },
title = { Online Temporal Summarization of News Events },
year = { 2015 },
pages = { 679-684 },
month = { 12 },
abstract = { Nowadays, an ever increasing number of news articles is published on a daily basis. Especially after notable national and international events or disasters, news coverage rises tremendously. Temporal summarization is an approach to automatically summarize such information in a timely manner. Summaries are created incrementally with progressing time, as soon as new information is available. Given a user-defined query, we designed a temporal summarizer based on probabilistic language models and entity recognition. First, all relevant documents and sentences are extracted from a stream of news documents using BM25 scoring. Second, a general query language model is created which is used to detect typical sentences respective to the query with Kullback-Leibler divergence. Based on the retrieval result, this query model is extended over time by terms appearing frequently during the particular event. Our system is evaluated with a document corpus including test data provided by the Text Retrieval Conference (TREC). },
publisher = { IEEE Computer Society },
booktitle = { Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) },
priority = { 0 }
}

Copyright Notice

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

last change: Wed, 20 Apr 2016 10:33:22 +0200