Schubotz, Tobias; Krestel, Ralf
Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)
IEEE Computer Society
An ever-increasing number of news articles is published every day, and coverage rises sharply after notable national and international events or disasters. Temporal summarization is an approach to summarizing such information automatically and in a timely manner: summaries are created incrementally over time, as soon as new information becomes available. Given a user-defined query, we designed a temporal summarizer based on probabilistic language models and entity recognition. First, all relevant documents and sentences are extracted from a stream of news documents using BM25 scoring. Second, a general query language model is created and used, via Kullback-Leibler divergence, to detect sentences that are typical of the query. Based on the retrieval results, this query model is extended over time with terms that appear frequently during the particular event. Our system is evaluated on a document corpus that includes test data provided by the Text Retrieval Conference (TREC).
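The two retrieval steps and the query-model extension can be illustrated with a minimal Python sketch. It is not the authors' implementation. The whitespace tokenizer, the BM25 parameters (k1 = 1.2, b = 0.75), the additive smoothing constant, the direction of the divergence (sentence model against query model, lower meaning more typical), and all function names are assumptions made for illustration. The entity recognition component is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))  # count each term once per document
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def unigram_model(tokens, vocab, alpha=0.1):
    """Additively smoothed unigram distribution; vocab must cover all tokens."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def extend_query_terms(query_tokens, retrieved_tokens, top_k=1):
    """Hypothetical extension step: append the top_k most frequent new terms."""
    counts = Counter(t for t in retrieved_tokens if t not in query_tokens)
    return list(query_tokens) + [t for t, _ in counts.most_common(top_k)]


# Toy data, invented for this sketch.
docs = [
    "earthquake hits the coastal city",
    "the stock market closed higher today",
    "rescue teams search the city after the earthquake",
]
query = "earthquake city"

scores = bm25_scores(query, docs)

vocab = sorted({w for d in docs for w in tokenize(d)} | set(tokenize(query)))
query_model = unigram_model(tokenize(query), vocab)
divergences = [
    kl_divergence(unigram_model(tokenize(d), vocab), query_model) for d in docs
]

# Extend the query with frequent terms from the documents BM25 retrieved.
retrieved = [w for d, s in zip(docs, scores) if s > 0 for w in tokenize(d)]
extended_query = extend_query_terms(tokenize(query), retrieved, top_k=1)
```

In this toy setup the unrelated stock-market sentence receives a BM25 score of zero and a higher divergence from the query model than the first earthquake sentence. A real system would also filter stopwords before extending the query, since the most frequent new term here is a function word.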