Our group includes PostDocs, PhD students, and student assistants, and is headed by Prof. Felix Naumann. If you are interested in joining our team, please contact Felix Naumann.
For bachelor students we offer German lectures on database systems in addition to paper- or project-oriented seminars. Within a one-year bachelor project, students finalize their studies in cooperation with external partners. For master students we offer courses on information integration, data profiling, and information retrieval enhanced by specialized seminars, master projects and we advise master theses.
Most of our research is conducted in the context of larger research projects, in collaboration across students, across groups, and across universities. We strive to make available most of our datasets and source code.
AbstractMassive Open Online Courses (MOOCs) have grown in reach and importance over the last few years, enabling a vast userbase to enroll in online courses. Besides watching videos, user participate in discussion forums to further their understanding of the course material. As in other community-based question-answering communities, in many MOOC forums a user posting a question can mark the answer they are most satisfied with. In this paper, we present a machine learning model that predicts this accepted answer to a forum question using historical forum data.
A Serendipity Model For News Recommendation. Jenders, Maximilian; Lindhauer, Thorben; Kasneci, Gjergji; Krestel, Ralf; Naumann, Felix in Lecture Notes in Computer Science (2015). (Vol. 9324) 111–123.
AbstractRecommendation algorithms typically work by suggesting items that are similar to the ones that a user likes, or items that similar users like. We propose a content-based recommendation technique with the focus on serendipity of news recommendations. Serendipitous recommendations have the characteristic of being unexpected yet fortunate and interesting to the user, and thus might yield higher user satisfaction. In our work, we explore the concept of serendipity in the area of news articles and propose a general framework that incorporates the benefits of serendipity- and similarity-based recommendation techniques. An evaluation against other baseline recommendation models is carried out in a user study.
How to Stay Up-to-date on Twitter with General Keywords. Roick, Mandy; Jenders, Maximilian; Krestel, Ralf in CEUR Workshop Proceedings (2015). (Vol. 1458)
AbstractMicroblogging platforms make it easy for users to share information through the publication of short personal messages. However, users are not only interested in sharing, but even more so in consuming information. As a result, they are confronted with new challenges when it comes to retrieving information on microblogging platforms. In this paper we present a query expansion method based on latent topics to support users interested in topical information. Similar to news aggregator sites, our approach identifies subtopics to a given query and provides the user with a quick overview of discussed topics within the microblogging platform. Using a document collection of microblog posts from Twitter as an exemplary microblogging platform, we compare the quality of search results returned by our algorithm with a baseline approach and a state-of-the-art microblog-specific query expansion method. To this end, we introduce a novel, innovative semi-supervised evaluation strategy based on expert Twitter users. In contrast to existing query expansion methods, our approach can be used to aggregate and visualize topical query results based on the calculated topic models, while achieving competitive results for traditional keyword-based search with regards to mean average precision.
Analyzing and Predicting Viral Tweets. Jenders, Maximilian; Kasneci, Gjergji; Naumann, Felix (2013).
AbstractTwitter and other microblogging services have become indispensable sources of information in today's web. Understanding the main factors that make certain pieces of information spread quickly in these platforms can be decisive for the analysis of opinion formation and many other opinion mining tasks. This paper addresses important questions concerning the spread of information on Twitter. What makes Twitter users retweet a tweet? Is it possible to predict whether a tweet will become "viral", i.e., will be frequently retweeted? To answer these questions we provide an extensive analysis of a wide range of tweet and user features regarding their influence on the spread of tweets. The most impactful features are chosen to build a learning model that predicts viral tweets with high accuracy. All experiments are performed on a real-world dataset, extracted through a public Twitter API based on user IDs from the TREC 2011 microblog corpus.
Ein Datenbankkurs mit 6000 Teilnehmern - Erfahrungen auf der openHPI MOOC Plattform. Naumann, Felix; Jenders, Maximilian; Papenbrock, Thorsten in Informatik-Spektrum (2013). 37(12) 333–340.