Carl Ambroselli, Ralf Krestel, Andreas Loos, Julian Risch
A paper based on the results of Carl Ambroselli's Master's thesis has been accepted for presentation at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), which will be held in New Orleans from June 1 to June 6, 2018. The paper is titled "Prediction for the Newsroom: Which Articles Will Get the Most Comments?" (Carl Ambroselli, Julian Risch, Ralf Krestel, Andreas Loos).
Prediction for the Newsroom: Which Articles Will Get the Most Comments?
The overwhelming success of the Web and mobile technologies has enabled millions to share their opinions publicly at any time. But the same success also endangers this freedom of speech, because participatory sites that are misused by individuals or interest groups are being closed down. We propose to support manual moderation by proactively drawing the attention of our moderators to article discussions that most likely need their intervention. To this end, we predict which articles will receive a high number of comments. In contrast to existing work, we enrich the article with metadata, extract semantic and linguistic features, and exploit annotated data from a foreign-language corpus. Our logistic regression model improves F1-scores by over 80% in comparison to state-of-the-art approaches.
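To make the setup in the abstract more concrete, the following is a minimal sketch of the general technique it names: a binary logistic regression classifier that flags articles likely to attract many comments, evaluated with the F1-score. It is not the authors' implementation. The toy articles, the vocabulary, the "politics" section flag, and all function names are assumptions made up for illustration; the paper's actual metadata, semantic, and linguistic features are not reproduced here.

```python
import math

# Hypothetical toy data: label 1 means "received a high number of comments".
ARTICLES = [
    {"headline": "Election debate heats up", "section": "politics", "label": 1},
    {"headline": "Refugee policy vote", "section": "politics", "label": 1},
    {"headline": "Local bakery wins award", "section": "local", "label": 0},
    {"headline": "Weather forecast for weekend", "section": "local", "label": 0},
    {"headline": "Parliament election results", "section": "politics", "label": 1},
    {"headline": "Museum opens new exhibit", "section": "culture", "label": 0},
]
VOCABULARY = ["election", "policy", "weather", "museum"]  # assumed, not from the paper


def featurize(article, vocabulary):
    """Binary bag-of-words over the headline plus one metadata flag."""
    words = set(article["headline"].lower().split())
    features = [1.0 if term in words else 0.0 for term in vocabulary]
    # Assumed metadata feature: whether the article is in the politics section.
    features.append(1.0 if article["section"] == "politics" else 0.0)
    return features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, epochs=200, learning_rate=0.5):
    """Fit weights and bias with plain stochastic gradient descent on log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(X, y):
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - target  # gradient of the log loss w.r.t. z
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Return 1 if the predicted probability of many comments is at least 0.5."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


X = [featurize(article, VOCABULARY) for article in ARTICLES]
labels = [article["label"] for article in ARTICLES]
model_weights, model_bias = train_logistic_regression(X, labels)
predictions = [predict(model_weights, model_bias, x) for x in X]
print("F1 on the toy training set:", f1_score(labels, predictions))
```

The sketch evaluates on its own training data only to stay short. A realistic experiment would hold out later articles for testing and would define "high number of comments" explicitly, for example as a threshold or a top share of articles per day, which is a further assumption not specified in the abstract.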