Project Description

The comment sections of online newspapers are an important space for political discussion and the exchange of opinions. Because spammers, haters, trolls, and propagandists misuse these forums, they have to be moderated. Moderation is expensive, and many online news providers have therefore discontinued their comment sections. As more and more political campaigning, and even agitation, is distributed over the internet, serious and safe platforms for discussing political topics are increasingly important.

In this project, we therefore analyze comments, users, and articles to understand the dynamics, the information flow, and the interactions in the comment sections. We work on detecting inappropriate comments, predicting popular news topics, identifying fake news and recommending information. Source code and datasets are available here.

Subprojects

Word Embeddings

We provide 300-dimensional fastText embeddings (5.5 GB), which we pre-trained on more than 60 million comments from The Guardian: Link
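The page does not state the file format of the download. As a minimal sketch, assuming the embeddings come in the plain-text `.vec` layout that fastText can write (a header line with vocabulary size and dimension, then one word and its values per line), they could be read and compared as follows. The function names `load_vec` and `cosine` and the three-dimensional toy data are hypothetical and stand in for the real 300-dimensional file.

```python
import io
import math


def load_vec(handle, limit=None):
    """Read word vectors from a text stream in the assumed .vec layout.

    The first line holds "<vocabulary size> <dimension>"; each following
    line holds a word and its vector values, separated by single spaces.
    """
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for i, line in enumerate(handle):
        if limit is not None and i >= limit:
            break  # stop early to keep memory use low
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip lines that do not match the declared dimension
        vectors[word] = [float(v) for v in values]
    return dim, vectors


def cosine(u, v):
    """Cosine similarity of two equally long vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy stand-in for the real file (assumption: made-up words and values).
sample = io.StringIO(
    "3 3\n"
    "brexit 1.0 0.0 0.0\n"
    "referendum 0.8 0.6 0.0\n"
    "football 0.0 0.0 1.0\n"
)
dim, vectors = load_vec(sample)
print(dim, sorted(vectors))
print(cosine(vectors["brexit"], vectors["referendum"]))
```

For the real download, the stream would come from `open(path, encoding="utf-8")`, and passing `limit` avoids loading the full vocabulary into memory. If the file turns out to be a binary fastText model instead, this reader does not apply and a fastText-aware library is needed.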

Associated Activities

Master Thesis by Victor Künstler, 2019: Modeling User Behavior in Online Discussions on News Platforms
Master Thesis by Johannes Filter, 2019: Context-aware Classification of News Comments
Master Seminar, 2018: Text Mining in Practice
Master Thesis by Carl Ambroselli, 2018: Quality Management for Online News Comments
Master Project, 2017: Hate Speech Detection
Master Thesis by Dustin Gläser, 2017: Detection of Inappropriate Content in Online Comments
Master Thesis by Christian Godde, 2016: Classification of German Newspaper Comments

Project-Related Publications

Risch, J., Repke, T., Kohlmeyer, L., Krestel, R.: ComEx: Comment Exploration on Online News Platforms. Joint Proceedings of the ACM IUI 2021 Workshops co-located with the 26th ACM Conference on Intelligent User Interfaces (IUI). pp. 1–7. CEUR-WS.org (2021).

Risch, J., Künstler, V., Krestel, R.: HyCoNN: Hybrid Cooperative Neural Networks for Personalized News Discussion Recommendation. Proceedings of the International Joint Conferences on Web Intelligence and Intelligent Agent Technologies (WI-IAT). pp. 41–48 (2020).

Risch, J., Krestel, R.: A Dataset of Journalists’ Interactions with Their Readership: When Should Article Authors Reply to Reader Comments? Proceedings of the International Conference on Information and Knowledge Management (CIKM). pp. 3117–3124. ACM (2020).

Risch, J., Ruff, R., Krestel, R.: Explaining Offensive Language Detection. Journal for Language Technology and Computational Linguistics (JLCL). 34, 29–47 (2020).

Risch, J., Ruff, R., Krestel, R.: Offensive Language Detection Explained. Proceedings of the Workshop on Trolling, Aggression and Cyberbullying (TRAC@LREC). pp. 137–143. European Language Resources Association (ELRA) (2020).

Risch, J., Krestel, R.: Bagging BERT Models for Robust Aggression Identification. Proceedings of the Workshop on Trolling, Aggression and Cyberbullying (TRAC@LREC). pp. 55–61. European Language Resources Association (ELRA) (2020).

Risch, J., Krestel, R.: Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions. Proceedings of the International Conference on Web and Social Media (ICWSM). pp. 579–589. AAAI (2020).

Risch, J., Krestel, R.: Toxic Comment Detection in Online Discussions. In: Agarwal, B., Nayak, R., Mittal, N., and Patnaik, S. (eds.) Deep Learning-Based Approaches for Sentiment Analysis. pp. 85–109. Springer (2020).

Risch, J., Stoll, A., Ziegele, M., Krestel, R.: hpiDEDIS at GermEval 2019: Offensive Language Identification using a German BERT model. Proceedings of the 15th Conference on Natural Language Processing (KONVENS). pp. 403–408. German Society for Computational Linguistics & Language Technology, Erlangen, Germany (2019).

Risch, J., Krebs, E., Löser, A., Riese, A., Krestel, R.: Fine-Grained Classification of Offensive Language. Proceedings of GermEval (co-located with KONVENS). pp. 38–44 (2018).

van Aken, B., Risch, J., Krestel, R., Löser, A.: Challenges for Toxic Comment Classification: An In-Depth Error Analysis. Proceedings of the 2nd Workshop on Abusive Language Online (co-located with EMNLP). pp. 33–42 (2018).

Risch, J., Krestel, R.: Aggression Identification Using Deep Learning and Data Augmentation. Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (co-located with COLING). pp. 150–158 (2018).

Risch, J., Krestel, R.: Delete or not Delete? Semi-Automatic Comment Moderation for the Newsroom. Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (co-located with COLING). pp. 166–176 (2018).

Ambroselli, C., Risch, J., Krestel, R., Loos, A.: Prediction for the Newsroom: Which Articles Will Get the Most Comments? Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). pp. 193–199. ACL, New Orleans, Louisiana, USA (2018).

Godde, C., Lazaridou, K., Krestel, R.: Classification of German Newspaper Comments. Proceedings of the Conference Lernen, Wissen, Daten, Analysen. pp. 299–310. CEUR-WS.org (2016).
