For bachelor students we offer German lectures on database systems in addition with paper- or project-oriented seminars. Within a one-year bachelor project students finalize their studies in cooperation with external partners. For master students we offer courses on information integration, data profiling, search engines and information retrieval enhanced by specialized seminars, master projects and advised master theses.
Most of our research is conducted in the context of larger research projects, in collaboration across students, across groups, and across universities. We strive to make available most of our data sets and source code.
I provide supervision for Master's theses in the area of News Comment Analysis, e.g., Toxic Comment Classification, User Engagement Prediction, Comment Recommendation, and Discussion Summarization/Visualization. Feel free to schedule an informal meeting with me to discuss details of these topics and/or your own ideas.
hpiDEDIS at GermEval 2019: Offensive Language Identification using a German BERT model
Risch, Julian; Stoll, Anke; Ziegele, Marc; Krestel, Ralf
Proceedings of the 15th Conference on Natural Language Processing (KONVENS)
German Society for Computational Linguistics & Language Technology
Pre-training language representations on large text corpora, for example, with BERT, has recently shown to achieve impressive performance at a variety of downstream NLP tasks. So far, applying BERT to offensive language identification for German- language texts failed due to the lack of pre-trained, German-language models. In this paper, we fine-tune a BERT model that was pre-trained on 12 GB of German texts to the task of offensive language identification. This model significantly outperforms our baselines and achieves a macro F1 score of 76% on coarse-grained, 51% on fine-grained, and 73% on implicit/explicit classification. We analyze the strengths and weaknesses of the model and derive promising directions for future work.