Ralf Krestel

TRAC 20b

Bagging BERT Models for Robust Aggression Identification

Abstract

Modern transformer-based models with hundreds of millions of parameters, such as BERT, achieve impressive results at text classification tasks. This also holds for aggression identification and offensive language detection, where deep learning approaches consistently outperform less complex models, such as decision trees. While the complex models fit training data well (low bias), they also come with an unwanted high variance. Especially when fine-tuning them on small datasets, the classification performance varies significantly for slightly different training data. To overcome the high variance and provide more robust predictions, we propose an ensemble of multiple fine-tuned BERT models based on bootstrap aggregating (bagging). In this paper, we describe such an ensemble system and present our submission to the shared tasks on aggression identification 2020 (team name: Julian). Our submission is the best-performing system for five out of six subtasks. For example, we achieve a weighted F1-score of 80.3% for task A on the test dataset of English social media posts. In our experiments, we compare different model configurations and vary the number of models used in the ensemble. We find that the F1-score drastically increases when ensembling up to 15 models, but the returns diminish for more models.
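
The bagging idea described in the abstract can be illustrated with a minimal sketch. It is not the system from the paper. A toy nearest-centroid classifier on a single numeric feature stands in for a fine-tuned BERT model, and the predictions are combined by majority vote, which is an assumption made for this illustration. All function names and labels below are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, str]       # (feature, label); stand-in for (post text, label)
Model = Callable[[float], str]    # maps a feature to a predicted label


def bootstrap_sample(data: Sequence[Example], rng: random.Random) -> List[Example]:
    """Draw len(data) examples uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]


def train_centroid_model(sample: Sequence[Example]) -> Model:
    """Toy base learner (assumption): predict the class with the nearest mean.

    In a real system this step would be replaced by fine-tuning one BERT model
    on the bootstrap sample.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in sample:
        sums[y] += x
        counts[y] += 1
    centroids = {y: sums[y] / counts[y] for y in counts}
    return lambda x: min(centroids, key=lambda y: abs(x - centroids[y]))


def train_bagged_ensemble(data: Sequence[Example], n_models: int, seed: int = 0) -> List[Model]:
    """Train n_models base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_centroid_model(bootstrap_sample(data, rng)) for _ in range(n_models)]


def predict_majority(models: Sequence[Model], x: float) -> str:
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    data = [(x, "non-aggressive") for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
    data += [(x, "aggressive") for x in (8.0, 8.5, 9.0, 9.5, 10.0)]
    ensemble = train_bagged_ensemble(data, n_models=15, seed=42)
    print(predict_majority(ensemble, 0.5), predict_majority(ensemble, 9.5))
```

Each base learner sees a slightly different resampling of the training data, so the errors of individual learners tend to differ, and aggregating their predictions reduces the variance of the final prediction.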

Full Paper

TRAC20b.pdf

Workshop Homepage

TRAC 2020

BibTeX Entry

@inproceedings{krestel-trac20b,
  title     = {Bagging BERT Models for Robust Aggression Identification},
  author    = {Risch, Julian and Krestel, Ralf},
  booktitle = {Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying (TRAC 2020). Workshop at LREC},
  location  = {Marseille, France},
  OPTmonth  = {May 16th},
  pages     = {55--61},
  year      = {2020}
}
