Alejandro Sierra Múnera

Chair of Information Systems
Hasso Plattner Institute

Office: F.207
Tel.: 033155093401
Email: Alejandro.Sierra(at)hpi.de
Starting date: January 2021
Supervisor: Prof. Dr. Felix Naumann

Research Interests

My research focuses mainly on adapting natural language processing (NLP) techniques to domain-specific scenarios, where general-domain models fail to deliver satisfactory results or where the availability of labeled data is not guaranteed. In particular, I am interested in introducing domain-specific knowledge into NLP models in order to improve their performance on domain-specific tasks.

Additionally, I am interested in complementing NLP tasks with the analysis of other data sources, such as accompanying images. Specifically, I intend to combine the analysis of text and images in art-historic documents, where the pictorial representation of an artwork is accompanied by textual descriptions. In such documents, computer vision techniques like CNNs and NLP techniques like information extraction could benefit from each other to produce meaningful representations for downstream tasks such as information retrieval and knowledge graph construction.

Projects

NER for artwork titles

Artworks are an essential entity in the art domain, and artwork titles are the surface form used to mention these entities in art-historic documents. However, the nature of artwork titles makes their recognition a complex task: they can be ambiguous, they can contain mentions of other entities such as locations and persons, and they are often composed of tokens that, without enough context, could be categorized as other syntactic constructs. Take for example "Guernica" by Pablo Picasso: without the proper context, a mention of this artwork might be mistaken for a mention of the town, rather than of the artwork depicting the bombing that took place there.

Although deep learning models can improve performance on the NER task, they require large amounts of labeled data, which can be very expensive and time-consuming to obtain. Therefore, one of the approaches we are experimenting with is to adapt models and datasets from other domains in order to reduce the amount of labeled data needed to recognize artwork titles. This approach has previously been defined as Cross-domain Named Entity Recognition.

Teaching activities

As Teaching Assistant

SS2021

  • Seminar: Knowledge Graphs

As (Co)Advisor

SS2021

  • Master Project: Generating Art with GANs
  • Master Research Module: Distant Supervised Relation Extraction in the Domain of Art History