For centuries, the art world was an entirely analog domain. In recent years, however, considerable effort has gone into digitizing artworks in order to preserve them and make them available to a wider audience. Digitization also enables the application of machine learning techniques such as image tagging and captioning, which can enrich large amounts of digitized art with meta information.
Generated tags enable archivists and researchers to handle the data more efficiently, because they can filter a collection by categories such as portraits or photographs of architecture. Tagging is also a first step toward making digitally published artworks accessible to visually impaired people, who can get an idea of an image's visual content even when no captions (produced by a dedicated image captioning algorithm) are available.
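As a minimal sketch of the filtering use case described above, the snippet below filters a small collection of records by generated tags. The record IDs and tag names are hypothetical, chosen only for illustration:

```python
# Hypothetical archive records: each entry pairs an image ID with its
# automatically generated tags. IDs and tags here are illustrative only.
records = [
    {"id": "img-0001", "tags": {"portrait", "painting"}},
    {"id": "img-0002", "tags": {"architecture", "photograph"}},
    {"id": "img-0003", "tags": {"portrait", "photograph"}},
]

def filter_by_tags(records, required):
    """Return the records whose tag set contains all required tags."""
    required = set(required)
    return [r for r in records if required <= r["tags"]]

# Archivists could, for example, retrieve all portraits:
portraits = filter_by_tags(records, {"portrait"})
print([r["id"] for r in portraits])  # ['img-0001', 'img-0003']
```

In practice, such queries would run against a database index rather than an in-memory list, but the principle of category-based retrieval over generated tags is the same.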
Understanding and tagging images is already widely researched on natural images (e.g. ImageNet). However, data from the art domain differs both visually and semantically, especially in the case of paintings. Archival images can suffer from degradation over time, and artistic images often contain visual cues that differ from the typical objects depicted in natural photographs. This is especially true for more abstract artworks.
In addition, art datasets often lack the labels needed to apply supervised techniques.
In this seminar, we want to:
- tag and caption a real-life photograph archive provided by our project partner, the Wildenstein Plattner Institute,
- investigate how existing methods can be applied to our unlabeled data,
- study connections between image tagging and captioning, and
- investigate how we can bridge the domain gap between artistic and natural images.