DC 09
Tag Recommendation using Probabilistic Topic Models
Abstract
Tagging systems have become major infrastructures on the Web. They allow users
to create tags that annotate and categorize content and to share them with other
users, which is particularly helpful for searching multimedia content. However, as
tagging is not constrained by a controlled vocabulary or annotation
guidelines, tags tend to be noisy and sparse. New resources in particular,
annotated by only a few users, often have rather idiosyncratic tags that do not
reflect a common perspective useful for search. In this paper we introduce an
approach based on Latent Dirichlet Allocation (LDA) for recommending tags for
resources. Resources annotated by many users and thus equipped with a fairly
stable and complete tag set are used to elicit latent topics represented as a
mixture of description tokens and tags. New resources are then mapped
to these latent topics according to their content, and the most likely
tags from those topics are recommended. We evaluate recall and precision on the BibSonomy
benchmark provided within the ECML-PKDD Discovery Challenge 2009.
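
The recommendation step described in the abstract can be illustrated with a minimal sketch. It is not the implementation from the paper: the topics below are hand-written toy values standing in for distributions that LDA would estimate from well-annotated resources, the tag vocabulary is assumed, and the topic mixture of a new resource is estimated with a simple EM fold-in (topics held fixed) instead of full LDA inference. The names TOPICS, TAGS, infer_topic_mixture and recommend_tags are hypothetical.

```python
from collections import Counter

# Hypothetical toy topics: each maps a term (description token or tag)
# to p(term | topic). Real values would come from a trained LDA model.
TOPICS = [
    {"python": 0.4, "code": 0.3, "programming": 0.2, "web": 0.1},
    {"music": 0.5, "video": 0.3, "web": 0.2},
]

# Assumed tag vocabulary: the subset of terms that may be recommended.
TAGS = {"programming", "web", "music", "video"}


def infer_topic_mixture(tokens, topics, iterations=50):
    """Estimate p(topic | resource) with the topics held fixed (EM fold-in)."""
    # Keep only tokens that at least one topic knows about.
    counts = Counter(t for t in tokens if any(t in topic for topic in topics))
    k = len(topics)
    theta = [1.0 / k] * k  # start from a uniform mixture
    for _ in range(iterations):
        new_theta = [0.0] * k
        for term, n in counts.items():
            # Responsibility of each topic for this term.
            weights = [theta[z] * topics[z].get(term, 0.0) for z in range(k)]
            norm = sum(weights)
            if norm == 0.0:
                continue
            for z in range(k):
                new_theta[z] += n * weights[z] / norm
        total = sum(new_theta)
        if total == 0.0:
            break  # no usable evidence; keep the current mixture
        theta = [value / total for value in new_theta]
    return theta


def recommend_tags(tokens, topics=TOPICS, tags=TAGS, n=5):
    """Rank tags by p(tag | resource) = sum_z p(tag | z) * p(z | resource)."""
    theta = infer_topic_mixture(tokens, topics)
    scores = {
        tag: sum(theta[z] * topics[z].get(tag, 0.0) for z in range(len(topics)))
        for tag in tags
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return ranked[:n]


if __name__ == "__main__":
    print(recommend_tags(["python", "code"], n=2))
```

The design choice worth noting is that tags and description tokens share the same topics, so a new resource with no tags at all can still be scored purely from its content tokens.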
Full Paper
dc09.pdf
BibTeX Entry