The Art of Tagging: Measuring the Quality of Tags
Collaborative tagging, supported by many social networking
websites, is enjoying increasing popularity. The
usefulness of this widely available tag data has been explored in
many applications, including web resource categorization, deriving
emergent semantics, and web search. However, since tags
are supplied by users freely, not all of them are useful
and reliable, especially when they are generated by spammers with
malicious intent. Therefore, identifying tags of high quality is
crucial in improving the performance of applications based on
tags. In this paper, we propose TRP-Rank (Tag-Resource Pair Rank), an
algorithm to measure the quality of tags by manually assessing a seed set and
propagating the quality through a graph. The
three-dimensional relationship among users, tags and web resources is
first represented by a graph structure. A set of seed nodes,
where each node represents a tag annotating a resource, is then
selected and their quality is assessed. The quality of the
remaining nodes is calculated by propagating the known quality of
the seeds through the graph structure. We evaluate our approach on
a public data set in which tags generated by suspected spammers
were manually labelled. The experimental results demonstrate the
effectiveness of this approach in measuring the quality of tags.
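
The seed-and-propagate idea described above can be illustrated with a minimal sketch. It is not the TRP-Rank algorithm itself: the abstract does not specify how edges are defined or how scores are updated, so the edge rule (two tag-resource pairs are linked when the same user posted both), the biased-averaging update (in the spirit of personalized PageRank), and the names build_pair_graph, propagate_quality, damping and iterations are all assumptions made for illustration.

```python
from collections import defaultdict


def build_pair_graph(annotations):
    """Build a graph whose nodes are (tag, resource) pairs.

    ASSUMPTION: two pairs are linked when the same user posted both.
    The abstract does not state the edge rule actually used.
    annotations: iterable of (user, tag, resource) triples.
    """
    pairs_by_user = defaultdict(set)
    for user, tag, resource in annotations:
        pairs_by_user[user].add((tag, resource))

    graph = defaultdict(set)
    for pairs in pairs_by_user.values():
        for pair in pairs:
            graph[pair]  # make sure isolated pairs still appear as nodes
            graph[pair].update(other for other in pairs if other != pair)
    return dict(graph)


def propagate_quality(graph, seed_scores, damping=0.85, iterations=50):
    """Spread seed quality scores over the graph.

    ASSUMPTION: each node's score is a blend of its own seed score
    (0.0 when the node is not a seed) and the mean score of its neighbours.
    """
    bias = {node: seed_scores.get(node, 0.0) for node in graph}
    score = dict(bias)
    for _ in range(iterations):
        updated = {}
        for node, neighbours in graph.items():
            if neighbours:
                mean = sum(score[m] for m in neighbours) / len(neighbours)
            else:
                mean = 0.0
            updated[node] = (1 - damping) * bias[node] + damping * mean
        score = updated
    return score


# Hypothetical toy data: one ordinary user and one suspected spammer.
annotations = [
    ("alice", "python", "r1"),
    ("alice", "code", "r2"),
    ("spam42", "cheap", "r1"),
    ("spam42", "cheap", "r2"),
]
seeds = {("python", "r1"): 1.0, ("cheap", "r1"): 0.0}  # manually assessed
scores = propagate_quality(build_pair_graph(annotations), seeds)
```

In this toy run the unassessed pair ("code", "r2") inherits a positive score from the trusted seed posted by the same user, while ("cheap", "r2") stays at zero because its only neighbour is a seed judged to be of low quality.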
ASWC 2008