De-Layering Social Networks by Shared Tastes of Friendships. Dietz, Laura; Gamari, Ben; Guiver, John; Snelson, Edward; Herbrich, Ralf (2012).
Traditionally, social network analyses are applied to data from a particular social domain. With the advent of online social networks such as Facebook, we observe an aggregate of various social domains, resulting in a layered mix of professional contacts, family ties, and different circles. These aggregates dilute the community structure. We provide a method for de-layering social networks according to shared interests. Instead of relying on changes in edge density, our shared taste model uses users' content to disambiguate the underlying shared interest of each friendship. We successfully de-layer real-world networks from LibraryThing and Boards.ie, obtaining topics that significantly outperform LDA on unsupervised prediction of group membership.
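Illustrative sketch only, in Python with assumed toy data (user word counts, a fixed topic-word matrix, and a friendship list, none of which come from the paper): it conveys the de-layering idea by assigning each friendship the topic that best explains the content of both endpoints, so different edges of the same user can land in different layers.

    # A minimal illustrative sketch (not the authors' model): assign each friendship
    # the topic that best explains the content of BOTH endpoints.
    import numpy as np

    rng = np.random.default_rng(0)

    n_users, n_topics, vocab = 6, 3, 20
    topics = rng.dirichlet(np.ones(vocab), size=n_topics)    # hypothetical topic-word distributions
    user_words = rng.integers(0, 5, size=(n_users, vocab))   # hypothetical per-user word counts
    edges = [(0, 1), (0, 2), (3, 4), (2, 5)]                 # hypothetical friendship list

    def topic_posterior(counts, topics, alpha=0.1):
        """Normalised score for p(topic | user's words) under a fixed mixture."""
        log_lik = counts @ np.log(topics.T)                  # per-topic log-likelihood of the words
        post = np.exp(log_lik - log_lik.max()) + alpha
        return post / post.sum()

    for u, v in edges:
        shared = topic_posterior(user_words[u], topics) * topic_posterior(user_words[v], topics)
        print(f"edge ({u},{v}) -> layer {int(np.argmax(shared))}")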
Transparent User Models for Personalization. El-Arini, Khalid; Paquet, Ulrich; Herbrich, Ralf; Van Gael, Jurgen; Agüera y Arcas, Blaise (2012). 678–686.
Personalization is a ubiquitous phenomenon in our daily online experience. While such technology is critical for helping us combat the overload of information we face, in many cases, we may not even realize that our results are being tailored to our personal tastes and preferences. Worse yet, when such a system makes a mistake, we have little recourse to correct it. In this work, we propose a framework for addressing this problem by developing a new user-interpretable feature set upon which to base personalized recommendations. These features, which we call badges, represent fundamental traits of users (e.g., "vegetarian" or "Apple fanboy") inferred by modeling the interplay between a user's behavior and self-reported identity. Specifically, we consider the microblogging site Twitter, where users provide short descriptions of themselves in their profiles, as well as perform actions such as tweeting and retweeting. Our approach is based on the insight that we can define badges using high-precision, low-recall rules (e.g., "Twitter profile contains the phrase 'Apple fanboy'"), and with enough data, generalize to other users by observing shared behavior. We develop a fully Bayesian, generative model that describes this interaction, while allowing us to avoid the pitfalls associated with having positive-only data. Experiments on real Twitter data demonstrate the effectiveness of our model at capturing rich and interpretable user traits that can be used to provide transparency for personalization.
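Illustrative sketch only, with invented users and behaviours (not the paper's Bayesian generative model): a badge is seeded with a high-precision profile rule and then generalized to other users through behaviour they share with the seeded users.

    # A minimal sketch of the "badge" idea under assumed data structures: seed a badge
    # with a high-precision profile rule, then score other users by shared behaviour.
    from collections import Counter

    # Hypothetical users: profile text plus a set of behaviours (hashtags used, accounts retweeted).
    users = {
        "u1": {"profile": "engineer and apple fanboy", "behaviour": {"#wwdc", "@macrumors", "#ios"}},
        "u2": {"profile": "coffee lover",              "behaviour": {"#wwdc", "@macrumors"}},
        "u3": {"profile": "runner, vegetarian",        "behaviour": {"#vegan", "#marathon"}},
    }

    def seed(badge_phrase):
        """High-precision, low-recall rule: the profile literally contains the phrase."""
        return {u for u, d in users.items() if badge_phrase in d["profile"]}

    def behaviour_scores(seeded):
        """Count how often each behaviour occurs among seeded users."""
        return Counter(b for u in seeded for b in users[u]["behaviour"])

    seeded = seed("apple fanboy")
    scores = behaviour_scores(seeded)
    for u, d in users.items():
        if u not in seeded:
            overlap = sum(scores[b] for b in d["behaviour"])
            print(u, "badge score:", overlap)   # u2 shares Apple-related behaviour, u3 does not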
A Bayesian Treatment of Social Links in Recommender Systems. Gartrell, Mike; Paquet, Ulrich; Herbrich, Ralf (2012).
Recommender systems are increasingly driving user experiences on the Internet. This personalization is often achieved through the factorization of a large but sparse observation matrix of user-item feedback signals. In instances where the user's social network is known, its inclusion can significantly improve recommendations for cold-start users. There are numerous ways in which the network can be incorporated into a probabilistic graphical model. We propose and investigate two ways of including a social network: either as a Markov Random Field that describes user similarity in the prior over user features, or as an explicit model that treats social links as observations. State-of-the-art performance is reported on the Flixster online social network dataset.
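Illustrative sketch only (assumed toy data and plain gradient updates rather than the paper's Bayesian inference): matrix factorization with an extra social term that pulls connected users' latent features together, in the spirit of the similarity-prior variant.

    # A minimal sketch: factorize observed ratings while encouraging friends to have
    # similar latent features (social smoothness term). Data below is hypothetical.
    import numpy as np

    rng = np.random.default_rng(1)
    n_users, n_items, k = 5, 6, 2
    ratings = [(0, 1, 4.0), (0, 2, 1.0), (1, 1, 5.0), (3, 4, 2.0)]   # (user, item, rating)
    social = [(0, 1), (2, 3)]                                        # friendship edges

    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    lam, beta, lr = 0.05, 0.5, 0.05

    for _ in range(200):
        for u, i, r in ratings:                       # squared error on observed ratings
            err = r - U[u] @ V[i]
            U[u] += lr * (err * V[i] - lam * U[u])
            V[i] += lr * (err * U[u] - lam * V[i])
        for u, v in social:                           # friends' features are pulled together
            diff = U[u] - U[v]
            U[u] -= lr * beta * diff
            U[v] += lr * beta * diff

    print("predicted rating for user 1, item 2:", U[1] @ V[2])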
Kernel Topic Models. Hennig, Philipp; Stern, David; Herbrich, Ralf; Graepel, Thore (2012). 511–519.
Latent Dirichlet Allocation models discrete data as a mixture of discrete distributions, using Dirichlet beliefs over the mixture weights. We study a variation of this concept, in which the documents' mixture weight beliefs are replaced with squashed Gaussian distributions. This allows documents to be associated with elements of a Hilbert space, admitting kernel topic models (KTM), modelling temporal, spatial, hierarchical, social and other structure between documents. The main challenge is efficient approximate inference on the latent Gaussian. We present an approximate algorithm cast around a Laplace approximation in a transformed basis. The KTM can also be interpreted as a type of Gaussian process latent variable model, or as a topic model conditional on document features, uncovering links between earlier work in these areas.
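Illustrative generative sketch only, under assumptions not taken from the paper (an RBF kernel over document timestamps and a softmax as the squashing function): per-topic document weights are drawn from a Gaussian process, so nearby documents receive similar topic proportions.

    # A minimal generative sketch of the kernel-topic-model idea: latent Gaussians over
    # documents (one GP draw per topic) are squashed into topic proportions.
    import numpy as np

    rng = np.random.default_rng(2)
    n_docs, n_topics, vocab, doc_len = 8, 3, 12, 50
    times = np.linspace(0.0, 1.0, n_docs)[:, None]                 # document feature: timestamp

    def rbf(x, y, ell=0.3):
        return np.exp(-0.5 * ((x - y.T) / ell) ** 2)

    K = rbf(times, times) + 1e-6 * np.eye(n_docs)
    eta = rng.multivariate_normal(np.zeros(n_docs), K, size=n_topics).T   # (n_docs, n_topics)
    theta = np.exp(eta) / np.exp(eta).sum(axis=1, keepdims=True)          # squashed topic proportions
    beta = rng.dirichlet(np.ones(vocab), size=n_topics)                   # topic-word distributions

    docs = [rng.multinomial(doc_len, theta[d] @ beta) for d in range(n_docs)]
    print("topic proportions of neighbouring documents:")
    print(np.round(theta[:3], 2))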
Distributed, Real-time Bayesian Learning in Online Services. Herbrich, Ralf (2012). 203–204.
The last ten years have seen a tremendous growth in Internet-based online services such as search, advertising, gaming and social networking. Today, it is important to analyze large collections of user interaction data as a first step in building predictive models for these services, as well as to learn these models in real time. One of the biggest challenges in this setting is scale: not only does the sheer scale of data necessitate parallel processing, but it also necessitates distributed models; with over 900 million active users at Facebook, any user-specific set of features in a linear or non-linear model yields a model larger than can be stored in a single system. In this talk, I will give a hands-on introduction to one of the most versatile tools for handling large collections of data with distributed probabilistic models: the sum-product algorithm for approximate message passing in factor graphs. I will discuss the application of this algorithm for the specific case of generalized linear models and outline the challenges of both approximate and distributed message passing, including an in-depth discussion of expectation propagation and Map-Reduce.
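Illustrative sketch only, not the talk's exact algorithm: approximate Gaussian message passing for a probit-link generalized linear model, where each weight keeps an independent Gaussian belief updated in place from one observation at a time; the per-weight state is just two numbers, which is what makes sharding the model across machines feasible.

    # A minimal sketch of approximate Gaussian message passing for a probit-link
    # generalized linear model with binary features (assumed setup, not the talk's code).
    import math

    class GaussianWeights:
        def __init__(self, n, prior_var=1.0, noise=1.0):
            self.mu = [0.0] * n          # per-weight posterior means
            self.var = [prior_var] * n   # per-weight posterior variances
            self.noise = noise

        def update(self, x, y):
            """x: list of active feature indices, y: label in {-1, +1}."""
            s2 = self.noise ** 2 + sum(self.var[i] for i in x)   # predictive variance
            s = math.sqrt(s2)
            t = y * sum(self.mu[i] for i in x) / s
            pdf = math.exp(-0.5 * t * t) / math.sqrt(2 * math.pi)
            cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2)))
            v = pdf / max(cdf, 1e-12)                            # truncated-Gaussian corrections
            w = v * (v + t)
            for i in x:                                          # per-weight Gaussian updates
                self.mu[i] += y * (self.var[i] / s) * v
                self.var[i] *= 1.0 - (self.var[i] / s2) * w

    model = GaussianWeights(n=4)
    model.update(x=[0, 2], y=+1)
    model.update(x=[1, 2], y=-1)
    print(model.mu, model.var)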