We offer four topics, all of which are described in detail in [4].
Link Analysis
Compute PageRank efficiently on a cluster (Chapter 5.2) and implement one extension that counters link spam: either TrustRank (5.4.4) or SpamMass (5.4.5).
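As a starting point, the following is a minimal single-machine sketch of PageRank by power iteration, not the cluster implementation the topic asks for. The function name `pagerank`, the parameters `beta`, `iterations`, and `teleport`, and the choice to hand all leaked rank (teleport share plus dead ends) to the teleport set are assumptions made for illustration. Restricting `teleport` to a set of trusted pages gives a TrustRank-style biased variant.

```python
def pagerank(links, beta=0.85, iterations=50, teleport=None):
    """Power-iteration PageRank on a small in-memory graph.

    links:    dict mapping each node to a list of its successors.
    teleport: optional set of nodes that receive the teleport mass
              (None means all nodes; a trusted subset gives a
              TrustRank-style bias). Hypothetical parameter names.
    """
    nodes = sorted(set(links) | {d for dests in links.values() for d in dests})
    targets = set(nodes) if teleport is None else set(teleport) & set(nodes)
    # Teleport distribution: uniform over the target set, zero elsewhere.
    base = {n: (1.0 / len(targets) if n in targets else 0.0) for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}

    for _ in range(iterations):
        new = {n: 0.0 for n in nodes}
        for src in nodes:
            dests = links.get(src, [])
            if dests:
                # Each page passes a beta fraction of its rank to its successors.
                share = beta * rank[src] / len(dests)
                for d in dests:
                    new[d] += share
        # Assumption: rank not passed along links (teleport share and
        # dead-end rank) is redistributed over the teleport set.
        leaked = 1.0 - sum(new.values())
        for n in nodes:
            new[n] += leaked * base[n]
        rank = new
    return rank
```

For example, in the graph `{"a": ["b"], "b": ["a"], "c": ["a"]}` the page `c` has no in-links, so it keeps only its teleport share under the default setting and drops to zero when `teleport={"a"}`. A cluster version would split the inner loop over link data into map and reduce steps.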
Clustering in Non-Euclidean Space
Clustering groups similar items according to a distance measure. Chapter 7.5 introduces clustering in non-Euclidean spaces, and 7.6.6 briefly outlines a parallel implementation.
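In a non-Euclidean space there is no meaningful average of points, so a cluster is usually represented by one of its members (a clustroid) rather than a centroid. The sketch below illustrates that idea with Jaccard distance on sets. The function names and the choice of "smallest sum of squared distances" as the clustroid criterion are assumptions for illustration, not the specific algorithm of Chapter 7.5.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def clustroid(points, distance=jaccard_distance):
    """Return the member of `points` with the smallest sum of squared
    distances to all members (one of several possible criteria)."""
    return min(points, key=lambda p: sum(distance(p, q) ** 2 for q in points))
```

For instance, `clustroid([{1, 2}, {1, 2, 3}, {1, 2, 3, 4}])` picks `{1, 2, 3}`, the set that sits "between" the other two. Any other distance function with the same two-argument signature, such as edit distance on strings, can be passed in via `distance`.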
Frequent Itemsets
Frequent itemsets (Chapter 6) are sets of items that frequently co-occur in a large data set, e.g., books that are regularly bought together on Amazon. The SON algorithm parallelizes well with MapReduce, as described in Chapter 6.4.4.
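The two passes of SON can be sketched on a single machine as follows. This is a simplified illustration, assuming brute-force counting of itemsets up to `max_size` inside each chunk. The names `son`, `local_frequent`, `num_chunks`, and `max_size` are hypothetical, and the two loops over chunks stand in for the two MapReduce jobs.

```python
from collections import Counter
from itertools import combinations


def local_frequent(chunk, threshold, max_size=2):
    """Itemsets (as sorted tuples) occurring at least `threshold` times in one chunk."""
    counts = Counter()
    for basket in chunk:
        items = sorted(set(basket))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return {itemset for itemset, c in counts.items() if c >= threshold}


def son(baskets, support, num_chunks=2, max_size=2):
    """Return {itemset: count} for itemsets with global count >= support."""
    if not baskets:
        return {}
    chunk_size = -(-len(baskets) // num_chunks)  # ceiling division
    chunks = [baskets[i:i + chunk_size] for i in range(0, len(baskets), chunk_size)]

    # Pass 1: each chunk proposes candidates using a proportionally
    # lowered threshold, so no globally frequent itemset is missed.
    candidates = set()
    for chunk in chunks:
        local_threshold = support * len(chunk) / len(baskets)
        candidates |= local_frequent(chunk, local_threshold, max_size)

    # Pass 2: count every candidate over all chunks and drop false positives.
    totals = Counter()
    for chunk in chunks:
        for basket in chunk:
            items = set(basket)
            for cand in candidates:
                if items.issuperset(cand):
                    totals[cand] += 1
    return {cand: n for cand, n in totals.items() if n >= support}
```

In a MapReduce setting, the first loop would correspond to mappers emitting local candidates with a reducer taking their union, and the second to mappers emitting partial counts with a reducer summing and filtering them.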
Collaborative Filtering
Collaborative filtering is a technique for recommending items to users based on a large knowledge base of previous user-item relations, e.g., purchases or ratings. Chapter 9 covers recommendation systems in general; one parallel implementation is parallel stochastic gradient descent.
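To make the stochastic gradient descent idea concrete, here is a minimal sequential sketch that factorizes a rating list into user and item factor vectors. It is not a parallel implementation, and all names and hyperparameters (`factorize`, `num_factors`, `lr`, `reg`, the toy ratings) are assumptions for illustration only.

```python
import random


def predict(p, q):
    """Predicted rating: dot product of a user vector and an item vector."""
    return sum(pf * qf for pf, qf in zip(p, q))


def rmse(ratings, P, Q):
    """Root-mean-square error of the factor model on (user, item, rating) triples."""
    total = sum((r - predict(P[u], Q[i])) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


def factorize(ratings, num_factors=2, epochs=200, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q by SGD on the known ratings."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.random() for _ in range(num_factors)] for u in users}
    Q = {i: [rng.random() for _ in range(num_factors)] for i in items}

    order = list(ratings)
    for _ in range(epochs):
        rng.shuffle(order)  # visit the ratings in random order each epoch
        for u, i, r in order:
            err = r - predict(P[u], Q[i])
            for f in range(num_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q
```

A possible usage: call `factorize` on triples such as `("u1", "i1", 5)` and compare `rmse` before and after training (with `epochs=0` returning the initial factors). Each update touches only one user vector and one item vector, which is what makes it possible to process ratings with disjoint users and items concurrently in a parallel variant.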