I’m currently in Koblenz at the yearly review meeting for the ROBUST research project I’m involved in. The project deals with analyzing large-scale business communities and is a perfect fit for applying scalable data mining techniques. I’m very glad that the project sees contributing to open source projects as an important dissemination activity and encourages the use of Apache projects such as Hadoop and Mahout.
My colleague Christoph and I are mainly contributing several graph mining algorithms implemented in MapReduce, as well as improvements to Mahout’s collaborative filtering implementation. We also plan to publish a paper about our findings in the coming weeks.
In the demo and poster session, we presented Mahout’s collaborative filtering capabilities.
We provided a short terminal demo called ‘Using Hadoop for Collaborative Filtering and Link Prediction’ where we demonstrated Mahout’s parallel item-based collaborative filtering algorithms:
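For readers new to the technique, here is a minimal single-machine sketch of item-based collaborative filtering. It is not Mahout's API and not the code from our demo: the function names, the toy ratings, and the choice of cosine similarity are all assumptions made for illustration. The idea is to compute how similar two items are from the users who rated both, then score an unseen item for a user as the similarity-weighted average of that user's own ratings.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """Cosine similarity between items (illustrative helper, not Mahout's API).

    ratings maps user -> {item: rating}; ratings are assumed to be positive.
    """
    dot = defaultdict(float)   # sum of rating products per co-rated item pair
    norm = defaultdict(float)  # sum of squared ratings per item
    for user_ratings in ratings.values():
        items = sorted(user_ratings)
        for i in items:
            norm[i] += user_ratings[i] ** 2
        for idx, i in enumerate(items):
            for j in items[idx + 1:]:
                dot[(i, j)] += user_ratings[i] * user_ratings[j]

    sims = defaultdict(dict)
    for (i, j), d in dot.items():
        s = d / (sqrt(norm[i]) * sqrt(norm[j]))
        sims[i][j] = s
        sims[j][i] = s
    return sims


def recommend(ratings, sims, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted average rating."""
    seen = ratings[user]
    candidates = {j for i in seen for j in sims.get(i, {}) if j not in seen}
    scores = {}
    for j in candidates:
        num = sum(sims[j][i] * r for i, r in seen.items() if i in sims[j])
        den = sum(abs(sims[j][i]) for i in seen if i in sims[j])
        if den > 0:
            scores[j] = num / den
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    # Toy data, made up for this sketch.
    example_ratings = {
        "alice": {"a": 5, "b": 3},
        "bob": {"a": 4, "b": 2, "c": 5},
        "carol": {"b": 4, "c": 4},
    }
    sims = item_similarities(example_ratings)
    print(recommend(example_ratings, sims, "alice"))
```

The pairwise accumulation in `item_similarities` is the expensive part on real data, since it touches every pair of items a user has rated. That is the kind of work a MapReduce implementation spreads across a cluster.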