I've been working on RecommenderAPI, a general-purpose recommendation engine for Drupal, for a few years now. Meanwhile, I've been doing my graduate work in recommender systems, social computing, and machine learning. In this article, I'd like to discuss what to look forward to in the next major release of RecommenderAPI.
I've always wanted to build a cutting-edge recommender system for Drupal that is as good as what Amazon offers. Google Summer of Code 2009 gave me the first chance to attack this task, and I developed the Recommender API module and helper modules that provide recommendation services based on users' browsing history, fivestar ratings, product purchasing history, etc. After 2 years of real-world use, I have received a lot of user feedback about performance and scalability issues in the modules, which cannot be fixed under the current PHP implementation (see why here).
To solve the performance issue, I think the best option is to outsource the complex recommendation computation to Apache Mahout instead of using my own PHP implementation. I have submitted another GSoC application for 2011 to work on this. I hope it gets accepted so that I can get this done.
The second part of my GSoC 2011 application is to build a framework so that 3rd party programs, such as Apache Mahout, can easily exchange data with Drupal for data-intensive computing, such as computing recommendations. More details are discussed in my GSoC 2011 application. I hope this will encourage more innovation in data-intensive computation with Drupal using 3rd party scripts and programs.
If you like these ideas, please support my application at http://groups.drupal.org/node/137054.
Drupal rocks, and let's make it rock more :D
Our research group at the University of Michigan has been working on the "related modules" block for Drupal.org for more than 2 years now. We have published 2 papers on this project so far:
1) Assessment of Conversation Co-mentions as a Resource for Software Module Recommendation. To be presented at the ACM Recommender System Conference'09
2) Conversation Pivots and Double Pivots. Presented at the ACM Computer Human Interaction Conference'08
Thanks to the support of Google Summer of Code'09, I was able to develop the Recommender Bundle for Drupal, which includes the following modules:
http://drupal.org/project/media_rec (not developed)
My Google Summer of Code 2009 proposal was accepted. The basic idea is to develop at least three modules based on Recommender API. For example, one module would recommend Flash videos based on users' viewing history, as YouTube does. A mockup screenshot looks like this:
For more details and discussion, please go to http://groups.drupal.org/node/19894.
There's a big demand from the Drupal community to add fivestar-like ratings to the contrib modules. This would be a pretty cool feature, but it raises other concerns too.
Previously, 'related modules' were generated based on discussions in the d.o. forums: if several modules were mentioned in the same discussion threads, we considered them to be somewhat related. (A more detailed explanation of the algorithms can be found in my previous Planet Drupal blog posts.)
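To make the co-mention idea concrete, here is a minimal Python sketch. It is not the code that ran on d.o.; the function names, the sample threads, and the plain pair-counting score are all assumptions for illustration, and the real algorithms weight things differently.

```python
from collections import Counter
from itertools import combinations


def co_mention_counts(threads):
    """Count how many threads mention each pair of modules together."""
    counts = Counter()
    for modules in threads:
        # Deduplicate within a thread and sort so each pair has one canonical order.
        for a, b in combinations(sorted(set(modules)), 2):
            counts[(a, b)] += 1
    return counts


def related_modules(module, counts, top_n=5):
    """Return the modules most often co-mentioned with `module`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == module:
            scores[b] += n
        elif b == module:
            scores[a] += n
    return [name for name, _ in scores.most_common(top_n)]


# Hypothetical data: each inner list is the set of modules mentioned in one thread.
threads = [
    ["views", "cck", "panels"],
    ["views", "cck"],
    ["views", "cck"],
    ["fivestar", "votingapi"],
    ["views", "panels", "views"],
]
counts = co_mention_counts(threads)
print(related_modules("views", counts))
```

With this toy input, "cck" shares three threads with "views" and "panels" shares two, so "cck" is ranked first.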
From the last Google Analytics (GA) study on the usefulness of "pivots_block" with 4 recommendation algorithms, we learned that the classical "relevancy" algorithm generated the best results. Therefore, we used the relevancy algorithm on D.O. from Dec/4/2008 to Jan/9/2009. The average click-through rate over that period was 0.474%.
We developed 4 module recommendation algorithms and tested them on Drupal.org, using Google Analytics to track the click-through rates. The overall click-through rate was 0.263%; by algorithm, it was 0.097% for co-occurrences, 0.141% for relevance, 0.114% for recency, and 0.138% for uniqueness. The relevancy algorithm appeared to have the highest click-through rate, but it was only significantly higher than the co-occurrences algorithm.
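For readers wondering how two click-through rates can be compared for significance, here is a rough Python sketch using a two-proportion z-test. This is only one common way to do it and is not necessarily the test we used. The impression counts below are made up (the rates above are reported without raw counts), chosen only so that the rates match 0.141% and 0.097%.

```python
import math


def click_through_rate(clicks, impressions):
    """Fraction of impressions that led to a click."""
    return clicks / impressions


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for the difference between two click-through rates."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled rate under the null hypothesis that both algorithms perform the same.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 100,000 impressions per algorithm is an assumption.
relevance_clicks, relevance_views = 141, 100_000
cooccur_clicks, cooccur_views = 97, 100_000

z, p = two_proportion_z_test(
    relevance_clicks, relevance_views, cooccur_clicks, cooccur_views
)
print(f"relevance CTR:      {click_through_rate(relevance_clicks, relevance_views):.3%}")
print(f"co-occurrences CTR: {click_through_rate(cooccur_clicks, cooccur_views):.3%}")
print(f"z = {z:.2f}, p = {p:.4f}")
```

Whether a difference counts as significant depends heavily on the number of impressions, so the same rates could give a different answer with smaller samples.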