Planet Drupal

Recommender Module Performance Enhancement & Drupal for Data-intensive Computing

I've always wanted to build a cutting-edge recommender system for Drupal, as good as what Amazon offers. Google Summer of Code 2009 gave me the first chance to attack this task, and I developed the Recommender API module and helper modules that provide recommendation services based on users' browsing history, fivestar ratings, product purchasing history, etc. After 2 years of real-world use, I have received a lot of user feedback about performance and scalability issues in the modules, which cannot be fixed under the current PHP implementation (see why here).

To solve the performance issue, I think the best option is to outsource the complex recommendation computation to Apache Mahout instead of using my own PHP implementation. I have submitted another GSoC application for 2011 to work on this. I hope it gets accepted so that I can get this done.

The second part of my GSoC 2011 application is to build a framework so that 3rd party programs, such as Apache Mahout, can easily exchange data with Drupal for data-intensive computing, such as computing recommendations. More details are discussed in my GSoC 2011 application. I hope this will facilitate more innovation in data-intensive computation with Drupal using 3rd party scripts and programs.
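To make the data exchange idea concrete, here is a minimal sketch (in Python, not the framework itself) of exporting ratings as comma-separated `userID,itemID,preference` lines, a format that Mahout's file-based data model can read. The function name `export_preferences` and the sample rows are made up for illustration; a real export would read the rows from the Drupal database.

```python
import csv
import io


def export_preferences(rows, out):
    """Write (user_id, item_id, rating) rows as 'userID,itemID,preference' lines.

    Hypothetical helper: the name and the row layout are assumptions,
    not part of the Recommender API module.
    """
    writer = csv.writer(out, lineterminator="\n")
    for user_id, item_id, rating in rows:
        # Numeric IDs and a float preference keep the file easy to parse.
        writer.writerow([int(user_id), int(item_id), float(rating)])


# Made-up sample data standing in for fivestar ratings pulled from Drupal.
rows = [(1, 101, 5), (1, 102, 3), (2, 101, 4)]
buffer = io.StringIO()
export_preferences(rows, buffer)
exported = buffer.getvalue()
print(exported)
```

The external program would then compute recommendations from this file and write its results back in a similar flat format for Drupal to import.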

If you like these ideas, please support my application at http://groups.drupal.org/node/137054.

Drupal rocks, and let's make it rock more :D

Roadmap for the Recommender API module and helper modules

The Recommender API module and a few helper modules were released in 2009 as a result of my Google Summer of Code 2009 project for Drupal. Thanks to the modules' users, I have received a lot of useful feedback and suggestions over more than a year of use.

Below is the roadmap for the next release of the Recommender API module, which will be completely rewritten.

  1. Outsource the recommendation computation to Apache Mahout. This is to break the PHP performance bottleneck when doing complex matrix calculations. See more details at Issue #816112 and Issue #414570.

  2. Add Views support so that there are more customized ways to show the recommendations. See more details at Issue #673786.

  3. Support Drupal 7. See more details at Issue #910258.

"Related modules" block for Drupal.org -- Past and Future

Our research group at the University of Michigan has been working on the "related modules" block for Drupal.org for more than 2 years now. We have published 2 papers on this project so far:

1) Assessment of Conversation Co-mentions as a Resource for Software Module Recommendation. Will be presented at ACM Recommender System Conference'09

2) Conversation Pivots and Double Pivots. Presented at ACM Computer Human Interaction Conference'08

Announcing my GSoC 2009 project -- Making Drupal Smart: The Recommender Bundle

My Google Summer of Code 2009 proposal was accepted. The basic idea is to develop at least three modules based on Recommender API. For example, one module is to recommend Flash videos based on users' viewing history like in YouTube. A mockup screenshot is like this:

For more details and discussion, please go to http://groups.drupal.org/node/19894.

Gaming recommender systems for fun and profit

There's a big demand from the Drupal community to add fivestar-like ratings to the contrib modules. This would be a pretty cool feature, but it raises other concerns too.

Announcing the "Recommender API" module

From the experience of developing the "pivots" Drupal module recommendation system, I developed the general purpose Recommender API module. It was released today.

"Related module" recommendations based on project_usage

Previously, 'related modules' were generated based on discussions in the d.o. forums: if several modules were mentioned in the same discussion threads, we considered them to be somewhat related. (A more detailed explanation of the algorithms can be found in my previous Planet Drupal blog posts.)
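The co-mention idea can be sketched roughly as follows. This is a simplified illustration in Python, not the actual pivots code: the function names, the sample threads, and the ranking by raw count are all assumptions.

```python
from collections import Counter
from itertools import combinations


def co_mention_counts(threads):
    """Count how many threads mention each unordered pair of modules."""
    counts = Counter()
    for modules in threads:
        # Deduplicate within a thread so each pair counts once per thread.
        for a, b in combinations(sorted(set(modules)), 2):
            counts[(a, b)] += 1
    return counts


def related_modules(counts, module, top_n=5):
    """Rank other modules by how often they share a thread with `module`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == module:
            scores[b] += n
        elif b == module:
            scores[a] += n
    # Sort by count (descending), then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Made-up threads; each list holds the modules mentioned in one discussion.
threads = [
    ["views", "cck", "panels"],
    ["views", "cck"],
    ["fivestar", "votingapi"],
    ["views", "panels", "views"],
]
counts = co_mention_counts(threads)
print(related_modules(counts, "views"))
```

Raw counts favor modules that are mentioned everywhere, so a real system would likely normalize by how popular each module is.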

Recent GA results for "pivots_block" module recommendation system

From the last Google Analytics (GA) study, which compared 4 recommendation algorithms in "pivots_block", we learned that the classical "relevancy" algorithm generated the best results. Therefore, we used the relevancy algorithm on D.O. from Dec/4/2008 to Jan/9/2009. The average click-through rate was 0.474%.

Pivots module recommendation system Google Analytics results

We developed 4 module recommendation algorithms, tested them on Drupal.org, and tracked click-through rates with Google Analytics. The overall click-through rate was 0.263%; by algorithm, it was 0.097% for co-occurrences, 0.141% for relevancy, 0.114% for recency, and 0.138% for uniqueness. The relevancy algorithm appeared to have the highest click-through rate, but it was significantly higher only than the co-occurrences algorithm.
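For readers wondering how two click-through rates might be compared, one common approach is a two-proportion z-test, sketched below in Python. This is an assumption about method, not necessarily the test used in the study, and the impression counts are invented: they are chosen only so that the rates match the reported 0.141% and 0.097%.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both rates are equal.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# HYPOTHETICAL counts: 100,000 impressions per algorithm is an assumption,
# not a figure from the study.
z, p_value = two_proportion_z_test(141, 100_000, 97, 100_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With click-through rates this low, whether a difference is significant depends heavily on the number of impressions, which is why the raw counts matter as much as the percentages.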
