Feeds 2.0 and the Netflix Prize

The Netflix Prize competition

In October 2006, Netflix released a large movie rating dataset and challenged the data mining, machine learning and computer science communities to develop systems that could beat the accuracy of its in-house recommendation system (Cinematch) by 10%. To make the challenge more interesting, the company offered a Grand Prize of $1M to the first team to attain this goal. In addition, Progress Prizes of $50K have been awarded on the anniversaries of the Prize to teams that made sufficient accuracy improvements.

Apart from the financial incentive, the Netflix Prize contest is enormously useful for recommender system research, since the released Netflix dataset is by far the largest ratings dataset ever made available to the research community. Until now, most work on recommender systems outside of companies like Amazon or Netflix has had to make do with the relatively small 1M-rating MovieLens dataset or the 3M-rating EachMovie dataset. Netflix provided 100,480,507 ratings (on a scale from 1 to 5 integral stars), along with their dates, from 480,189 randomly chosen, anonymous subscribers on 17,770 movie titles. The data were collected between October 1998 and December 2005 and reflect the distribution of all ratings received by Netflix during this period. Netflix withheld over 2M of the most recent ratings from those same subscribers over the same set of movies as a competition qualifying set, and contestants had to make predictions for all 2M withheld ratings in the qualifying set.
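The 10% accuracy target can be made concrete with a little arithmetic. The contest scored submissions by root mean squared error (RMSE) between predicted and actual ratings, so a 10% improvement means an RMSE 10% lower than the baseline system's. The sketch below is only an illustration: the function names `rmse` and `relative_improvement` and the sample ratings are hypothetical and are not taken from the contest or from the Feeds2 code.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(sum(squared_errors) / len(actual))


def relative_improvement(baseline_rmse, candidate_rmse):
    """Fraction by which the candidate RMSE is lower than the baseline RMSE."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


if __name__ == "__main__":
    # Made-up ratings on the 1-5 star scale, for illustration only.
    actual = [4, 3, 5, 2, 4]
    baseline = [3.5, 3.5, 4.0, 3.0, 3.5]
    candidate = [3.8, 3.2, 4.5, 2.5, 3.9]

    base_err = rmse(baseline, actual)
    cand_err = rmse(candidate, actual)
    gain = relative_improvement(base_err, cand_err)
    print(f"baseline RMSE={base_err:.4f} candidate RMSE={cand_err:.4f}")
    print(f"improvement={gain:.1%}, meets 10% target: {gain >= 0.10}")
```

Because the improvement is relative to the baseline's error, each further fraction of a percent becomes harder to gain as the error shrinks.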

The Feeds2 team

The Feeds2 team has participated in the Netflix Prize competition practically since its launch in late 2006. The name of the team is derived from the Feeds 2.0 personalized RSS aggregator service developed by the team members. Feeds 2.0 is a news and blog reading service which prioritizes and recommends incoming news according to the user's interests. The Netflix Prize competition provided an excellent opportunity, as well as a challenge, to test the efficiency and scalability of the algorithms developed for the Feeds 2.0 service.

Feeds2 Team Members

Feeds2 Approach

The Feeds2 team has tackled the problem by applying computational intelligence and data mining methods. These include hundreds of models based on Neural Networks, Matrix Factorization, Singular Value Decomposition, Concept Decomposition, Restricted Boltzmann Machines, Bayesian Models, K-Nearest Neighbors, Time Dependent Models, and others. The predictions of these models are then combined using linear and non-linear machine learning ensemble methods.
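As an illustration of the linear case, the sketch below fits least-squares blending weights for a handful of models on held-out ratings and applies them. It is a minimal example under stated assumptions, not the Feeds2 team's actual ensemble: the function names `fit_blend_weights` and `blend` and the sample predictions are hypothetical, and the normal equations are solved with plain Gaussian elimination to keep the code free of dependencies.

```python
def fit_blend_weights(predictions, targets):
    """Least-squares weights for a linear blend of model predictions.

    predictions: one list of predicted ratings per model, each as long as targets.
    Solves the normal equations (P P^T) w = P y by Gaussian elimination.
    Assumes the models' predictions are linearly independent.
    """
    k = len(predictions)
    a = [[sum(x * y for x, y in zip(predictions[i], predictions[j]))
          for j in range(k)] for i in range(k)]
    b = [sum(x * t for x, t in zip(predictions[i], targets)) for i in range(k)]

    # Forward elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for c in range(col, k):
                a[row][c] -= factor * a[col][c]
            b[row] -= factor * b[col]

    # Back substitution.
    weights = [0.0] * k
    for row in range(k - 1, -1, -1):
        known = sum(a[row][c] * weights[c] for c in range(row + 1, k))
        weights[row] = (b[row] - known) / a[row][row]
    return weights


def blend(predictions, weights):
    """Weighted sum of the models' predictions, one value per rating."""
    n = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions)) for i in range(n)]


if __name__ == "__main__":
    # Made-up held-out ratings and two models with opposite errors.
    targets = [4.0, 3.0, 5.0, 2.0]
    model_a = [4.5, 2.5, 5.5, 1.5]
    model_b = [3.5, 3.5, 4.5, 2.5]
    weights = fit_blend_weights([model_a, model_b], targets)
    print("weights:", weights)
    print("blended:", blend([model_a, model_b], weights))
```

The weights should be fitted on ratings that the individual models were not trained on; otherwise the blend tends to favor whichever model overfits the most.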

Netflix Prize Status

The contest has now closed.