DMC Review

DATA MINING CUP 2011

DMC Competition 2011

Recommendation algorithms with maximum prediction quality

Scenario
Recommendation engines (REs) are increasingly used in e-commerce for product recommendations. Recommendation algorithms calculate and automatically recommend products on the basis of the product detail views opened by visitors to web shops. The aim is to maximise the user's activity (number of views opened) and the shop's success (sales, turnover). Developing powerful algorithms for REs is currently one of the most popular research areas in data mining.
In this scenario, the operator of a web shop would like to use a recommendation engine that maximises both activity and success, with success weighted higher. The task is therefore to select the best algorithm. Three types of transaction are considered for each web session: opening a product detail view, placing a product in the shopping basket and purchasing a product. A session typically runs as follows: the user browses the web shop, opening product detail views along the way. If the user likes a product, he places it directly in his shopping basket. At the end of the session, the user can click on his basket and order the products he is interested in.
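
The three transaction types and the higher weighting of success can be sketched as a small data model. This is only an illustration: the names `Kind`, `Transaction` and `session_value`, and the numeric weights, are assumptions. The text states only that success counts more than activity, not by how much.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    VIEW = "view"          # product detail view opened
    BASKET = "basket"      # product placed in the shopping basket
    PURCHASE = "purchase"  # product ordered


@dataclass(frozen=True)
class Transaction:
    session_id: int
    product_id: int
    kind: Kind


# Hypothetical weights: the source gives no numbers, only the ordering
# "success is weighted higher than activity".
WEIGHTS = {Kind.VIEW: 1.0, Kind.BASKET: 2.0, Kind.PURCHASE: 3.0}


def session_value(transactions):
    """Sum the assumed weights over all transactions of one session."""
    return sum(WEIGHTS[t.kind] for t in transactions)


# A typical session: two views, then one product is put in the basket and bought.
session = [
    Transaction(1, 101, Kind.VIEW),
    Transaction(1, 102, Kind.VIEW),
    Transaction(1, 102, Kind.BASKET),
    Transaction(1, 102, Kind.PURCHASE),
]
value = session_value(session)
print(value)  # 1 + 1 + 2 + 3 = 7.0
```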

Tasks
The DMC 2011 competition consists of two tasks. These are assessed independently of each other.
The first task involves statically analysing an algorithm, i.e. training the algorithm on historical transaction data (the training data). To evaluate the prediction quality of the recommendations, the first transactions of each session in a test set are given (the test data). The objective of the algorithm is to predict the remaining transactions of each session. The generated prediction file is sent to the prudsys DMC team. The predicted products are then compared against the actual remaining transactions of the sessions (the evaluation data). The team with the highest score on the evaluation data wins.
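
A minimal sketch of this static setup is shown here, assuming a simple co-occurrence recommender and a hit-count score. Neither is the official DMC method or scoring rule, which the text does not specify. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def train(sessions):
    """Count how often product b occurs in the same session as product a."""
    co = defaultdict(Counter)
    for products in sessions:
        for a, b in permutations(set(products), 2):
            co[a][b] += 1
    return co


def predict(co, prefix, k=2):
    """Recommend up to k products that co-occur most often with the prefix."""
    votes = Counter()
    for p in prefix:
        votes.update(co.get(p, {}))
    for p in prefix:
        votes.pop(p, None)  # do not recommend products already seen
    return [p for p, _ in votes.most_common(k)]


def score(predicted, remaining):
    """Assumed score: number of predicted products found in the held-out rest."""
    return len(set(predicted) & set(remaining))


# Toy training data: each inner list is the product sequence of one session.
training = [[1, 2], [1, 2], [1, 2, 3], [1, 3], [1, 4]]
model = train(training)

# Test data: the first transactions of a session (the prefix) are given;
# the rest of the session is withheld as evaluation data.
prefix, remaining = [1], [3, 5]
predicted = predict(model, prefix, k=2)
print(predicted, score(predicted, remaining))
```

In this toy run, product 1 co-occurs three times with product 2 and twice with product 3, so the two recommendations are products 2 and 3, one of which appears in the withheld remainder.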

The second task involves dynamically evaluating an algorithm, the implementation of which is sent to the prudsys DMC team. The objective is to apply the algorithm step by step to historical transaction data and continuously predict the next products in a session.
As the algorithm receives the transactions of each session one after the other, it learns and predicts at the same time. The team with the highest score over all prediction steps wins.
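
The predict-then-learn loop can be sketched as follows. The class `OnlineRecommender`, its `predict` and `learn` methods, and the hit-count score are assumptions made for illustration. The actual interface required by the prudsys DMC team is not described in the text.

```python
from collections import Counter, defaultdict


class OnlineRecommender:
    """Hypothetical model: counts which product followed which."""

    def __init__(self):
        self.next_counts = defaultdict(Counter)

    def predict(self, current, k=1):
        """Return up to k products most often seen right after `current`."""
        return [p for p, _ in self.next_counts[current].most_common(k)]

    def learn(self, current, following):
        """Update the counts with one observed transition."""
        self.next_counts[current][following] += 1


def run(sessions, k=1):
    """Replay sessions step by step: predict first, then learn from the truth."""
    model = OnlineRecommender()
    hits = 0
    for products in sessions:
        for current, following in zip(products, products[1:]):
            if following in model.predict(current, k):
                hits += 1  # assumed score: one point per correct step
            model.learn(current, following)
    return hits


# Toy stream of three sessions, each a sequence of product ids.
stream = [[1, 2, 3], [1, 2, 3], [1, 2, 4]]
total_hits = run(stream)
print(total_hits)  # 3
```

The model knows nothing during the first session, so it scores no hits there. It predicts both steps of the second session correctly and the first step of the third, giving 3 hits over 6 prediction steps.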

Winners (Task 1) from Technical University Dortmund

The best Data Mining youngsters 2011

Task 1

1st place: team TU_Dortmund_1
2nd place: team Uni_San Diego_1
3rd place: team Uni_Potsdam_1

Winners (Task 2) from Karlsruhe Institute of Technology

Task 2

Winner: team Inst_Karlsruhe_1

Download task and data

Title                       File size   Format
DMC 2011 Task               26673259    zip
DMC 2011 Task1 Realclass    658257      zip
DMC 2011 Task2 Realclass    2607502     zip