Tuesday, January 08, 2008

Netflix, While I Am Waiting ... Part 3

Two weeks ago, I thought I had found a way to 'reduce' the training set for my Netflix run. I deleted all the sqlite database records for customers who voted more than 50 times a day. That cut the database from 100,480,507 records to 64,353,618 records, a reduction of about 36%. However, the predictions have not been as good as the ones generated from the original database, with a standard deviation of 0.6195. Wrong move.
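For anyone curious what that kind of pruning looks like, here is a minimal sketch in Python with the standard `sqlite3` module. It is not my actual script: the table name `ratings`, its columns, and the helper names `drop_heavy_raters` and `build_demo` are made up for illustration, and it reads "more than 50 times a day" as "more than 50 ratings on any single date", which is only one possible interpretation.

```python
import sqlite3


def drop_heavy_raters(conn, max_per_day=50):
    """Delete every rating by customers who exceed max_per_day ratings on any one date.

    Assumes a hypothetical table ratings(customer_id, movie_id, rating, rating_date).
    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM ratings
        WHERE customer_id IN (
            SELECT customer_id
            FROM ratings
            GROUP BY customer_id, rating_date
            HAVING COUNT(*) > ?
        )
        """,
        (max_per_day,),
    )
    conn.commit()
    return cur.rowcount


def build_demo():
    """Build a tiny in-memory database with one heavy rater and one normal rater."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ratings ("
        "customer_id INTEGER, movie_id INTEGER, rating INTEGER, rating_date TEXT)"
    )
    # Customer 1 rates 60 movies on a single day, plus one more on another day.
    rows = [(1, m, 3, "2005-01-01") for m in range(60)]
    rows.append((1, 100, 5, "2005-02-01"))
    # Customer 2 rates 10 movies spread over five days.
    rows += [(2, m, 4, "2005-01-0%d" % (m % 5 + 1)) for m in range(10)]
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo()
    deleted = drop_heavy_raters(conn)
    remaining = conn.execute("SELECT COUNT(*) FROM ratings").fetchone()[0]
    print("deleted", deleted, "remaining", remaining)
```

Note that this removes all of a heavy rater's records, not just the ones from the busy day. The size of the cut is easy to check by hand: 1 - 64,353,618 / 100,480,507 comes to roughly 0.36.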

I started my run on 6 Dec 2007, so it has been over a month now. The run is currently over 82% complete based on movie id, but just under 17% complete based on predictions. It took a month to complete 17% of the predictions; however, my 'buddy' database has grown to 1.1GB in size, and over 70% of the predictions are calculated from the 'buddy' information. This is a very promising sign. I hope to submit my first run by early next month. Keeping my fingers crossed.
