Saturday, May 31, 2008

Summer

I have been working for Microsoft for almost a month now. So far it has been a great experience. The company has a ton of amazing benefits and it has been really interesting learning about their software development practices. I am working on a product called Office Live Workspace. There have been a lot of fun intern events such as a poker night, video games, movies, and a trip to the zoo. I have also attended a few research/tech talks on campus. We get all the free drinks we want here and also occasional free food (if you know where to look).

There are a lot of things to see and do around Seattle. A few weeks ago I went to the Pacific Science Center. I think it's better than the one in Vancouver. I saw Speed Racer at IMAX. It was pretty disappointing, but I wasn't too surprised given its score on IMDb. Yesterday I went to see a Cirque du Soleil show called Corteo. It was an excellent production, but I liked the Mystère show I saw in Las Vegas better.


I got a new bike earlier this month. There are supposed to be some really nice bike trails in this area, so I plan on exploring a few of them over the summer.

Here is a photo album I will be uploading pictures to over the summer.

Netflix Prize

One of my summer projects will be working on the Netflix Prize. It is a competition to write a program that predicts user ratings of movies. We are provided with a huge dataset of actual user ratings from the Netflix database. We are also provided with a test set of <user, movie> tuples for which we need to predict ratings. After we submit the predictions, Netflix returns the root mean squared error (RMSE) for a subset of the test set. Netflix already has the actual ratings for the test set, which is how they score the predictions. The three submissions I have made so far have gotten the following RMSE scores:

1.0533
0.9992
0.9844

Netflix's own algorithm (Cinematch) gets an RMSE of 0.9525. To win the competition and the one-million-dollar prize, a team must make a submission with an RMSE below 0.8572. The best team currently has a score of 0.8643. The three submissions I have made so far just use basic statistics for the predictions. I have three main ideas on how to approach the problem: two of them involve clustering algorithms and one of them uses temporal neural networks.
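For reference, RMSE is the square root of the average squared difference between predicted and actual ratings. The sketch below computes it in Python and pairs it with one possible "basic statistics" predictor: each movie's mean rating, falling back to the global mean for unseen movies. The function names and the tiny ratings list are made up for illustration, and the choice of predictor is an assumption rather than a description of the actual submissions.

```python
import math
from collections import defaultdict


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_movie_means(ratings):
    """Return (per-movie mean rating, global mean) from (user, movie, rating) triples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, movie, rating in ratings:
        totals[movie] += rating
        counts[movie] += 1
    movie_means = {m: totals[m] / counts[m] for m in totals}
    global_mean = sum(totals.values()) / sum(counts.values())
    return movie_means, global_mean


def predict(movie_means, global_mean, movie):
    """Predict the movie's mean rating, or the global mean for an unseen movie."""
    return movie_means.get(movie, global_mean)


# Hypothetical toy data: (user, movie, rating) triples.
train = [(1, "A", 5), (2, "A", 3), (1, "B", 2), (3, "B", 4), (2, "C", 1)]
test_pairs = [(3, "A"), (2, "B"), (1, "D")]  # <user, movie> tuples to predict
test_actual = [5, 3, 2]                      # in the contest, only the scorer has these

movie_means, global_mean = fit_movie_means(train)
predictions = [predict(movie_means, global_mean, m) for _u, m in test_pairs]
print(predictions, rmse(predictions, test_actual))
```

Because the errors are squared before averaging, a few badly wrong predictions hurt the score more than many slightly wrong ones.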

Sunday, May 11, 2008

Temporal Patterns

I have finished implementing a neural network framework for detecting temporal patterns. I made a small demo of it using a single-bit input. In theory the framework should scale well to much larger problems with many inputs. It is capable of detecting both spatial and temporal patterns and predicting what the next input will be. The framework can be applied to a wide variety of problems. I tried testing it on part-of-speech tagging, but the results were disappointing. Because of the huge number of neurons/inputs I had to use, the network took several seconds to process each input. Since the network needs to be trained on huge corpora consisting of thousands of words, it would take a really long time before it could make accurate predictions. A good property of neural networks is that they are highly parallelizable: all of the neurons can process information at the same time. On a single CPU, my computer processes information for one neuron at a time. Given the right hardware, with the neurons running in parallel, a neural network can have a constant upper bound on computational time. This means that an arbitrary number of neurons/inputs could be added to the network without affecting the processing time.
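The parallelism argument can be made concrete with a small sketch. In a layer like the one below, each neuron's output depends only on the shared input vector and its own weights, so the neurons can be evaluated in any order, or all at once on parallel hardware, and give the same result. The layer shape, the sigmoid activation, and all names here are assumptions for illustration, not details of the framework described above.

```python
import math
import random


def neuron_output(weights, bias, inputs):
    """Sigmoid of the weighted sum of the inputs for a single neuron."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))


def layer_sequential(layer, inputs):
    """Evaluate neurons one at a time, in index order."""
    return [neuron_output(w, b, inputs) for w, b in layer]


def layer_any_order(layer, inputs, order):
    """Evaluate neurons in an arbitrary order; no neuron reads another's output."""
    outputs = [None] * len(layer)
    for i in order:
        weights, bias = layer[i]
        outputs[i] = neuron_output(weights, bias, inputs)
    return outputs


rng = random.Random(0)
inputs = [1.0, 0.0, 1.0]
# Hypothetical layer of 5 neurons, each a (weights, bias) pair.
layer = [([rng.uniform(-1, 1) for _ in inputs], rng.uniform(-1, 1)) for _ in range(5)]

order = list(range(len(layer)))
rng.shuffle(order)
same = layer_sequential(layer, inputs) == layer_any_order(layer, inputs, order)
print(same)
```

With one processing unit per neuron, the time for a layer is roughly the time of a single `neuron_output` call regardless of how many neurons the layer has. The weighted sum inside each neuron still grows with the number of inputs unless that sum is parallelized too.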

Friday, May 09, 2008

First Publication!

A paper I have been working on about AIspace has been accepted for publication at the AAAI 2008 AI Education Colloquium. I will be headed to Chicago on July 13th!

Knoll, B., Kisyński, J., Carenini, G., Conati, C., Mackworth, A., Poole, D. 2008. AIspace: Interactive Tools for Learning Artificial Intelligence. In Proceedings of the AAAI 2008 AI Education Workshop.