Thursday, February 19, 2009

Erdős Number

Thanks to Jackie for pointing out that I have an Erdős number of at most four:

Byron Knoll -> David Poole -> Kellogg Booth -> Maria Klawe -> Paul Erdős

Monday, February 16, 2009

Robotics

I was at an interesting presentation today by Andrew Ng about the STanford Artificial Intelligence Robot (STAIR) project. STAIR is a robot that integrates many fields of AI in an attempt to accomplish tasks requiring a broad range of skills and intelligence. It is currently capable of fetching items in an office environment based on spoken commands. Although this may sound like a simple task, it requires integrating recent advances in fields such as computer vision/object recognition, machine learning, planning, and speech recognition. The robot needs to understand spoken commands, navigate through offices, open doors, recognize objects, and pick up objects, none of which is easy for an AI. One of the themes of the talk was that we are on the threshold of robots being capable of accomplishing tasks that are actually useful for the average person. We were shown a video of a robot (teleoperated by a human) performing tasks such as unloading a dishwasher, vacuuming, and cleaning a room. Robots are already mechanically capable of performing useful tasks; all that remains is making them autonomous.

Andrew emphasized that in order to make progress in robotics we need to focus on lower-level tasks, so that robots become less specialized for a single domain. While I definitely agree with this, he was referring to tasks such as recognizing/grabbing objects and navigating, and I think even these are at a much higher level than what we should be focusing on. For example, take the task of recovering depth information from a monocular image (see: http://make3d.stanford.edu). Understanding the 3-dimensional structure of the world is important for robots. Humans use a variety of cues to get this information, such as texture, color, haze, and defocus. In theory, all of these cues could be learned entirely from patterns in visual experience. Instead of having these cues manually implemented, robots should be learning about the world themselves. Predicting and recognizing patterns in data is a fundamental task that applies to any AI problem; if we could design an AI capable of performing this one task well, all of the higher-level problems would be solved.

Robot Sumo

Yesterday I submitted an entry for the first round of Imagine Cup: Robotics & Algorithm. The first round involves programming the AI for a simulated sumo match between two iRobot Creates. Each robot has a webcam, two bumper sensors, and four IR sensors on the bottom to detect when it is over the outer circle. Here is a simulation of a match in Microsoft Robotics Developer Studio (my AI controls the blue robot):

[Video: simulated sumo match in Microsoft Robotics Developer Studio]

One thing I dislike about this contest is that the physics engine makes designing a more advanced AI almost pointless: when two robots push against each other, it is roughly a 50-50 chance which one ends up underneath the other and able to push it out of the ring. I tried creating a more advanced strategy to avoid this problem by repeatedly ramming the other robot and backing away, but this just gave the other robot an advantage. The strategy I ended up submitting has a few tricks, such as avoiding repetitive behaviors and choosing randomly between possible actions. However, modifying the AI didn't seem to have any effect on its performance during the matches. Since the matches happen in real-time simulation, it is impossible to simulate a large enough sample size to determine whether one strategy is actually superior to another.
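
The randomized part of the strategy is simple enough to sketch. Here is a hypothetical Python illustration of the idea (the actual entry runs against Microsoft Robotics Developer Studio's services, and the action names and window size here are invented for illustration):

    import random

    # Hypothetical sketch: rule out any action that has filled the
    # entire recent window, then pick randomly among the rest so the
    # opponent can't exploit a deterministic pattern.
    ACTIONS = ["charge", "turn_left", "turn_right", "back_off"]

    def choose_action(history, window=4):
        recent = history[-window:]
        candidates = [a for a in ACTIONS if recent.count(a) < window]
        return random.choice(candidates or ACTIONS)

    history = []
    for _ in range(10):
        history.append(choose_action(history))
    print(history)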

I have joined a team to compete in BattleCode, a programming competition hosted by MIT in which you design the AI for a computer game. We haven't started working on it yet, but it looks like it should be a fun project.

Data Compression and Machine Learning

There is an interesting relationship between data compression and machine learning. Any data compression algorithm can be used to perform pattern recognition, simply by comparing the total size of two independently compressed files to the size of the two files compressed together. In other words, compare size(compress(A)) + size(compress(B)) to size(compress(A+B)). If the two files contain common patterns, then compress(A+B) will be smaller than the two files compressed separately. By comparing this relative size difference using solid compression, it is possible to rank the similarity of different data sets.
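
As a minimal sketch of this comparison (with Python's zlib standing in for a stronger solid compressor):

    import os
    import zlib

    def csize(data):
        # Compressed size in bytes under zlib at maximum compression;
        # any solid compressor could be substituted here.
        return len(zlib.compress(data, 9))

    def shared_patterns(a, b):
        # Bytes saved by compressing A and B together rather than
        # separately; larger values mean more common structure.
        return csize(a) + csize(b) - csize(a + b)

    x = b"the quick brown fox jumps over the lazy dog " * 50
    y = b"the quick brown fox leaps over the lazy cat " * 50
    z = os.urandom(len(x))  # incompressible noise for comparison

    print(shared_patterns(x, y))  # large: the texts share patterns
    print(shared_patterns(x, z))  # near zero: noise shares nothing

This quantity is essentially the unnormalized form of the normalized compression distance used in compression-based clustering.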

In my machine learning course last semester we had an optional project to do binary classification on a dataset with thousands of features. We were given a training set whose examples were labeled with one of two categories (positive/negative), and we needed to write an algorithm to classify the test data. As a baseline, we were given an implementation of AdaBoost with the following error rates:

false negative rate = 0.1429 (over 350 positive examples)
false positive rate = 0.1429 (over 350 negative examples)
sum: 0.2858

I used a neural network to achieve a lower error rate:

false negative rate = 0.14 (over 350 positive examples)
false positive rate ≈ 0.1229 (over 350 negative examples)
sum ≈ 0.2629
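
As a rough, hypothetical sketch of this kind of setup (synthetic data stands in for the course dataset, and the architecture here is invented for illustration, not taken from my project):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Synthetic stand-in for the course data: 700 examples, 1000
    # features, with the class signal hidden in a few of the features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(700, 1000))
    y = (X[:, :10].sum(axis=1) > 0).astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # A small feed-forward network; the real architecture and training
    # details are not reproduced here.
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
    clf.fit(X_train, y_train)
    print("test error rate:", 1.0 - clf.score(X_test, y_test))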

This morning I tried comparing some different data compression algorithms to see how well they perform at binary classification. Using 7z, I got the following error rates:

false negative rate ≈ 0.3114 (over 350 positive examples)
false positive rate ≈ 0.2543 (over 350 negative examples)

I also tried using gzip and paq8p. gzip was fast but had a terrible error rate. I think paq8p would have done much better than 7z, but it is so slow that I couldn't finish running it on the test set. I suspect that, in general, better (slower) compression algorithms will have lower error rates, since better compression implies a better model of the patterns in the data.
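
For reference, here is a minimal sketch of how a general-purpose compressor can act as a binary classifier in this way (zlib again standing in for 7z; how the feature vectors were actually serialized to files is omitted, and the toy data below is invented):

    import zlib

    def csize(data):
        return len(zlib.compress(data, 9))

    def classify(example, pos_corpus, neg_corpus):
        # Assign the example to whichever class's training corpus it
        # compresses best against: a smaller size increase means the
        # example shares more patterns with that class.
        pos_cost = csize(pos_corpus + example) - csize(pos_corpus)
        neg_cost = csize(neg_corpus + example) - csize(neg_corpus)
        return pos_cost < neg_cost  # True = positive class

    pos = b"spam spam eggs spam " * 100
    neg = b"ham ham bacon ham " * 100
    print(classify(b"spam eggs spam", pos, neg))  # True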