Monday, February 16, 2009

Robotics

I attended an interesting presentation today by Andrew Ng about the STanford Artificial Intelligence Robot (STAIR) project. STAIR is a robot that integrates many fields of AI in an attempt to accomplish tasks requiring a broad range of skills and intelligence. It is currently capable of fetching items in an office environment based on spoken commands. Although this may sound like a simple task, it requires integrating recent advances in fields such as computer vision and object recognition, machine learning, planning, and speech recognition. The robot needs to understand spoken commands, navigate through offices, open doors, recognize objects, and pick them up, none of which is easy for an AI. One of the themes of the talk was that we are on the threshold of robots being able to accomplish tasks that may actually be useful to the average person. We were shown a video of a robot (teleoperated by a human) performing tasks such as unloading a dishwasher, vacuuming, and cleaning a room. Robots are already mechanically capable of performing useful tasks. All that remains is making them autonomous.

Andrew emphasized that to make progress in robotics we need to focus on lower-level tasks, so that robots become less specialized for a single domain. I definitely agree with this, but the lower-level tasks he was referring to were things like recognizing and grabbing objects and navigating. I think even these are at a much higher level than what we should be focusing on. For example, take the task of extracting depth information from a monocular image (see: http://make3d.stanford.edu). Understanding the 3-dimensional structure of the world is important for robots. Humans use a variety of cues to get this information, such as texture, color, haze, and defocus. In theory, all of these cues could be learned entirely from patterns in visual experience. Instead of having these cues manually implemented, robots should be learning about the world themselves. Predicting and recognizing patterns in data is a fundamental task that can be applied to any AI problem. If we could design an AI capable of performing this one task well, all of the other higher-level problems would be solved.
