## Sunday, March 11, 2012

### An API for Intelligence

Over the last few years I have made many attempts at creating artificial intelligence. By "artificial intelligence" I mean a general purpose system that can recognise and predict patterns in spatio-temporal data. I have written about this topic in some previous posts.

Spatio-temporal data is any data that has a temporal dimension and one or more spatial dimensions. Everything your brain perceives is a stream of spatial data over time. Any type of sensor (e.g. microphone, camera, thermometer) can create a stream of spatial data over time. The goal of artificial intelligence/machine learning/data compression is to look for patterns in this type of data and predict what the data will be in the future.

All of my AI projects have had essentially the same API. For those of you who speak Java: `double[][] perceive(double[][] inputs)`. That is, a single function that takes a matrix of floating point numbers and returns another matrix of floating point numbers. The input represents spatial data at a particular moment in time and the return value from the function represents a prediction for the next input (the next time step).
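To make the contract concrete, here is a minimal Java sketch of that API. The names `Perceiver` and `PersistencePerceiver` are made up for illustration and are not taken from any of the projects mentioned above. The implementation is the most naive one imaginable (it assumes the next input will equal the current one) and is only there to show the shape of the call.

```java
/** The whole API: one input matrix in, one predicted matrix out. */
interface Perceiver {
  /**
   * @param inputs spatial data at the current time step
   * @return a prediction of the inputs at the next time step
   */
  double[][] perceive(double[][] inputs);
}

/** Naive baseline (an assumption for illustration): tomorrow looks like today. */
class PersistencePerceiver implements Perceiver {
  @Override
  public double[][] perceive(double[][] inputs) {
    double[][] prediction = new double[inputs.length][];
    for (int row = 0; row < inputs.length; row++) {
      // Copy each row so the prediction does not alias the caller's input.
      prediction[row] = inputs[row].clone();
    }
    return prediction;
  }
}
```

Anything smarter than this baseline would keep internal state between calls, but the signature would stay exactly the same.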

A magic black box:

Let us imagine that I have a magic black box that does a great job at implementing this function. What could I do with it? Well, the most obvious thing I can do is use it to predict the future. Let's use it to find out what the stock prices will be five years from now:

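```java
// Training: feed the historical data through the black box, one time step at a time.
double[][] stockPrices;
for (Time t = TimeOfFirstData(); t < Now(); ++t) {
  stockPrices = GetHistoricalStock(t);
  // The returned prediction is overwritten by real data on the next iteration.
  stockPrices = blackBox.perceive(stockPrices);
}
// Predict the future: feed each prediction back in as the next input.
for (Time t = Now(); t <= FiveYearsFromNow(); ++t) {
  stockPrices = blackBox.perceive(stockPrices);
}
return stockPrices;
```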

Of course, these estimates wouldn't be very accurate because in reality stock prices depend on a vast number of different types of data (which I didn't give to the black box as input). If I gave the black box additional data in the input matrix (such as local news stories, earnings reports, weather sensors, etc.) it would do a better job at predicting stock prices.

What else could I do with the black box? If I want to have a conversation with it or make it control a robot, I would need to give it some mechanism to perform actions. Let's imagine I naively plug a light bulb into a random cell in the output matrix of the black box (and the light bulb turns on/off depending on the value of that prediction). This light bulb is now an actuator - it gives the black box a mechanism to interact with the environment. There is now a feedback loop where the black box can influence the value of future sensor readings by changing the light bulb prediction. How would the black box choose which action to take? Well, the only "goal" of the box is to minimise the difference between its predictions and what actually happens in the future. Maybe turning on the light bulb allows the black box to make more accurate predictions of future sensor values, so it "decides" to keep the light on. If I hook up a speaker and microphone to the black box, maybe it will decide that having a conversation with me will also allow it to make more accurate predictions about the future (which is probably true).
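The "goal" described above can be written down as a number. The sketch below assumes mean squared error as the measure of the difference between a prediction and what actually happened. That choice, and the name `PredictionError`, are assumptions for illustration; any other distance between two matrices would serve the same purpose.

```java
/** Hypothetical helper: scores how far a prediction was from reality. */
final class PredictionError {
  private PredictionError() {}

  /**
   * Mean squared difference between two matrices of the same shape.
   * Lower is better; 0.0 means the prediction was perfect.
   */
  static double meanSquaredError(double[][] predicted, double[][] actual) {
    double sum = 0.0;
    long count = 0;
    for (int row = 0; row < actual.length; row++) {
      for (int col = 0; col < actual[row].length; col++) {
        double diff = predicted[row][col] - actual[row][col];
        sum += diff * diff;
        count++;
      }
    }
    return count == 0 ? 0.0 : sum / count;
  }
}
```

In the light bulb scenario, the box would in effect compare this score over time with the bulb on and with the bulb off, and drift towards whichever setting makes its sensor readings easier to predict.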

The Magic Number:

Hopefully in the previous section I convinced you that if this one function was implemented correctly, it would result in something most people would consider true AI. So, why did I choose a matrix of floating point numbers? Why not simplify the API by just making it a single array instead? Why not a higher-dimensional matrix? The answer is that I think two (as in a two-dimensional matrix) is the magic number that makes a reasonable trade-off between competing factors. Using a higher-dimensional space would make implementing the function infeasible due to computational complexity. Using a lower-dimensional space loses critical information about the spatial relationships between sensors. As evidence of this information loss, imagine converting a two-dimensional image into a one-dimensional array of pixels. The image would become meaningless to you because you have lost the information about how the pixels are spatially arranged.
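The information loss is easy to see in code. The sketch below flattens an image in row-major order (one common convention, assumed here) and measures how far apart two pixels end up in the flat array. `FlattenDemo` and its methods are hypothetical names, and the image is assumed to be rectangular.

```java
/** Hypothetical demo of what row-major flattening does to neighbouring pixels. */
final class FlattenDemo {
  private FlattenDemo() {}

  /** Pixel (row, col) ends up at index row * width + col. */
  static double[] flatten(double[][] image) {
    int height = image.length;
    int width = height == 0 ? 0 : image[0].length;
    double[] flat = new double[height * width];
    for (int row = 0; row < height; row++) {
      System.arraycopy(image[row], 0, flat, row * width, width);
    }
    return flat;
  }

  /** Distance between two pixels once the image has been flattened. */
  static int flatDistance(int width, int rowA, int colA, int rowB, int colB) {
    return Math.abs((rowA * width + colA) - (rowB * width + colB));
  }
}
```

For an image 640 pixels wide, `flatDistance(640, 0, 0, 0, 1)` is 1 but `flatDistance(640, 0, 0, 1, 0)` is 640: two pixels that touch vertically are now hundreds of cells apart. Meanwhile the last pixel of one row and the first pixel of the next, which sit on opposite edges of the image, become direct neighbours. Without knowing the width, the array alone cannot tell you which cells were adjacent.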

Let us consider the most intelligent system we know of today: the human brain. I think that the human brain complies with the same API I described above. In fact, we only need to look at a part of the brain known as the neocortex. The neocortex is a thin sheet of neurons on the outer surface of the brain. It is responsible for essentially all higher-level thought, and it is what makes humans intelligent. Since the neocortex is fundamentally a two-dimensional surface, it uses topographic maps to project higher-dimensional signals onto a two-dimensional space. For example, the touch sensors spread over your three-dimensional body are mapped onto a two-dimensional homunculus on your neocortex, where regions that are neighbouring in 3D space are also neighbouring on the homunculus (and regions which are more important are mapped to larger regions on the homunculus).

So, how close are we to being able to implement AI? I think so far the most successful efforts come from the field of data compression. A compression algorithm called PAQ8 does an amazing job of implementing `boolean perceive(boolean input)`. However, it doesn't scale well to continuous numbers or higher-dimensional spaces. Another promising attempt at creating AI is coming from Numenta in the form of hierarchical temporal memory. So far their attempts have been unsuccessful, but I think at a higher level their approach to the problem is the best I have seen.
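To give a feel for what the one-bit version of the API looks like, here is a toy sketch. It is emphatically not PAQ8, which is far more sophisticated. This is a single fixed-order context model that counts which bit has followed each recent bit pattern and predicts the more frequent one. The class name `BitPredictor` and the `order` parameter are assumptions for illustration.

```java
/** Toy bit predictor (NOT PAQ8): one fixed-order context model with counters. */
class BitPredictor {
  private final int contextMask;
  private final int[] ones;   // how often a 1 followed each context
  private final int[] zeros;  // how often a 0 followed each context
  private int context = 0;    // the last `order` bits seen, packed into an int

  /** @param order number of previous bits used as context (keep it small, e.g. 1 to 16) */
  BitPredictor(int order) {
    int contexts = 1 << order;
    contextMask = contexts - 1;
    ones = new int[contexts];
    zeros = new int[contexts];
  }

  /** Learns from the observed bit, then predicts the next one. */
  boolean perceive(boolean input) {
    // Update the statistics for the context that preceded this bit.
    if (input) {
      ones[context]++;
    } else {
      zeros[context]++;
    }
    // Slide the new bit into the context window.
    context = ((context << 1) | (input ? 1 : 0)) & contextMask;
    // Predict whichever bit has followed the new context more often (ties predict 0).
    return ones[context] > zeros[context];
  }
}
```

Fed a strictly alternating stream of bits, a `BitPredictor` of order 2 gets every prediction right after a handful of steps. A table of counters like this grows exponentially with the context length, which hints at why the approach gets harder once the input is a matrix of continuous values rather than a single bit.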