What are ANNs and how are they used?

Artificial Neural Networks (ANNs) are tools used in the world of Artificial Intelligence (AI) for processing information and pattern matching. They are used for a variety of tasks, such as recognising faces, intruder detection, and speech recognition. ANNs tend to be used in situations where traditional AI methods are unsuitable (more on this later), although they are rapidly assuming the dominant position in AI research.

ANNs consist of large numbers of artificial neurons, small switching units which model the cells found in the human brain and nervous system. Each of these cells (both the real ones and the artificial variety) is relatively simple in itself, and is very limited in the amount of information that it can process. However, when put together in large numbers, and "wired together" correctly, they develop a powerful ability to draw conclusions from input data.

ANNs work by passing signals in the form of numbers from one neuron to another. Their input is usually some sort of pattern, e.g. an image, a set of words or letters, or a set of values from sensors. Whatever form the inputs take, they are converted to simple numbers and fed into the ANN. The numbers then propagate through the network and produce the output from the ANN, also in the form of numbers, which can then be interpreted as appropriate.
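To make the idea of numbers propagating through a network concrete, here is a minimal sketch in Python. It is only an illustration: the sigmoid activation, the layer sizes, the weights and the input values are all assumptions chosen for the example, not details of any particular ANN described in this section.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, then an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical sensor readings, already converted to numbers in the range 0..1.
signal = [0.2, 0.9, 0.5]

# Arbitrary illustrative weights: 3 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = layer(signal, [[0.5, -1.0, 0.25], [1.5, 0.5, -0.5]], [0.0, -0.5])
output = layer(hidden, [[1.0, -1.0]], [0.1])

print("hidden layer:", hidden)
print("network output:", output)
```

Each layer simply turns one list of numbers into another, and the final list is what gets interpreted as the network's answer.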

Here's an example. I have set up a simple ANN that takes three numbers in the range 0 to 100 as input, then tells you whether they are in ascending order, descending order or no particular order. You don't see the neurons themselves. All you see are the three input slots where you type the numbers and another slot which announces the ANN's decision. Try it now. Enter three numbers (in the range 0 to 100) and click on the button marked Process.

[Interactive demo: three input slots, a Process button, and an output slot showing one of Ascending order, Descending order or No particular order]

If you were to see the neurons and the connections, they might look something like the following. The neurons are represented by circles, and the connections between them by the lines. The three input values are fed in on the left, and the signal propagates through the net to the right, producing an output indicating the order (or not) of the numbers. If the numbers are in ascending order from top to bottom (i.e. the top number is the smallest, the bottom number the largest), then the output marked Ascending Order will be 1 (equivalent to "on") and the other two will be 0 ("off"). Similarly, if the numbers are in descending order, or in no particular order, then the appropriate output will be 1 and the other outputs will be 0.

You will notice that the neurons are arranged into two groups of three each, called "layers", and that the signal feeds through from left to right. Such an ANN is termed a feed-forward multilayer perceptron. More on this in another section.
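A feed-forward network for the ordering task can be sketched in a few lines of Python. This is not the network from the demonstration: the weights are picked by hand rather than learned, the hidden layer uses four neurons instead of three, and the output layer is read with a winner-takes-all rule (the highest-scoring output becomes 1, the others 0). All of these are assumptions made to keep the example short and easy to check.

```python
def step(x):
    """Threshold activation: 1 ("on") if the weighted sum is positive, else 0."""
    return 1 if x > 0 else 0

def weighted_sum(inputs, weights, bias):
    return sum(i * w for i, w in zip(inputs, weights)) + bias

# Hidden layer: each neuron compares two neighbouring inputs (top, middle, bottom).
HIDDEN = [
    ([-1, 1, 0], 0),   # fires when middle > top
    ([0, -1, 1], 0),   # fires when bottom > middle
    ([1, -1, 0], 0),   # fires when top > middle
    ([0, 1, -1], 0),   # fires when middle > bottom
]

# Output layer: one score per class, in the same order as LABELS.
OUTPUT = [
    ([1, 1, 0, 0], -1.0),  # ascending: both "greater than" neurons fire
    ([0, 0, 1, 1], -1.0),  # descending: both "less than" neurons fire
    ([0, 0, 0, 0], 0.5),   # no particular order: constant score that wins otherwise
]
LABELS = ["Ascending order", "Descending order", "No particular order"]

def classify(numbers):
    """Feed three numbers forward through both layers and pick the winner."""
    hidden = [step(weighted_sum(numbers, w, b)) for w, b in HIDDEN]
    scores = [weighted_sum(hidden, w, b) for w, b in OUTPUT]
    winner = scores.index(max(scores))
    # One-hot output: 1 for the winning neuron, 0 for the other two.
    outputs = [1 if i == winner else 0 for i in range(len(scores))]
    return outputs, LABELS[winner]

print(classify([10, 20, 30]))  # ([1, 0, 0], 'Ascending order')
print(classify([30, 20, 10]))  # ([0, 1, 0], 'Descending order')
print(classify([10, 30, 20]))  # ([0, 0, 1], 'No particular order')
```

Note that the signal only ever moves forward, from the inputs to the hidden layer to the outputs, which is what "feed-forward" means. In this sketch, equal neighbouring numbers count as "no particular order".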

ANNs vs. Conventional AI

The world of AI has long been divided over the power and usefulness of ANNs. When AI was first developed in the 1950s and 1960s, it dealt entirely with symbolic architectures - systems in which all the data was represented by defined symbols within the program. Let's take the example of an Expert System with rules designed to identify an unknown animal from its characteristics. A typical rule from the system might be the following:

IF (animal_can_fly) AND (animal_is_mammal) THEN IDENTIFY(animal is bat)

This rule is fairly easy to understand, even for someone not familiar with the system. Clearly, animal_can_fly is a variable which determines whether the unknown animal can fly or not, animal_is_mammal whether it is a mammal or not, and if both those conditions are met, then the animal is a bat.
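The same rule can be written as a small Python sketch to show what "symbolic" means in practice. The rule table and the identify function are hypothetical names invented for this illustration. Only the bat rule comes from the text above.

```python
# Each rule pairs a set of conditions (symbols) with a conclusion.
RULES = [
    # IF (animal_can_fly) AND (animal_is_mammal) THEN IDENTIFY(animal is bat)
    ({"animal_can_fly", "animal_is_mammal"}, "bat"),
]

def identify(facts):
    """Return the conclusion of the first rule whose conditions all hold."""
    for conditions, conclusion in RULES:
        if conditions <= facts:   # every condition is among the known facts
            return conclusion
    return None                   # no rule matched

print(identify({"animal_can_fly", "animal_is_mammal"}))  # bat
print(identify({"animal_can_fly"}))                      # None
```

Every piece of knowledge here is a named symbol that a person can read and edit directly.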

ANNs are not like that. They do not store their data in carefully designed rules. Instead, the information is distributed over the whole network, entirely in the form of connection strengths. For this reason, ANNs are often referred to as subsymbolic architectures - they do not deal with identifiable symbols, but with data at the subsymbolic level. You may also hear of ANNs being referred to as connectionist architectures, and the people who program them as connectionists.

While it is often a disadvantage that ANNs do not store their data in rules, it can also be an advantage. There are many situations in life where intelligent behaviour is difficult to put into the form of rules. The classic example is detecting from a person's face whether that person is male or female. When you look at a stranger, you generally have no difficulty in working out whether it is a man or a woman, but imagine if you had to determine a set of rules for an AI system to do the same! Humans, looking at each other, don't use rules to determine gender - we just do it, based on the thousands of humans we have seen in the past.

One famous neural network, SEXNET, was set up by Terry Sejnowski and his colleagues at the Salk Institute to do just that. It was trained upon a large number of close-ups of human faces showing just the eyes and the nose (i.e. no bald patches or facial hair) to determine whether the photograph was of a man or a woman. After training, SEXNET achieved an accuracy of about 92% in classifying previously unseen faces, as compared to about 88% achieved by human beings seeing the same images!

This brings us to one of the tenets of ANNs:

ANNs are trained by example

ANNs learn their intelligent behaviour from many (often thousands of) examples. Each example causes the ANN to adjust its internal connections slightly so that it is more likely to respond correctly to a similar example in the future. Some ANN architectures are trained by presenting them with examples that have already been labelled with the correct classification (i.e. they are told what answer they are to produce for each training example), while others learn to classify for themselves (they sort training examples into groups based on similarity).
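The first kind of training, with labelled examples, can be sketched using the perceptron learning rule, one of the simplest training methods. It is shown here only as an illustration, and other ANN architectures use different rules. The toy task (logical AND), the learning rate and the function names are all assumptions made for this example.

```python
def predict(weights, bias, inputs):
    """Single threshold neuron: output 1 if the weighted sum is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(examples, rate=1, max_epochs=100):
    """Perceptron learning rule: nudge the connections after every wrong answer."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                # Strengthen or weaken each connection in proportion to its input.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:  # every training example is now answered correctly
            break
    return weights, bias

# Labelled training examples: (inputs, correct answer). Toy task: logical AND.
EXAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train(EXAMPLES)
print([predict(weights, bias, x) for x, _ in EXAMPLES])  # [0, 0, 0, 1]
```

Nothing in the trained neuron resembles a rule. All it has learned is a pair of connection strengths and a bias.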

Training by example mimics the way that human beings learn as children to respond intelligently to the world.

A more detailed comparison of ANNs and conventional AI is available on a separate page.