
Artificial Neural Networks vs. Conventional Artificial Intelligence

Conventional, or Symbolic, AI is better established than ANNs (subsymbolic or connectionist AI), having its roots as far back as the early 1950s. Although the origins of Neural Networks can be traced back to the 1940s (McCulloch and Pitts, 1943; Hebb, 1949) and there were some early successes in the field of Artificial Neural Networks, the field was never really given the same attention (or funding!) as other aspects of AI, such as Expert Systems, Linguistic Parsing and Rule-based Robotics, until the 1980s.

This was mainly due to a seminal academic paper called Perceptrons, published in 1969, in which Marvin Minsky and Seymour Papert described the basic limitations of the ANNs of the time. This paper effectively killed off ANN research until the development of the Back-Propagation Algorithm in the 1980s. Since then, the deficiencies of ANNs listed in Perceptrons have all been overcome, and ANNs have taken their place in the pantheon of AI techniques. In fact, while conventional AI has rather languished in the doldrums, ANNs are enjoying a golden age, and many researchers are predicting that the future of AI will be entirely connectionist.

So what are the essential differences between the two disciplines?

  • ANNs are trained by example whereas conventional AI systems tend to be rule-based. This is marvellous in situations where it is difficult to formulate rules (such as distinguishing the appearance of one object from another in a visual scene) or where the rules that can be formed have a large number of exceptions (such as pronouncing English words from their written form - think of "Thoroughly plough through the dough").

    For example, in NETTalk (Sejnowski and Rosenberg, 1986) a neural network was presented with a series of sentences in written English and the list of phonemes (speech sounds) representing the same sentences in spoken form. It gradually learned to turn combinations of letters into combinations of phonemes, including words whose pronunciation is not obvious from their written form.

    However, there is a downside to this...

  • The knowledge in conventional systems is often made explicit. The rules in expert systems are written out directly, which makes it clear what information is required and how the conclusions are drawn from that information. Neural networks, by contrast, have their knowledge spread thinly across the entire network in the form of connection strengths, and it is pretty much impossible to turn these numbers into rules that humans can easily follow.

  • ANNs handle uncertainty easily whereas there is no universally accepted way for conventional AI systems to incorporate uncertainty in their rules. Let's take that rule about animal identification again (IF (animal_can_fly) AND (animal_is_mammal) THEN IDENTIFY(animal is bat)). Suppose that when you are looking at the animal it isn't flying around. How can you tell whether the animal can fly or not? You can't even tell by checking whether it has wings (ostriches have wings but can't fly). When trying to determine whether the animal can fly, there should be a way of representing "don't know".

    There are a number of well-known ways of representing uncertainty in conventional AI systems, such as Bayesian probabilities (Duda, R.O., Hart, P.E., and Nilsson, N.J., 1976), Dempster-Shafer reasoning (Shafer, 1987), MYCIN calculus (Buchanan, B.G., Shortliffe, E.H., (Eds.) 1984) and Fuzzy Logic (Zadeh, 1984). However, there is a great deal of controversy as to how suitable these methods are and the degree to which they can adequately represent uncertainty. Neural networks, on the other hand, represent uncertainty as a natural part of their functioning, as they deal with inputs that can be represented as "definitely yes", "definitely no" or any shade in between. They produce outputs in a similar manner - shades of uncertainty rather than a black-and-white yes or no.

  • ANNs degrade gracefully. If you remove a particular rule from an Expert System, it tends not to function at all. If, on the other hand, you alter the connection strengths to a particular neuron in an ANN, performance degrades slightly (the percentage of correct classifications decreases a little), but the network still works. Of course, the majority of ANNs are still simulated using conventional programming languages on serial computers (i.e. they are not built into special "neural network chips"), so any problem with the computer would still lead to a catastrophic failure of the ANN. This does reduce the power of this advantage somewhat.
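To make the "trained by example" point above concrete, here is a minimal sketch (purely illustrative - it is not the NETTalk architecture): a single perceptron learns the logical OR function from labelled examples alone, using Rosenblatt's perceptron learning rule. No rule for OR is ever written down.

```python
# Labelled examples of the OR function: ((input1, input2), target).
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

w1 = w2 = bias = 0.0
learning_rate = 0.1

for epoch in range(20):
    for (x1, x2), target in examples:
        output = 1 if w1 * x1 + w2 * x2 + bias > 0 else 0
        error = target - output
        # Rosenblatt's perceptron rule: nudge each weight so the
        # output moves towards the target.
        w1 += learning_rate * error * x1
        w2 += learning_rate * error * x2
        bias += learning_rate * error

predictions = [1 if w1 * x1 + w2 * x2 + bias > 0 else 0
               for (x1, x2), _ in examples]
print(predictions)   # [0, 1, 1, 1]
```

After a handful of passes over the examples the weights stop changing and the perceptron reproduces OR exactly; all the "knowledge" it has acquired lives in the three numbers w1, w2 and bias.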
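The contrast with the bat rule can also be sketched in code. The weights below are hand-picked, hypothetical values (not the result of any training): the point is simply that a symbolic rule needs hard yes/no facts, while a single sigmoid neuron accepts any degree of belief between 0 and 1, including 0.5 for "don't know".

```python
import math

# Symbolic version: both facts must be hard yes/no values.
def rule_is_bat(can_fly, is_mammal):
    return can_fly and is_mammal

# One sigmoid neuron with hand-picked, purely illustrative weights.
# Inputs are degrees of belief from 0 (definitely no) to 1 (definitely yes).
def neuron_is_bat(can_fly, is_mammal):
    w_fly, w_mammal, bias = 6.0, 6.0, -9.0
    activation = w_fly * can_fly + w_mammal * is_mammal + bias
    return 1.0 / (1.0 + math.exp(-activation))   # logistic squashing

print(round(neuron_is_bat(1.0, 1.0), 2))   # 0.95 -- confident yes
print(round(neuron_is_bat(0.5, 1.0), 2))   # 0.5  -- "don't know if it flies"
print(round(neuron_is_bat(0.0, 1.0), 2))   # 0.05 -- confident no
```

The neuron's output is itself a shade of certainty rather than a yes-or-no verdict, which is exactly the behaviour the bullet above describes.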
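Graceful degradation can be illustrated with a toy linear classifier (a hypothetical example, not any particular published network): damaging one connection strength lowers the percentage of correct classifications a little, whereas deleting the single rule of a rule-based system removes the capability entirely.

```python
import random

random.seed(0)   # deterministic toy data

# Toy task: classify random points in the unit square by whether x + y > 1.
data = [(random.random(), random.random()) for _ in range(200)]
labels = [1 if x + y > 1 else 0 for x, y in data]

def accuracy(w1, w2, bias):
    """Fraction of points a linear threshold unit classifies correctly."""
    correct = 0
    for (x, y), label in zip(data, labels):
        prediction = 1 if w1 * x + w2 * y + bias > 0 else 0
        correct += prediction == label
    return correct / len(data)

print(accuracy(1.0, 1.0, -1.0))   # 1.0 -- these weights solve the task exactly
# Damage one connection strength: accuracy drops a little,
# but the classifier keeps working on most inputs.
print(accuracy(0.8, 1.0, -1.0))
```

Only the points lying in the narrow strip between the original and the damaged decision boundary are now misclassified; everything else is still handled correctly.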
