Peter Gärdenfors proposes that there are three levels of abstraction for modeling thought:
- Symbolic: logic, expert systems, Prolog, Cyc, good old-fashioned AI, theorem proving
- Spatial: geometry, feature spaces, conceptual spaces, semantic spaces, information retrieval, vector space models, latent semantic analysis, machine learning
- Connectionist: neural networks, Hebbian theory, associationism, perceptrons, neuroscience
These levels might be compared to modeling physics at (respectively) the molecular, atomic, and subatomic levels.
AI algorithms can be characterized in terms of the levels of representation used in their input and output or they may be characterized in terms of their general algorithmic approach. For example, Cyc is purely symbolic, both in terms of the level of abstraction in the input and output, and in terms of the reasoning algorithms. On the other hand, machine learning algorithms often span levels. For example, an algorithm that learns to classify images may take input in the form of pixel arrays (the connectionist level of representation) and generate output in the form of words (the symbolic level of representation). The learning algorithm might take a symbolic approach, using hand-coded rules to map pixel patterns to internal symbols (representing shapes such as circles and lines), or the algorithm could take a connectionist approach, using a neural network model.
I believe that AI research needs to proceed on all three levels, and that, ultimately, the levels must be integrated. For example, the Cyc project has not succeeded, because it only addresses the symbolic level. Likewise, neural networks do not address logical reasoning or theorem proving. Gärdenfors argues that the spatial level is the bridge between the symbolic level and the connectionist level. This seems right to me.
For the last few years, my main research interest has been analogy-making and the closely connected problem of understanding semantic relations. The best-known algorithm for analogy-making, the Structure Mapping Engine, is purely symbolic in representation and general algorithmic approach. It has been argued that this is a central weakness of the SME approach to analogies. Tony Plate’s approach to analogies is a blend of connectionist and spatial methods. I am seeking an approach that integrates all three levels of thought. My own approach to analogies is essentially spatial. Like Gärdenfors, I am hoping that this is the right level to work at, in order to bridge the connectionist and symbolic levels.
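To make "essentially spatial" a little more concrete, here is a minimal sketch, assuming each word pair is represented by a vector of counts over joining patterns and that pairs are compared by cosine similarity. The patterns, the counts, and the function names are all invented for illustration; this is a toy under those assumptions, not a description of any published system.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical data: how often each pattern joins the two words of a pair.
PAIR_VECTORS = {
    ("mason", "stone"): {"X cuts Y": 8, "X works with Y": 5},
    ("carpenter", "wood"): {"X cuts Y": 6, "X works with Y": 7},
    ("cat", "mouse"): {"X eats Y": 9, "X chases Y": 7},
}

def relational_similarity(pair_a, pair_b, vectors=PAIR_VECTORS):
    """Similarity of the relations expressed by two word pairs."""
    return cosine(vectors[pair_a], vectors[pair_b])

def best_analogy(stem, candidates, vectors=PAIR_VECTORS):
    """Pick the candidate pair whose relation vector is closest to the stem's."""
    return max(candidates, key=lambda c: relational_similarity(stem, c, vectors))
```

Here the analogy mason:stone::carpenter:wood is preferred over mason:stone::cat:mouse only because the two pairs point in a similar direction in pattern space; no symbolic description of the relation is ever written down.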
Out of intuition, I would tend to categorize language as a form of connectionism.
“Out of intuition, I would tend to categorize language as a form of connectionism.”
I believe that our capability to understand language must use representations and algorithms from all three levels. (1) The traditional view is that language is symbolic. The word “logic” comes from the Greek logos, meaning “word, speech, discourse, reason”. Formal logic originated from Aristotle’s syllogisms, which were based on how we use language to reason. Logic is an appropriate tool for modeling some of the ways that we use language. (2) Gärdenfors makes a strong case that the spatial level is most appropriate for modeling many aspects of concepts. For example, measuring similarity of concepts is more natural at the spatial level than the symbolic level. (3) Much of our word knowledge is derived from association, both association between words and association between words and what we perceive. For example, we can determine whether a word is praising or criticizing purely by its statistical association with the words “good” and “bad”.
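The association example in point (3) can be sketched in a few lines. This is a rough illustration, assuming that association is measured by the difference in pointwise mutual information between a word and "good" versus "bad", with co-occurrence counted within a sentence. The corpus is invented, and the small smoothing constant is an assumption added only to avoid division by zero.

```python
import math

# Toy corpus, invented for illustration.
CORPUS = [
    "the food was good and the service was superb",
    "a superb film with a good cast",
    "good value and superb quality",
    "the plot was bad and the acting was dreadful",
    "a dreadful meal and bad service",
    "bad weather made for a dreadful trip",
    "the hotel was good but the weather was bad",
]

def count_docs(corpus, *words):
    """Number of sentences that contain all of the given words."""
    return sum(all(w in s.split() for w in words) for s in corpus)

def semantic_orientation(word, corpus=CORPUS, smoothing=0.01):
    """PMI(word, 'good') - PMI(word, 'bad'), in bits.

    The count of `word` alone cancels in the difference, leaving
    log2( n(word, good) * n(bad) / (n(word, bad) * n(good)) ).
    """
    near_good = count_docs(corpus, word, "good") + smoothing
    near_bad = count_docs(corpus, word, "bad") + smoothing
    n_good = count_docs(corpus, "good") + smoothing
    n_bad = count_docs(corpus, "bad") + smoothing
    return math.log2((near_good * n_bad) / (near_bad * n_good))
```

On this toy corpus, "superb" gets a positive score and "dreadful" a negative one, while "service", which appears once with each reference word, scores zero. Nothing in the calculation uses a definition of any of these words.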
We use words to express concepts (atomic/spatial level), which we can combine in sentences and propositions (molecular/symbolic level), and we learn associations (subatomic/connectionist level) between words and perceptions. This picture is a bit simplistic, but I think a more refined analysis would preserve the basic claim that language needs all three levels.
It seems that a good place to start on spatial thought might be Hausdorff’s axioms of point set topology. This has the virtue of not only providing a formal link to:
1. spatial descriptions directly in the language of topology itself
2. symbolic descriptions via Hausdorff’s axioms
3. connectionism’s fundamental concept of the connection as it is in topology
but also providing an approach to proving the spatial analogue of universality, as in general recursive functions. Such proofs of universal description are available in the levels of the symbolic and connectionism. A similar proof of universality at the spatial level could plausibly be founded upon Hausdorff’s axioms; it seems likely enough that I suspect someone has already published it somewhere.
“Such proofs of universal description are available in the levels of the symbolic and connectionism.”
I guess you’re referring to the Universal Turing machine for the symbolic level and the Cybenko theorem for the connectionist level. Your suggestion that there may be a spatial analogue to these is very interesting.
Yes. Thanks for the links.
“I guess you’re referring to the Universal Turing machine for the symbolic level and the Cybenko theorem for the connectionist level.”
Actually, it seems that three different groups independently proved some form of universality for connectionism:
1. Cybenko, Approximation by superpositions of a sigmoidal function, 1989.
2. Hornik, Stinchcombe, White, Multilayer feedforward networks are universal approximators, 1989.
3. Funahashi, On the approximate realization of continuous mappings by neural networks, 1989.
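For readers who want a feel for what "superpositions of a sigmoidal function" can do, here is a one-dimensional sketch, assuming a target function on [0, 1] and a staircase-style construction built from steep logistic units. The construction, the default steepness, and the function names are my own choices for illustration; it is not intended to reproduce the arguments in the papers above.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def build_approximator(f, n_units, steepness=None):
    """Return g(x) = f(0) + sum_k a_k * sigmoid(w * (x - c_k)) on [0, 1].

    Unit k is centred in the k-th of n_units equal cells, and its weight
    a_k is the change in f across that cell.
    """
    if steepness is None:
        steepness = 20.0 * n_units  # assumed default: steep relative to cell width
    h = 1.0 / n_units
    centers = [(k + 0.5) * h for k in range(n_units)]
    heights = [f((k + 1) * h) - f(k * h) for k in range(n_units)]
    base = f(0.0)

    def g(x):
        return base + sum(a * sigmoid(steepness * (x - c))
                          for a, c in zip(heights, centers))
    return g

def max_error(f, g, samples=1000):
    """Largest absolute difference between f and g on an even grid over [0, 1]."""
    return max(abs(f(i / samples) - g(i / samples)) for i in range(samples + 1))
```

For example, with `f = lambda x: math.sin(2 * math.pi * x)`, comparing `max_error(f, build_approximator(f, 10))` with `max_error(f, build_approximator(f, 100))` shows the error shrinking as more sigmoid units are added, which is the qualitative content of the universality results.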
More evidence against The Heroic Theory of Scientific Development.
I am entering mathematical territory that I don’t fully understand here, but could it be that Information Geometry has a role to play in this discussion?
http://cscs.umich.edu/~crshalizi/notebooks/info-geo.html
“This a slightly misleading name for applying differential geometry to families of probability distributions, and so to statistical models. Information does however play two roles in it: Kullback-Leibler information, or relative entropy, features as a measure of divergence (not quite a metric, because it’s asymmetric), and Fisher information takes the role of curvature.”
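The remark about asymmetry is easy to check numerically. Below is a small sketch, assuming discrete distributions given as lists of probabilities and assuming the second distribution is nonzero wherever the first is; the two example distributions are arbitrary.

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i), in nats.

    Terms with p_i == 0 contribute nothing. Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Arbitrary example distributions over two outcomes.
p = [0.9, 0.1]
q = [0.5, 0.5]

forward = kl_divergence(p, q)   # divergence of q from p
backward = kl_divergence(q, p)  # divergence of p from q
```

The two directions give different values for this pair, which is why the divergence is "not quite a metric".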
If so, Guy Lebanon’s work might be of interest:
http://www.stat.purdue.edu/~lebanon/research.html
“We studied and demonstrated the deficiency of Euclidean geometry in modeling documents. We proposed instead to use an embedding principle that uses Cencov’s theorem to obtain a canonical geometry for documents, based on the Fisher geometry of the multinomial distribution (joint work with J. Lafferty).”
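As a rough illustration of how the two geometries can disagree, the sketch below compares Euclidean distance with the geodesic distance under the Fisher metric on the multinomial simplex, which has the closed form 2·arccos(Σ √(pᵢqᵢ)). The example distributions are invented, and this is a toy comparison, not a reproduction of the cited work.

```python
import math

def euclidean_distance(p, q):
    """Ordinary straight-line distance between two probability vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def fisher_distance(p, q):
    """Geodesic distance on the multinomial simplex under the Fisher metric.

    Uses the closed form 2 * arccos(sum_i sqrt(p_i * q_i)). The sum is
    clamped to [0, 1] to guard against floating-point rounding.
    """
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 2.0 * math.acos(min(1.0, max(0.0, bc)))

# Two pairs of invented two-term "documents" with the same Euclidean gap.
middle_pair = ([0.5, 0.5], [0.4, 0.6])
edge_pair = ([0.1, 0.9], [0.0, 1.0])
```

Both pairs are the same Euclidean distance apart, but the Fisher distance for `edge_pair` is roughly three times that for `middle_pair`: moving a term's probability from 0.1 to 0 counts for more than moving it from 0.5 to 0.4.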
Thanks for your insightful comments! I totally agree with your characterization of Cyc. A couple of years ago I used the conceptual spaces approach to criticize the Semantic Web project along the same lines:
How to Make the Semantic Web More Semantic
“… proving the spatial analogue of universality …”
I think a natural approach to showing universality at the spatial level is to build on the work of Dominic Widdows and Keith van Rijsbergen:
Geometry and Meaning, Dominic Widdows
The Geometry of Information Retrieval, Keith van Rijsbergen
They link geometry to quantum logic. This might sound a bit abstruse or implausible at first, but Widdows makes it seem very natural and intuitive.
“Thanks for your insightful comments!”
You are most welcome! Thanks for the paper.
“This might sound a bit abstruse or implausible at first”
Not for those who are already familiar with the field: using quantum logic has also been suggested in the joint works of Diederik Aerts and Liane Gabora. However, this still focuses on logic, be it quantum or otherwise, and the most difficult point with semantics comes before logic, in the concept formation phase. Once we have some concepts, however fuzzy and ill-defined they may be, we are in “safe territory”; we can philosophise, ponder, compute, argue ad libitum.
The problematic phase for a computer doing AI isn’t even recognising an object or concept, a dog or a verb, but coming up with the “idea” that tagging a whole bunch of “inputs” with some unique label is a good way to organize the chaos and build a model of “reality”. As long as AI researchers skip this question by doing the conceptualisation themselves instead of leaving it to the computer, they won’t gain any useful insights about “intelligence”. Before the computer can go ahead with any “logic”, it has to recognise concepts, and before recognising concepts, it has to discover them; it should not be told that there is a dog or a verb “out there”.
Unfortunately, everyone is too eager to reach the “Holy Grail” of understanding to clarify the “root problems” first, linguists and cognitive scientists even more so: Implicit Understanding and Inference in Language.