The Symbol Grounding Problem

There is a view that the meaning of words (more generally, of symbols) must be grounded in sensory perception or in physical interaction with the world (embodiment). If symbols were merely defined in terms of other symbols, then it seems that we would have an infinite regress; we would spin in circles in symbol space, without ever reaching meaning. If symbols must be grounded, then a computer without sensors and actuators might pass the Turing Test, yet everything it said would be meaningless (at least to the computer itself, if not to the audience). A stronger view holds that only an embodied computer could ever pass the Turing Test. Introspection suggests that our symbols are grounded, but is grounding really necessary for meaning?

Many researchers view symbol grounding as a major issue. Lakoff and Johnson argue that it is a fundamental problem for Western philosophy. Regier has developed a computational model in which spatial words, such as above, below, left, right, around, and in, are grounded in visual perception. Woods et al. have developed an algorithm that learns to classify images of objects, such as cups and chairs, based on whether it appears that the objects could support the expected function.

It might seem that mathematics, at least, does not require grounding in sensory perception. Mathematics exists in the Platonic realm of ideas, which we see with our inner vision. However, Lakoff and Núñez argue (persuasively) that mathematics is based on metaphorical reasoning from embodied experience.

French has argued that there are subcognitive questions that can only be answered by an embodied entity, such as a human being, or possibly a very sophisticated robot. Subcognitive questions probe the network of cultural and perceptual associations that we build as we live our lives. For example, how good is the name Flugly for a glamorous Hollywood actress? I have shown that at least some of these subcognitive questions can be answered by statistical analysis of very large collections of text. How far can we go with this ungrounded approach to meaning?

Suppose that symbols are defined in terms of other symbols, either explicitly, as in a dictionary, an encyclopedia, or a semantic network, or implicitly, as in a very large collection of text. It seems possible that meaning might emerge from the complex interconnections among these symbols, without grounding in perception. I have not yet decided whether symbols must be grounded. Perhaps meaning consists of dynamic interconnections. Perhaps some meaning (e.g., the meaning of red) must be grounded in perception, but other meaning does not need perception. What is so special about sensory perception? Language can be used for perception (I can read about a distant country) or for action (talking can change people and it can change things in the world). In the end, whether it’s symbols or perceptions, it’s all bits and bytes to a computer.
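To make the implicit case a little more concrete, here is a minimal sketch of symbols being characterized purely by other symbols. It is not the method used in any of the work mentioned above. The toy corpus, the window size, and the function names are all assumptions made for illustration. Each word is represented by counts of the words that appear near it, and two words are compared by the cosine of their count vectors.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; a real experiment would need a very large collection of text.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])

print(f"cat vs dog:  {sim_cat_dog:.3f}")
print(f"cat vs fuel: {sim_cat_fuel:.3f}")
```

In this toy corpus, cat and dog come out as more similar than cat and fuel, although nothing in the program has ever perceived a cat. Raw counts are crude (frequent words such as "the" dominate the vectors), and whether a similarity score of this kind amounts to meaning is exactly the open question.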

3 Responses to “The Symbol Grounding Problem”

  1. I think symbols must be grounded, that is, they must embody someone’s interaction with the world, in order to communicate anything meaningful (i.e., useful for interacting with the world). Designing a system of embodied symbols from scratch does not seem feasible, but it is also not necessary: a highly fine-tuned system of such symbols already exists, namely human language. Thus, any symbol that is derived from natural language symbols (such as an average of co-occurrence vectors for a set of words, used to represent a certain semantic category) will be grounded and (in principle) meaningful.

  2. I just read this blog entry by Hal Daume III:

    http://nlpers.blogspot.com/2007/02/language-and-x.html

    It draws an interesting connection between symbol grounding and “cross-modality reinforcement of what we’re learning”. This reminds me of Marvin Minsky’s idea of “panalogy”: to understand something, you must be able to represent it in many different ways:

    http://csc.media.mit.edu/PanalogyHome.htm

    It also reminds me of the work in machine learning on co-training:

    http://citeseer.ist.psu.edu/blum98combining.html

  3. I have been following the work of Deb Roy’s group at the MIT Media Lab (http://www.media.mit.edu/cogmac). They have been looking into grounded or situated language systems. A recent thesis by Peter Gorniak (http://www.media.mit.edu/cogmac/publications/pgorniak_thesis.pdf) has a good discussion of the issues in Chapter 2.

    Your last sentence (everything is bits and bytes) does not give me much intuition - everything is atoms and molecules but that does not necessarily help one design a car.

    Maybe we can say that for certain applications, grounding is essential. If a real or simulated robot is to understand a sentence such as “Can you help me with this?”, it is important that the words are connected not only to their WordNet senses and corpus occurrences but also to the real-world situation at hand. On the other hand, maybe we can get away without grounding in applications like machine translation or information retrieval. Nevertheless, I believe many of the current linguistic problems we are struggling with (word sense disambiguation, syntactic ambiguity, anaphora) would benefit greatly from a better representation of non-linguistic context (communicating agents, their environment, goals, speech acts, etc.). The non-linguistic context does not necessarily have to be the physical world; it can be a simulated environment, or even an abstract mathematical domain: just something that the language is “about”, and that can help us with what makes sense and what doesn’t.

    One of the defining characteristics of language is its intentionality (aboutness), and I find it increasingly hard to motivate myself to work on language while systematically ignoring what the words we study are actually about.
