Monday, April 28, 2003
“It may well be that the way to build an intelligence is just to get your hands on dirty engineering problems. We don't have a theory of automobiles. We have good cars, but there are no fundamental equations of automotive science.” - Hans Moravec [1]
In short, I find Minsky’s descriptions of the problems and possible solutions to AI common sense highly congenial. Describing common sense knowledge as ‘all the things that don’t need to be stated’ neatly sums up our intuitions. As Lenat [2] remarked, “within a few months we realised that what [encyclopaedias] contain is almost the complement of common sense”. When thinking about common sense, Minsky emphasises the following ideas: a large and diverse compendium of methodologies, facts and representations; knowing “a little about a lot”; the consequent need to rely on ambiguity, analogy, metaphor, association and other strategies; and a mass of exceptions and inconsistencies that make the whole inherently buggy, and yet very robust.
I don’t disagree with any of these points. Instead, I’m going to discuss a set of related issues that are problematic for any common sense or large-scale intelligent system, which I believe might bear further analysis.
One way into these issues is to look at the problem of common sense reasoning as a problem of search. Let us assume, for the moment, that the knowledge we need to answer most common sense questions about a simple story has been encoded in some declarative form, e.g. as a series of natural language facts, or predicate calculus formulae. A real problem of combinatorial explosion arises: unlike a search through chess moves, for example, the kind of search that common sense reasoning requires tends to be shallow but potentially much, much wider. As Henry Lieberman put it, the size of human common sense knowledge is around the “small end of infinity” (estimates range from 20 million to several billion). However, even brief introspection shows that however we manage common sense reasoning, we seem to do it with great ease and rapidity. I’m going to argue that perhaps a greater emphasis on simulation and models may help with some aspects of the problems of relevance, context and combinatorial explosion.

Perhaps we shouldn’t visualise common sense knowledge as a warehouse of facts, but as a set of integrated models of the world. This cuts across any facts vs methods distinction, since facts can be generated by instantiating a model with parameters, and methods devised or tested within a model. Perhaps common sense cannot be captured by any realistic number of assertions (and the brain doesn’t do it this way): those assertions are the symptoms rather than the cause of common sense. We generate those assertions, but we cannot generate more assertions from them without being able to plug parameters into our model of the world. Because we don’t want to have to fully instantiate a complete model of the world every time we consider how to deal with a particular closed situation, we divide our world model into many sub-models of different domains and at different levels. The same particular situation may be multiply represented (if you feed the right sets of parameters to each model-representation), giving us robustness and flexibility, and allowing us to deal with uncertainty and incomplete information.

This sounds very much like a standard description of a frame-array with common terminals, and frames and scripts are certainly vital representations in any common sense system. But frames alone can’t do everything, and I don’t think Minsky intends that they should. To take a domain close to my heart: playing a soccer game well requires me to know a great deal about the way my body works, how the ball flies, and how people move. The niceties of soccer strategy might not seem to come under the description of common sense (at least in America), until we consider that if we know enough about the game, we can generate all sorts of useful common sense assertions: it involves somewhere between two and approximately twenty people, it takes place in a confined space of a given size, it can be used as a basis for understanding lots of other sports and competitions, etc. We could program in all these facts, but people gain a much richer understanding of them from participating themselves just once.

I am concerned, though, that frames are overly linguistic/symbolic, and cannot deal well with noisy or incomplete data in high-dimensional or continuous procedures or situations. The most important such realms that I can think of are motor/body knowledge, local geography, naïve physics, and emotional and social interactions. These comprise a very large proportion of common sense knowledge. In all such cases, I believe we can store such knowledge much more efficiently in some procedural form, and generate the relevant common sense assertions by simulation. Indeed, simulation theory is one of the leading theories of ‘theory of mind’, i.e. our ability to understand and predict others’ behaviour by simulating how we would feel in their situation if we had their beliefs and desires. I am not sure how literally I am arguing that we imagine what we would do if we were a glass of water teetering on the edge of a table, but we definitely have the procedural/modelling ability to do this, and it’s difficult to see how a declarative knowledge base could ever be rich or exhaustive enough to demonstrate the robustness that our common sense knowledge has. Knowing which model to use at a given time, and which default parameters, is the problem of context. In a sense, the problem of context is the problem of common sense.
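To make this concrete, here is a deliberately toy sketch of the idea in Python. Nothing in it is a proposed implementation: the class, the parameters and the tipping-point rule are all invented placeholders. The point is only that the assertions it prints are generated on demand by running a crude parameterised model, rather than being retrieved from a store of facts.

    # Toy sketch: generating common sense assertions from a parameterised
    # model by simulation, rather than retrieving them from a fact store.
    # All names, parameters and thresholds here are invented placeholders.

    from dataclasses import dataclass

    @dataclass
    class ObjectOnSurface:
        """Crude naive-physics model of an object resting on a surface."""
        name: str
        width: float        # metres
        overhang: float     # metres of the object past the surface edge
        fragile: bool

        def simulate(self):
            """Run the model forward and emit common sense assertions.

            The assertions are the *output* of the model, not stored facts:
            feeding in different parameters generates different assertions.
            """
            assertions = []
            # Centre of mass past the edge -> the object tips and falls.
            if self.overhang > self.width / 2:
                assertions.append(f"the {self.name} will fall")
                if self.fragile:
                    assertions.append(f"the {self.name} will probably break")
            elif self.overhang > 0:
                assertions.append(f"the {self.name} is precarious")
            else:
                assertions.append(f"the {self.name} is safe where it is")
            return assertions

    # The same situation can be fed back in with different parameters
    # (glass vs. anvil), multiply representing it without storing new facts.
    glass = ObjectOnSurface("glass of water", width=0.07, overhang=0.05, fragile=True)
    print(glass.simulate())
    # -> ['the glass of water will fall', 'the glass of water will probably break']

The two assertions printed at the end were never stored anywhere: in the terms above, they are symptoms of the model rather than its contents.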
Common sense on this view is the business of knowing when to apply which rules and exceptions, when to override default parameters, and which level of granularity is sufficient. As Minsky puts it: “‘Lift’, for example, has different implications when an object weighs one gram, or a thousand, and really to understand lifting requires a network for making appropriate shifts among such different micro-senses”. Of course, I have put the case for procedural over declarative common sense representations more strongly than I should have. Multiple representations are clearly the way forward, as long as we have a means of moving between them.
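Minsky’s ‘lift’ example can be cartooned in the same spirit. The sketch below is mine, not his: the weight thresholds and the descriptions of each micro-sense are invented for illustration, and a real network of micro-senses would be vastly richer than a chain of if-statements.

    # Toy sketch of context-driven shifts between 'micro-senses' of a word.
    # The thresholds and sense descriptions are invented placeholders.

    def micro_sense_of_lift(weight_grams: float) -> str:
        """Pick an implication of 'lift' appropriate to the object's weight."""
        if weight_grams < 10:
            return "a flick of the fingers, barely attended to"
        elif weight_grams < 5_000:
            return "one hand, watch your grip"
        elif weight_grams < 50_000:
            return "two hands, bend your knees"
        else:
            return "don't try it alone: use a machine or ask for help"

    for w in (1, 1_000, 20_000, 200_000):
        print(f"lift {w}g: {micro_sense_of_lift(w)}")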
The second major point I want to make relates to the way that we think of ‘common sense knowledge’, in terms of how we define it and its psychological qualities.

It feels salient somehow that there is a common, well-known word/phrase in English for ‘common sense’, whose meaning when applied to different situations is itself common-sensical, and which feels as though it’s pointing to some neatly delineable set of representations or methods. In view of this, I hoped to be able to come up with a definition that even more neatly captured the distinction between common sense and intelligence.

Starting with Minsky’s slightly tongue-in-cheek definition of intelligence as being what’s exhibited “when you see someone do something you want to be able to do”, I contrasted this with a preliminary redefinition of common sense as:

the baseline level of performance on tasks in an environment you’re used to

This was intended to capture the fact that common sense tends to be used as a kind of binary measure of whether or not some ability or knowledge should be expected of every adult. I will return to this notion of common sense as a threshold measure of difficulty below.

By adding the caveat ‘in an environment you’re used to’, where environment is intended to encompass everything from jungles to classes, I was hoping to make room for the kind of relativisation of common sense by culture and specialisation/expertise that we clearly see. For instance, it’s common sense to computer scientists that if I’ve only typed a couple of characters and yet my C compiler suddenly reports tens of errors, I should start by looking for a missed semi-colon or bracket; but that wouldn’t be common sense to a novice programmer. In a similar fashion, the common sense that I have would be only barely applicable if I were wandering around in a space suit on the moon. Think of the things that my common sense would get wholly wrong: my estimations of the way my body moves and the physical limitations of what is possible, my ‘naive physics’ understanding of how flags move in the wind and how chemicals react, how to survive, social interactions with other astronauts. These sound like small things, but if we were to take a less extremely different environment, the usefulness of our common sense would be proportionately affected. Our environment is defined in terms of our embodiment and the way it limits and affects our interactions, which is why common sense is so human-centred. Common sense is domain-specific expertise where the domain is the everyday.

The hope was that I could contrast this with intelligence, which is a more ‘inventive’ process for solving more novel or unexpected problems, perhaps in unfamiliar domains, perhaps in the abstract, for learning new rules and facts quickly and relating them together, often involving a deeper search and more difficult cognitive abilities like dynamic chunking and recursion.

Although this definition of common sense works reasonably well, I’m going to try to argue for a stronger statement, namely that common sense is:

a minimum set of optimised representations that allow us to be ‘intelligent’ in a new knowledge-domain
It seems key to me that the best way to pin down common sense knowledge is as the things that are always left implicit and unstated. The assumption that Minsky makes is that this is because it’s knowledge that everyone shares, and so it would be redundant to write it down every time. I briefly entertained the idea that common sense is knowledge that doesn’t get written down because it can’t easily be written down, or because our linguistic concepts bootstrap themselves out of our common sense knowledge; but of course, common sense can’t be wholly sub- or non-linguistic, since we can put it into words for the most part if we try (witness Open Mind).

So instead, one possibility is that the threshold of common sense is the point at which words and concepts become rich enough internally and strongly enough inter-connected for us to be able to talk about them and use them with ease, familiarity and confidence. The suggestion is that at the core of any human knowledge-domain is a set of models, methodologies, scripts and facts which have been expanded, optimised and inter-referenced to allow rapid searching to only a couple of levels deep, with exceptions and inconsistencies already clearly marked and bounded. These common sense cores may use a different, more redundant representation than knowledge in domains in which our common sense doesn’t apply. I’m imagining something analogous to the way in which important speed-intensive loops or inline functions are optimised by modern compilers.

This would fit with our vague intuition that common sense knowledge is somehow distinct from normal knowledge, and also from the kinds of abilities that we consider intelligent, and that this difference can be seen from the outside both as a measure of performance and in terms of the familiarity of the task. It would also explain why one can’t be intelligent in a domain without first having some common sense notions of how to operate within it.
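As an illustration of what such an optimised core might amount to (and this is only an analogy of mine, with invented facts and an arbitrary structure, not a claim about how Minsky or the brain does it), one can precompute the chains of inference in a small knowledge base ahead of time, the way a compiler inlines a hot function, so that queries at runtime never search more than a level deep, and exceptions are marked up front rather than rediscovered:

    # Toy sketch of an 'optimised common sense core': chains of inference
    # are precomputed ('inlined'), so lookups never search more than a
    # couple of levels deep, and exceptions are marked rather than derived.
    # The facts and the structure are invented placeholders.

    RULES = {
        "bird": ["has-feathers", "flies"],
        "penguin": ["bird"],
    }
    EXCEPTIONS = {("penguin", "flies")}   # marked up front, not rediscovered

    def compile_core(rules, exceptions):
        """Expand each concept's inherited properties ahead of time."""
        core = {}
        for concept in rules:
            props, frontier = set(), [concept]
            while frontier:
                c = frontier.pop()
                for p in rules.get(c, []):
                    if p in rules:          # p names another concept: recurse
                        frontier.append(p)
                    elif (concept, p) not in exceptions:
                        props.add(p)
            core[concept] = props
        return core

    CORE = compile_core(RULES, EXCEPTIONS)

    def query(concept, prop):
        """Depth-1 lookup at runtime: all the search was done at compile time."""
        return prop in CORE.get(concept, set())

    print(query("penguin", "has-feathers"))  # True
    print(query("penguin", "flies"))         # False, exception already applied

The expansion wastes space through redundancy, which fits the suggestion above that common sense cores may use a different, more redundant representation than ordinary knowledge.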
Strangely enough, my initial idea was to try to conflate common sense and intelligence, viewing them as the same processes being applied to either familiar or novel problems. I took the view that more or less all of the ideas about how we learn and ways to think were applicable to common sense knowledge. I also took the view that any sort of approach that focused on storing knowledge, rather than generating knowledge from simulations and models, would probably face an impossible combinatorial explosion, and that predicate calculus in particular was singularly ill-suited to the task of indexing knowledge by relevance. I then independently argued for a view of common sense knowledge as a highly-optimised, expanded, dense core at the heart of various familiar domains, on which the processes we usually consider ‘intelligent’ depend. I now can’t decide whether a preference for procedural, simulation-based representations is consistent with this latter view. It may be that while procedural representations are more memory-efficient, declarative knowledge (e.g. a vast storehouse of ‘facts’) would be better suited to this optimisation of highly-relevant common sense knowledge.
---

Intentionality

On a side note, I wasn’t sure whether the use of the term ‘intentionality’ in section 6.4.1 was quite how it’s usually used in philosophy. Intentionality in philosophy is summarised well by Daniel Dennett [3]: “Some things are about other things: a belief can be about icebergs, but an iceberg is not about anything … The term was coined by the Scholastics in the Middle Ages, and derives from the Latin verb intendo, meaning to point (at) or aim (at) or extend (toward). Phenomena with intentionality point outside themselves, in effect, to something else: whatever they are of or about.” It feels as though a more appropriate word for the seemingly wilful, purposive kind of intention (in the non-philosophical sense) that goals are described as having in section 6.4.1 would be ‘teleology’. Unfortunately, here the alliterative allure of the ‘intensity of intentionality’ may have proven a false friend :).