First draft February 98, September 99, ... (please ask before quoting)



Carlo Penco
penco@unige.it
Three alternatives on contexts*




Context is a concept used by philosophers and scientists with many different definitions. Since Dummett we speak of the "context principle" in Frege and Wittgenstein: "an expression has a meaning only in the context of a sentence". The context principle finds an extension in some of Wittgenstein's ideas, especially in his famous passage where he says that "to understand a sentence is to understand a language". Given that Wittgenstein believes that "the" language does not exist but only language games exist, we should conclude that he is speaking of the need to consider any sentence always in the context of a language game1. This general attitude is certainly attuned to the contemporary tendency to place contextual restrictions on the interpretation of our sentences. However, we find so many kinds and forms of restrictions that this general attitude is not enough to give us a viable tool for finding an order in the web of so many different theories of context. To look for an order or, at least, a clarification, we may start with two contrasting paradigms of theories: the "objective" theories of context, where context is a set of features of the world, and the "subjective" theories of context, where context is the cognitive background of a speaker or agent with respect to a situation2. We have here not only two different ways of using the term "context" but also two different conceptions of semantics and philosophy. The two conceptions are normally associated, respectively, with the classical paradigm of model theoretic semantics (Kaplan, Lewis, Stalnaker) on the one hand and with the A.I. paradigm (McCarthy, Buvac, Giunchiglia) on the other. For the sake of simplicity I will restrict my attention3 mainly to Kaplan 1989 and to McCarthy 1993 and Giunchiglia 1993. The two different conceptions can be summarised with the following schema:

a) context as a set of features of the world:

"context is a package of whatever parameters are needed to determine the referent ... of the directly referential expressions" [Kaplan 1989]

"each parameter has an interpretation as a natural feature of a certain region of the world" [Kaplan 1989]

b) context as a set of assumptions on the world (+ rules):

"context is a group of assertions closed (under entailment) about which something can be said" [McCarthy 1993]

"a theory of the world which encodes an individual's perspective about it" [Giunchiglia 1993]



In "Afterthoughts" Kaplan speaks explicitly of the "metaphysical" point of view in describing contexts, while in "Notes on formalizing contexts" McCarthy uses a notion of context which leads to the idea of "microtheory" (Guha) or towards the idea of a subjective point of view on the world (Giunchiglia). Given these differences I will label the two conceptions of context as follows:
(a) "objective" or "metaphysical" (ontological) theory of context.
(b) "subjective" or "cognitive" (epistemic) theory of context.
We have here two very different interpretations of what a context is: features of the world or representation of features of the world. Apparently the concern of the cognitive theory is wider than the metaphysical one; the cognitive theory is concerned with any feature of the world, not only with the limited set devised by Kaplan (however enlarged by Lewis4). An inviting picture is often tacitly assumed: the two theories seem to correspond to two contrasting philosophical stances and two different kinds of formalism:
(a) the metaphysical theory is an expression of realism or objectivism and goes hand in hand with model theoretic semantics (particularly with the direct reference theory and with the double indexing).
(b) the cognitive theory is an expression of an anti-realist attitude, typical of cognitivism and subjectivism; it goes hand in hand with computational, mostly syntactic, solutions (with predicates of belief which take names of propositions as arguments).
I don't think this pairing of theoretical interpretations and kinds of formalism is correct; on the contrary, it seems to give an oversimplified and misleading picture. To link multi-context theories with a subjectivist view is a dangerous step which would impose a useless restriction on such theories. On the other hand, it would be possible to use model theoretic semantics to represent a subjective point of view (think also of autoepistemic logics). However, to make the contrast simpler, I will keep this general oversimplification as a starting point.
The discussion, eventually, should be conducted at a logical level. From a more specific logical approach, we may refer to the contrast between model theoretic semantics and local model semantics5. Which formalism can better express our basic intuitions on the working of our language and reasoning? Shall we have a radical opposition, or may we find an equivalence relation between the two paradigms? After all, alternative paradigms sometimes converge (think of the unexpected equivalence proof between Montague grammar and transformational grammar given by Barbara Partee in the seventies).
In this paper, however, I will not deal with a confrontation at a logical level; I will discuss instead some philosophical aspects of the contrast between Kaplan's theory of demonstratives and McCarthy's theory of commonsense reasoning. Contrasting the two kinds of theories we are offered different possible strategies:
(1) the two theories are dealing with different problems and must be kept and developed each for its own matter.
(2) the two theories overlap to a large extent and should co-operate, each solving problems which the other cannot solve.
(3) the two theories are reducible one to the other, and it is to be decided which direction is the most promising.
As often happens, probably none of these possibilities is the right one; a more realistic and promising alternative could be a work of convergence which combines the best of each approach. The three alternatives, however, deserve careful study, because the problems posed by each of them can help to enrich our understanding of the possibilities for future research.

(1)

A SEPARATIST VISION

A separatist vision stresses the difference of aims and problems to be solved by the two kinds of theories.

The theory of the metaphysical context has been devised in order to treat the peculiar logical behavior of indexicals (expressions like "I", "here", and so on). In classical semantics it was impossible to give a correct semantic value to sentences with indexicals because of their dependence on context. The classical example given by David Kaplan:

"I am here now"

is a sentence which is always true; however, it is not a necessary truth, because we cannot say that it is true in all possible worlds: I might have been somewhere else. Kaplan 1977 (sections VI-VII) proposed a solution for the formal treatment of this kind of sentence (which, following Kripke's terminology, we might call "contingent a priori"). We have to distinguish between two indexes at which to evaluate sentences: on the one hand we make an evaluation at all circumstances (time and possible world), on the other hand we make an evaluation at the context of utterance (speaker, time and location). From this work onwards logicians began to speak of "double indexing"6 to indicate this novel treatment of semantic evaluation. Double indexing is a tool to evaluate two different aspects of indexicals. One aspect deals with the objective context of utterance and concerns the linguistic meaning of the indexical, its "character", understood as a function that, given the context, yields the "intension" or "content" of the indexical; the other aspect deals with the circumstances at which that content is evaluated. E.g. the character of "I" is a function which gives, for each context, the way to refer to the speaker of the utterance in any possible world. It gives the "intension" of "I" as used in that context, that is, the constant function which yields the same individual at each possible world.
In short, Kaplan develops the main idea of model theoretic semantics (the meaning of a sentence, its intension or content, is its truth condition), enriching it with a new level of semantic analysis, the level of character. While content or intension is a function from possible worlds to extensions, character is a function from contexts to contents. The peculiar behavior of indexicals is summed up by saying that indexicals (and demonstratives in general) have stable content (they are rigid designators) and unstable character (they map onto different contents, depending on the context).
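The two levels just described can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not Kaplan's formal system: the names `Context`, `Circumstance`, `char_I`, `LOCATED` and `true_in_context` are hypothetical, the individuals and places are invented, and a `world` field is included in the context so that a sentence can be evaluated at its own context of utterance.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Circumstance:
    """Circumstance of evaluation: a possible world and a time."""
    world: str
    time: int


@dataclass(frozen=True)
class Context:
    """Context of utterance (hypothetical encoding of the parameters)."""
    speaker: str
    location: str
    time: int
    world: str  # assumption: lets us evaluate a sentence at its own context


# Content: circumstance -> extension.  Character: context -> content.
Content = Callable[[Circumstance], object]
Character = Callable[[Context], Content]


def char_I(c: Context) -> Content:
    # The content is a constant function: same individual at every circumstance.
    return lambda circ: c.speaker


def char_here(c: Context) -> Content:
    return lambda circ: c.location


def char_now(c: Context) -> Content:
    return lambda circ: c.time


# Toy model (invented data): where an individual is at each (world, time).
LOCATED = {
    ("actual", 1, "Carlo"): "Genoa",
    ("w2", 1, "Carlo"): "Rome",  # the speaker might have been somewhere else
}


def i_am_here_now(c: Context) -> Content:
    """Character of "I am here now": yields its content in context c."""
    who, where, when = char_I(c), char_here(c), char_now(c)

    def content(circ: Circumstance) -> bool:
        # Only the world comes from the circumstance; the indexicals are rigid.
        return LOCATED.get((circ.world, when(circ), who(circ))) == where(circ)

    return content


def true_in_context(character: Character, c: Context) -> bool:
    """First index: evaluate the content at the context's own circumstance."""
    return bool(character(c)(Circumstance(c.world, c.time)))


c = Context(speaker="Carlo", location="Genoa", time=1, world="actual")
content = i_am_here_now(c)
```

In this sketch `true_in_context(i_am_here_now, c)` holds for any context whose speaker really is at the context's location, while `content(Circumstance("w2", 1))` fails: the sentence is true in its context without being true at every circumstance.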

The theory of cognitive context has been devised in artificial intelligence to solve a problem of common sense reasoning. After the attempts made with non-monotonic logic, especially circumscription, McCarthy thought that one problem was still unanswered: the problem of generality. Any system of axioms can be transcended: we may always find a wider context where the axioms are not valid. We therefore need always to keep in mind the cognitive context in which we reason, and to make every assertion relative to a context. To achieve this, we may use operations or rules among contexts, like:
- entering and exiting a context,
- discharging some sentence, true in some context, but false in a wider context.
- lifting some sentence true in some contexts into another context, verifying in this way different kinds of compatibility among contexts.
There is another aspect which has to be taken into account, which could be called the "principle of laziness". Laziness guides most of our intellectual operations: we use the minimal set of information needed to solve a problem, importing information only when new facts come on the scene. Reasoning is always "local" to some context. In order to solve a problem raised by some novel information we may always create a new context (a "working context"), importing information from other contexts: stereotypical contexts, databases, partitions of knowledge representation (we have to remember that a point made in A.I. since the beginning is the partition of our knowledge into sub-theories, from toy worlds to frames and scripts or partitioned representations7). A working context can be thought of as a context in which it is possible to put the minimal set of axioms and rules necessary to solve a given problem.
The basic principles behind this strategy are the principles of locality and of compatibility. In short, the two principles mean that, on the one hand, reasoning is always local and, on the other hand, most contexts share rules, strategies and information which make it possible to navigate among them (example: I may assume that p is true in context A; then enter context A and derive q; eventually exit context A and assert that q is true in A). The study of the rules among contexts is one of the most promising novelties in this field of research. The framework inside which this work is done is the formal treatment of common sense reasoning, default reasoning and problem solving in actions.
These operations or rules across contexts help to give a general framework for defining contexts as rich formal objects, a new tool for the analysis of reasoning. Actually McCarthy remarks that we cannot expect a definition of the concept of context in AI; we cannot expect to know what context is: "instead, as is usual in AI, various notions will be found useful" (1993, p. 1). Still, in most of the works on contextual reasoning, contexts are given as assumptions associated with some circumstance; we shall therefore maintain the distinction between contexts (sets of assertions representing the cognitive state of an individual or a group) and situations (states of the world at a certain time)8.
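The example in parentheses (assume p in A, enter A, derive q, exit A) can be made concrete with a small Python sketch. It is a toy under stated assumptions, not McCarthy's or Giunchiglia's formal system: the class `MultiContext`, its methods and the rule format are hypothetical names introduced here; the tag "ist" simply marks "is true in" a context; and `lift` copies a sentence unconditionally, whereas real lifting rules would carry compatibility conditions.

```python
def close(facts, rules):
    """Forward-chain rules of the form (premises, conclusion) to a fixed point."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


class MultiContext:
    """Hypothetical container: outer assertions plus local rules per context."""

    def __init__(self, rules):
        self.rules = rules   # context name -> list of (premises, conclusion)
        self.outer = set()   # assertions of the form ("ist", context, sentence)

    def assume(self, ctx, sentence):
        """Assert from outside that `sentence` is true in `ctx`."""
        self.outer.add(("ist", ctx, sentence))

    def enter(self, ctx):
        """Entering a context: work with its sentences without the wrapper."""
        return {s for (_, c, s) in self.outer if c == ctx}

    def reason(self, ctx, local):
        """Local reasoning: only the rules of `ctx` are used."""
        return close(local, self.rules.get(ctx, []))

    def exit(self, ctx, local):
        """Exiting a context: re-wrap what was derived as true in `ctx`."""
        self.outer |= {("ist", ctx, s) for s in local}

    def lift(self, sentence, source, target):
        """Unconditional lifting (assumption: no compatibility check)."""
        if ("ist", source, sentence) in self.outer:
            self.outer.add(("ist", target, sentence))
            return True
        return False


mc = MultiContext(rules={"A": [(("p",), "q")]})
mc.assume("A", "p")            # assume that p is true in context A
local = mc.enter("A")          # enter A
local = mc.reason("A", local)  # derive q locally
mc.exit("A", local)            # exit A: q is true in A
lifted = mc.lift("q", "A", "B")
```

After the run, `mc.outer` contains the assertion that q is true in A, obtained only through reasoning local to A.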

What conclusion can we draw from this first glance at the two theories of context? The first conclusion cannot be anything but a modest answer: we have two theories with different purposes, different logical environments, different formalisms. Let us keep an eye on both of them and on their developments, but let us not try to mix oil and water.
This answer is too modest, because of the easy interconnections between the two theories. On the one hand, at the beginning of the definition of "index" in model theoretic semantics, Lewis considered the possibility of including in the index also the speaker's beliefs or background knowledge, so that indexes could become somehow "cognitive". On the other hand, the theories of cognitive context have to face the problem of the context of utterance and/or the context of the "external observer". After all, cognitive theories of context have been devised for treating commonsense reasoning, and in reasoning we use indexicals and demonstratives; how are we to cope with them? May we find some kind of integration between the two theories? In the following I will explore some possible directions for this option.

(2)

AN INTEGRATION VIEW

Perry has insisted in many papers on the cognitive difference between character and content9, and on the relevance of this difference in relation to belief and behavior. Just two examples in a rough reconstruction:
- I am in a supermarket and I see sugar on the floor; I think something like "He who is pouring sugar on the floor is really stupid; (therefore) I will go to the cashier to protest". Later I realize that the sugar is falling from my own pack of sugar and I think something like: "I am pouring sugar on the floor; (therefore) I will turn the pack of sugar over". Here the indexicals "he" and "I" have different characters and the same content (me, who am the same in all possible worlds). Only the difference of character prompts the difference in practical inferences.
- I am near a mirror and I see a bear attacking somebody; I believe he (the prey) is very unfortunate and runs the risk of being killed, and I am very sad for him. A moment later I realize that the reflection in the mirror is a reflection of me, and the bear is attacking me. I believe that the best thing to do is to run as fast as I can. In these two cases, the different character of "he" and "I" prompts two different lines of reasoning and action, linked to two different cognitive states. The content of my thoughts is the same, but the characters are different.

However the Logic of Demonstratives devised by Kaplan allows only for the general strategy that permits us to give a content from a character + a context; we need something more for dealing with the differences envisaged by Perry. We need a theory which may help us to represent the cognitive relevance of the distinction between character and content. As Perry has abundantly shown, the difference in character has consequences on my cognitive state, on the set of my beliefs and on the inferences I may derive from what I say. It is a tempting suggestion to consider the two theories as co-operating on different levels towards an integrated theory: the Logic of Demonstratives (LD) will represent the mechanism which makes it possible to derive the content from the character; a Multi-Context Theory (MC) may represent the mechanism which makes it possible to show the different cognitive contexts in which such a derivation is admissible. MC will represent the relations among contexts which permit different kinds of inferences depending on the indexical used by the speaker in the objective context.
Think of a description of the two different contexts exemplified above:
(c1) a person is attacked by a bear without realizing that he himself is attacked
(c2) a person is attacked by a bear and he realizes that.
We might have axioms (using McCarthy's operator "is true") such as
in c1: is true that the person referred to as "he" is attacked by a bear
in c2: is true that he(c1) = I
therefore
is true that I am attacked by a bear
Given that a general rule for reacting to an attack by a bear is to run away, if in c1 I believe that he has to run away, then in c2 I believe that I have to run away. However, in c1 I do not have the identity between the token "he" and the token "I". In c1 the person attacked by a bear does not run away and he is killed. That is, in c1 I am killed.
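The step from c1 to c2 can be sketched as a bridge between two small sets of assertions. The following Python fragment is only an illustration under stated assumptions: the predicate names, the function names `local_closure`, `bridge` and `acts`, and the encoding of assertions as (predicate, term) pairs are all hypothetical and are not taken from McCarthy's formalism.

```python
def local_closure(facts):
    """Apply the general rule inside one context: who is attacked should run."""
    derived = set(facts)
    for predicate, who in facts:
        if predicate == "attacked_by_bear":
            derived.add(("should_run", who))
    return derived


def bridge(source_facts, identities):
    """Import facts into another context, rewriting terms via known identities.

    `identities` stands for assertions such as he(c1) = I, available only
    in the target context.
    """
    return {(pred, identities.get(who, who)) for pred, who in source_facts}


def acts(facts):
    """Only a first-person conclusion prompts the agent to run."""
    return ("should_run", "I") in facts


c1_axioms = {("attacked_by_bear", "he")}

# c1: no identity between "he" and "I" is available.
c1 = local_closure(c1_axioms)

# c2: the identity he(c1) = I is available, so the fact is imported as about "I".
c2 = local_closure(bridge(c1_axioms, {"he": "I"}))
```

In c1 the conclusion is only that "he" should run, so `acts(c1)` is false; in c2 the same content, reached through the identity, yields a conclusion about "I", and `acts(c2)` is true.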
McCarthy speaks of the neutrality of his idea of context from a philosophical point of view; according to him contexts are a mathematical tool, like groups. A theory of contexts should be considered, like group theory, as a theory which can be applied wherever it can be applied. However, it is difficult even to think of a theory of context in McCarthy's sense as formalizing Kaplan's idea of context. In Kaplan a context is just a set of parameters, features of reality, while in McCarthy contexts are sets of assertions. It might be possible to embed a Kaplan-style theory of context inside a multi-context theory, using a different name for Kaplan's contexts (for instance "situations"). Actually some attempts have been made to apply a standard model theoretic semantics (Kripke models) to multi-context theories. However, there is serious doubt about the utility of such a compromise, while there are well grounded attempts to build a different kind of semantics for multi-context systems10. Both a philosophical and a technical point can be used to suggest an alternative view, where the theory of objective context is reducible to the theory of cognitive context (warning: without reducing the objective context or situation to the cognitive context).

(3)

A REDUCTIONIST VIEW

The cases given above (Perry's supermarket and Perry's bear) just reveal which inferences the Logic of Demonstratives cannot account for. However, we cannot ask a theory to do a job it has not been devised for. LD's work consists in making the step from contexts to contents. We cannot ask more from this theory, which in itself represents the best treatment of indexicals.
However we may think of a general problem for LD; in order to work properly, LD presupposes we assign certain values to the parameters (speaker, location, time). What happens when we are not able to give a fixed value to the parameters? We may make a list of situations in which such a problem arises:

- situations of dialogue
(continuous shift among different "I")
- situations of vagueness
(when "here" depends on an intended "there")
- situations of lie
(when "here" is uttered to mean somewhere else)
- situations when cognitive context is relevant
(in general)
Let us take some examples. Kaplan gives much importance to the fact that utterances like "I am here now" are always true in the context. However, I could say "I am not here now" on an answering machine, or I could write "I am not here now" on a post-it and attach it to the door. If this token is true in the context, then "I am here now" is false. However, while uttering or writing the token "I am not here now", I am there, and the token should be considered true. Or imagine, coming back to the office, writing "I am here now" to inform people that you are back; the intended meaning is that you are inside the room. However, you are writing on the post-it while you are in front of the door. Shall we say that the utterance is true just at the moment you are in front of the door? Not really. In order to interpret this token we need a cognitive context which asserts that "here" corresponds to the place where the speaker is typically thought to be (that is, "here", here, means beyond the door and not in front of the door). The example could be extended by taking the case where you utter "I am here now" to tell somebody what to write on the post-it. In this case you would use "I" as if in quotation marks. You will say: Kaplan's theory requires that we do not treat quotations. Besides, the sentence is not strictly true and depends on the time of its reception and not on the time of its utterance (this is another typical distinction in the literature). But a hearer who arrives at that moment would hear just an utterance of "I am here now" and would probably take it at face value. She, the listener, has no hint as to which kind of token she is hearing, because she lacks the relevant cognitive context. Another example (given by somebody at a Conference on Context in Sophia, 1997): I pretend to leave town to go to a conference somewhere else, but I go to visit my mistress at a conference in town. From there I telephone my wife saying "I am here now. We will meet tomorrow as agreed." Is my utterance true or false? I am at a conference, but I am not out of town. You will say: it is a half truth. Kaplan's theory requires that we rely on sincere assertions. However, I am sincere. I am at the conference, here, now. My wife certainly understands that I am at the conference, there, out of town. I tell the truth and I am sure my wife understands something false. LD just tells you that the speaker is in town, making the utterance true. It gives no hint at all for understanding what is really going on.11
We might treat the previous examples under the "integration view", where cognitive context has just the role of fixing the value of indexicals, and LD begins after that. However, if we find too many uses of indexicals which require ad hoc adjustments, and many interesting uses of indexicals which cannot be accounted for in LD, we might think of an alternative paradigm. We have in the literature different suggestions for treating the multiform use of indexicals12; Quentin Smith suggests a rule for treating indexicals like "I" as referring to entities which have a relation with the speaker (with the speaker itself as a limiting case). This takes into account sentences like "I am short of petrol", and so on. Recanati suggests that indexicals and demonstratives are not really self-reflexive tokens, but tokens whose linguistic meaning (or character) is basically intended to pick out some "relevance" relation: "here" picks out the relevant place, "now" picks out the relevant time, and so on. Being the place and time of utterance is just one of the many possible relevance relations. Certainly, since Lewis, there have been many attempts to enrich the metaphysical "context" with background knowledge and standards of precision. However, as background knowledge becomes more and more important in principle, we find it very difficult to give it a formal treatment, even in the setting of the already complex arrangements of model theoretic semantics. A search for alternative treatments could help in imagining new strategies of research.
As said before, the traditional answer of multi-context theories is that background knowledge is partitioned. Sentences and utterances are interpreted relative to local models depending on cognitive contexts, which could be interpreted as partitions of the background knowledge. We need a representation of these different partitions of our knowledge and of the rules which define accessibility among these partitions, which is exactly the aim of multi-context theories. But we cannot always take a well defined partition as a starting point; we may build it up via rules among other already defined partitions. Therefore we need some mechanism to build up our cognitive contexts as the reasoning advances. We already have in the literature some tools useful for this purpose: some formal elaborations of bridge rules and other operations among contexts, and the idea of working contexts, whose significance is partly still to be worked out.13 If these ideas are put to work, the general framework of multi-context theory could be thought of as a unified way of treating all the cases LD is able to treat AND all the cases LD is not able to treat.
What do these programmatic remarks mean for our intuitive opposition between objective and cognitive context? The main point is that we cannot speak of objectivity except from within some point of view.14 The idea of an objective reality, independent of our access to it, is derivative with respect to this relativistic position. This is confirmed by the suggestion that the idea of an objective reality independent of us arises when there is a conflict among different opinions and beliefs15. This does not mean that objective reality depends on a point of view; it means that we cannot express objectivity without placing ourselves in some contextual point of view16. We therefore have to give constraints which go beyond a general definition of a logic of demonstratives, which represents a metaphysical point of view from nowhere. Let us try to explain this point better. Given a simple case, with an individual and an observer, some general constraints dealing with indexicals and demonstratives could be expressed in the following way:17
(1) the meaning of a sentence depends on speaker, place and time;
(2) the value of these parameters (speaker, place and time) must be represented
(a) as a part of the cognitive state of such a speaker
(b) as a part of the cognitive state of an observer.
(3) speakers and observers may give different evaluations of these parameters; therefore we need a representation which always makes explicit the cognitive context from which the evaluation of the parameters is made. We may explicitly represent this point of view as the point of view of the interpreter.

A main point of these remarks is to treat indexicals inside a framework of defeasible reasoning. Expressions such as "I", "here", "now" have such different uses that we need to give rules to distinguish always not only the time and place of utterance, but also the time and place of the actual or intended audience, giving different restrictions when these are the same or different. What is more important, we need to plug into our formalism some rule which permits an intended interpretation to be defeated in the face of new information on these aspects. We may give a formal treatment of the working of our language as if there were an absolute point of view from which to give values to any parameter we need. However, in our linguistic interchange we just aim at objectivity and truth, and we have to embed explicitly in our formal representation of objectivity and truth the possibility of failure.
This result does not entail the elimination of an objectivity independent of human accessibility. It is a result about our forms of expressing objectivity as what we provisionally reach; we may also build theories with an as-if condition (if the evaluations of the parameters are given, the theory works as in Kaplan). But evaluations cannot always be given, and most often, when given, they are wrong. Our ontology is what we say the world is made of; therefore we need to take into account every time the point of view, the cognitive context where the objective state of affairs is presented as such. What we think objective may always turn out to be a mistake.
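A minimal Python sketch may help to fix the idea of a defeasible, interpreter-relative evaluation of "here", using the post-it example discussed earlier. Everything in it is an assumption made for illustration: the class `Interpreter`, the belief key `intended_place` and the function `interpret_here` are hypothetical names, and a single overriding belief stands in for what would be a richer set of defeasible rules.

```python
from dataclasses import dataclass, field


@dataclass
class Interpreter:
    """The point of view from which the parameters are evaluated."""
    name: str
    beliefs: dict = field(default_factory=dict)  # the cognitive context


def interpret_here(place_of_utterance, interpreter):
    """Evaluate "here" by default, unless the interpreter's beliefs defeat it.

    Default: "here" is the place of utterance (as in Kaplan's theory).
    Defeater: a belief about the place the speaker intends to refer to.
    """
    intended = interpreter.beliefs.get("intended_place")
    value = place_of_utterance if intended is None else intended
    return {
        "value": value,
        "point_of_view": interpreter.name,  # always made explicit
        "by_default": intended is None,
    }


passerby = Interpreter("passerby")
colleague = Interpreter("colleague", {"intended_place": "inside the office"})

before = interpret_here("in front of the door", passerby)
other = interpret_here("in front of the door", colleague)

# New information reaches the passerby and defeats the default reading.
passerby.beliefs["intended_place"] = "inside the office"
after = interpret_here("in front of the door", passerby)
```

The same token receives different values from different points of view, and the passerby's first evaluation is withdrawn once new information arrives: the possibility of failure is part of the representation.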

Conclusions

At most this paper could help to stimulate a comparison between theories, from both a technical and a philosophical point of view. The success of model theoretic semantics could give suggestions even in a different framework, where researchers deal with problems - such as limited knowledge - which were not the basic preoccupations of people working with the traditional logical methods.

At least this paper is supposed to give some material for reasoning about the different ways we use the term "context". Context is what we know about a situation: we might speak of "situations" as the set of physical features of reality (and fiction), and of "context" as the way of representing them. Alternatively we might speak of "contexts" as the physical features of reality (and fiction), and of "views" as the way of representing them. We might also carry on using the same term for different entities, using "context" both for some physical features (speaker, time and location) and for a representation of our knowledge of a situation. But we may do that only insofar as the two contexts in which we do that do not come into contact. When they do, we need to make a choice.



BIBLIOGRAPHY

  • Bianchi C. 1999 "Three forms of contextual dependence", in P.Bouquet, L.Serafini, P.Brézillon, M.Benerecetti, F.Castellani (eds.) Modeling and using context - Lecture Notes in Artificial Intelligence 1688, Springer, 1999 (67-76).