2. Carnap’s Early Theories of Favoring
Carnap’s central idea is that a body of evidence favors one hypothesis over another just in case it renders the former more probable than the latter. To make this precise, he first represents evidence and hypothesis propositions in a first-order formal language ℒ. Such a language has predicates (G, H, …) that can be applied to constants (a, b, …) to form atomic sentences (Ga, Hb, …). By combining atomic sentences with sentential connectives, we build more complex sentences. Carnap’s languages are interpreted: each constant represents a particular object in the world; each predicate represents a property such objects may display; each sentence represents a particular proposition.
With his interpreted language ℒ in place, Carnap defines a real-valued function 𝔠 on ordered pairs of sentences in ℒ. 𝔠(h, e) measures how strongly the evidence represented by e confirms the hypothesis represented by h. The proposition represented by e favors the proposition represented by h₁ over the proposition represented by h₂ just in case 𝔠(h₁, e) > 𝔠(h₂, e).
The substance of Carnap’s theory comes from the particular values of 𝔠. 𝔠 is defined in terms of a real-valued one-place function 𝔪 on the sentences of ℒ. For any sentences x and y of ℒ (with y non-contradictory),

𝔠(x, y) = 𝔪(x ∧ y)/𝔪(y)    (1)

Carnap requires that 𝔪 be a probability function in the sense of Kolmogorov [5]: non-negative, normal, and finitely additive. He also requires that 𝔪 be regular, in the sense that it assigns positive values to all non-contradictions. Combined with Equation (1), these requirements make 𝔠(·, e) a probability function for any non-contradictory e. This squares with Carnap’s central idea that 𝔠(h, e)—the degree to which e confirms h—is just the probability of h on e.
Carnap then needs to specify values for 𝔪. He proves that a probability distribution over ℒ can be fully specified by assigning values to ℒ’s state descriptions. A state description is a maximal consistent conjunction of literals of ℒ. (A literal is either an atomic sentence or its negation.) Any non-contradictory sentence x is logically equivalent to a disjunction of state descriptions; that disjunction is called x’s disjunctive normal form.
For example, take a simple language ℒ₁ with two constants a and b and one predicate G. Table 1 lays out a set of state descriptions for ℒ₁. (The table’s leftmost column provides a name, s₁ through s₄, for each state description.) Suppose we assign 𝔪-values to each of the four state descriptions in Table 1. Carnap proves that the state descriptions’ 𝔪-values generate a probability distribution over ℒ₁ just in case they are non-negative and sum to 1. If that condition is met, the 𝔪-value of any non-contradictory x can be calculated by summing the 𝔪-values of the state descriptions in its disjunctive normal form. (A contradictory x receives an 𝔪-value of 0.)
Take the 𝔪† column of Table 1, for instance. Setting aside for the moment where 𝔪† comes from, the 𝔪-values specified for the state descriptions of ℒ₁ are clearly non-negative and sum to 1. This induces a probability function over the entire language. For example, since the disjunctive normal form of the sentence Ga is the disjunction of s₁ and s₂, 𝔪†(Ga) is the sum of the 𝔪†-values in Table 1’s first two rows. In other words, 𝔪†(Ga) = 1/2.
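This bookkeeping is easy to mechanize. The following sketch (in Python; the names and the index convention for the four state descriptions are my own, not Carnap’s) represents a sentence by the set of state descriptions in its disjunctive normal form and computes 𝔪-values by summation:

```python
from fractions import Fraction

# State descriptions of the simple language: index 0 = Ga & Gb,
# 1 = Ga & ~Gb, 2 = ~Ga & Gb, 3 = ~Ga & ~Gb. A sentence is represented
# by the set of indices of the state descriptions in its DNF.
Ga, Gb = {0, 1}, {0, 2}

def m(sentence, weights):
    """m-value of a non-contradictory sentence: the sum of the m-values
    of the state descriptions in its disjunctive normal form."""
    return sum(weights[i] for i in sentence)

# The symmetric assignment behind m-dagger: 1/4 to each state description.
m_dagger = [Fraction(1, 4)] * 4

print(m(Ga, m_dagger))  # 1/2, as computed in the text
```

Representing sentences as sets of state-description indices makes disjunction set union and conjunction set intersection, which is all the later calculations need.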
Carnap has now moved from the problem of specifying 𝔠-values to the problem of specifying 𝔪-values, and from the problem of specifying an entire 𝔪-distribution to the problem of specifying 𝔪 over state descriptions. He notes that for any x, 𝔠(x, ⊤) = 𝔪(x), where ⊤ is an arbitrary tautology in ℒ. So he thinks of 𝔪(x) as the probability of the proposition represented by x relative to a tautology, or relative to an evidence set containing no empirical information. The state descriptions of ℒ₁ describe the basic states of the world ℒ₁ is capable of discriminating among, and it is natural to think that, lacking any empirical information, each of these states should be treated symmetrically. That is, we should assign each state description the same 𝔪-value. This gives us the distribution 𝔪† in Table 1, which (applying Equation (1)) yields a confirmation function Carnap calls 𝔠†.
𝔠† captures some intuitive favoring relations. For example, a bit of calculation with Table 1 and Equation (1) reveals that

𝔠†(Gb, Ga ∧ Gb) = 1 > 0 = 𝔠†(¬Gb, Ga ∧ Gb)    (2)

Recall that evidence favors one hypothesis over another just in case the former receives a higher 𝔠-value on the evidence than the latter. So 𝔠† is suggesting a sensible favoring relation: since the evidence represented by Ga ∧ Gb entails the hypothesis represented by Gb while refuting the hypothesis represented by ¬Gb, that evidence favors the former over the latter.
But broadening our view reveals clear flaws in 𝔠† as an explication of evidential favoring. Suppose, for example, that a and b represent two emeralds to be sampled, and G represents the property of being green. We might think that the evidence represented by Ga favors the hypothesis represented by Gb over the hypothesis represented by ¬Gb. Yet from Table 1 we can calculate

𝔠†(Gb, Ga) = 1/2 = 𝔠†(¬Gb, Ga)    (3)

The problem generalizes. When we construct 𝔠† for larger languages, we find that even when many objects are represented, the evidence that all but one have the property G does not confirm the hypothesis that the last object has G over the hypothesis that it does not. As Carnap puts it, 𝔠† does not allow learning from experience.
The trouble is that 𝔪† is too symmetrical. To get learning from experience, we need a state in which all the emeralds are green to be more probable than a state in which the first run of emeralds is green but the last one is not. By treating all state descriptions symmetrically, 𝔪† renders our evidence incapable of discriminating among full states of the world consistent with that evidence.
So Carnap constructs a new confirmation function 𝔠* from the probability function 𝔪*. Instead of treating state descriptions symmetrically, 𝔪* treats structure descriptions symmetrically. Intuitively, a structure description describes the distribution of properties over objects at an abstract level, without specifying which objects occupy which places in the property distribution. A structure description might say “one sampled emerald is green while another is not,” without telling us which of the emeralds is the green one.
At the technical level, a structure description is a disjunction of state descriptions obtainable from each other by permuting constants. Our simple language ℒ₁ has three structure descriptions: s₁ by itself (“both emeralds are green”), the disjunction of s₂ and s₃ (“exactly one of the emeralds is green”), and s₄ by itself (“neither emerald is green”). 𝔪* assigns each structure description the same value; within a given structure description, that value is divided up equally among the state descriptions. The resulting 𝔪*-values are displayed in Table 1.
𝔠* solves the learning from experience problem, because

𝔠*(Gb, Ga) = 2/3 > 1/3 = 𝔠*(¬Gb, Ga)    (4)

This evidential favoring is possible because s₁ has a higher 𝔪*-value than s₂. There’s only one way to arrange the world such that both emeralds are green, so there’s only one state description in s₁’s structure description, and that state description gets the full 1/3 probability. Yet there are two possible arrangements in which exactly one emerald is green (a green first emerald, or a green second emerald), so s₂ has to share a structure description with s₃ and gets an 𝔪*-value of only 1/6. Symmetric treatment of structure descriptions yields asymmetric treatment of state descriptions, making the sought-after favoring relations possible.
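The two measure functions can be compared mechanically. Here is a minimal Python sketch (the index convention for the four state descriptions is my own) contrasting the symmetric state-description weighting with the symmetric structure-description weighting:

```python
from fractions import Fraction

# State descriptions: 0 = Ga & Gb, 1 = Ga & ~Gb, 2 = ~Ga & Gb, 3 = ~Ga & ~Gb.
Ga, Gb, not_Gb = {0, 1}, {0, 2}, {1, 3}

def c(h, e, weights):
    """Carnap's ratio definition: c(h, e) = m(h & e) / m(e)."""
    m = lambda s: sum(weights[i] for i in s)
    return m(h & e) / m(e)

# m-dagger: equal weight to each of the four state descriptions.
m_dagger = [Fraction(1, 4)] * 4
# m-star: equal weight (1/3) to each of the three structure descriptions
# {0}, {1, 2}, {3}, divided equally within each.
m_star = [Fraction(1, 3), Fraction(1, 6), Fraction(1, 6), Fraction(1, 3)]

print(c(Gb, Ga, m_dagger), c(not_Gb, Ga, m_dagger))  # 1/2 1/2 — no learning
print(c(Gb, Ga, m_star), c(not_Gb, Ga, m_star))      # 2/3 1/3 — learning
```

Only the weight vector changes between the two runs; the asymmetric weights that structure-description symmetry induces on state descriptions are what make the evidence discriminate between the hypotheses.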
But 𝔠* has a new problem: language dependence. Even with a language as simple as ℒ₁, we can construct a version of Goodman’s [6] “grue” problem. Suppose we define a language ℒ₂ with the same constants as ℒ₁, representing the same objects. But ℒ₂ has a predicate H related to G as follows: Ha has the same truth-value as Ga, while Hb has the same truth-value as ¬Gb.
Table 2 displays a set of state descriptions of ℒ₂ and their 𝔪*-values, calculated just as before. But it also reveals a further fact: each state description of ℒ₂ expresses the same proposition as a state description of ℒ₁. That means ℒ₂ can express every proposition expressible in ℒ₁: given any ℒ₁ sentence, we find its disjunctive normal form, replace its ℒ₁ state descriptions with the corresponding ℒ₂ state descriptions from Table 2, and are left with an ℒ₂ synonym for the ℒ₁ original. Each ℒ₁ sentence has a synonym in ℒ₂: an ℒ₂ sentence that expresses the same proposition as the ℒ₁ original.
Consulting Table 2, we find that

𝔠*(Hb, Ha) = 2/3 > 1/3 = 𝔠*(¬Hb, Ha)    (5)

But ¬Hb is a synonym for Gb, and Hb is a synonym for ¬Gb. Equation (5) indicates exactly the opposite favoring relation—relative to the same evidence—as Equation (4)!
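The reversal can be checked mechanically. The sketch below assumes the grue-style mapping on which Ha expresses the same proposition as Ga and Hb the same proposition as ¬Gb; since 𝔠* is computed from a language’s own state-description structure, the same weight vector applied under the two descriptions yields opposite verdicts:

```python
from fractions import Fraction

def c_star(h, e, m_star):
    """c* from m*-values indexed by state description."""
    m = lambda s: sum(m_star[i] for i in s)
    return m(h & e) / m(e)

# In each two-constant language, m* assigns 1/3, 1/6, 1/6, 1/3 to the
# four state descriptions (a singleton structure description, a pair,
# and a singleton). What changes between languages is which propositions
# those state descriptions express.
m_star_weights = [Fraction(1, 3), Fraction(1, 6), Fraction(1, 6), Fraction(1, 3)]

# Original language: 0 = Ga&Gb, 1 = Ga&~Gb, 2 = ~Ga&Gb, 3 = ~Ga&~Gb.
Ga, Gb = {0, 1}, {0, 2}
# Grue language (assumed mapping: Ha matches Ga, Hb matches ~Gb): its
# state descriptions 0 = Ha&Hb, 1 = Ha&~Hb, 2 = ~Ha&Hb, 3 = ~Ha&~Hb,
# so "a is green" is {0, 1} and "b is green" is {1, 3}.
Ga_in_grue, Gb_in_grue = {0, 1}, {1, 3}

print(c_star(Gb, Ga, m_star_weights))                  # 2/3
print(c_star(Gb_in_grue, Ga_in_grue, m_star_weights))  # 1/3 — reversed
```

The same evidential proposition and the same hypothesis receive different 𝔠*-values solely because the state-description sharing changes between the two descriptions.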
Goodman reverses the confirmation relations among propositions indicated by a theory of evidential favoring by re-expressing the same propositions in a new language. But we can also make those favoring relations disappear entirely. Consider a new language ℒ₃. ℒ₃ has one constant, o, which names the ordered pair consisting of first the emerald named by a and then the emerald named by b in ℒ₁. ℒ₃ has two predicates. Fx obtains when the first element of the ordered pair named by x is green. Sx obtains when the two objects in the ordered pair are the same with respect to greenness—they are either both green or they are both not. Since there is only one constant in this new language, permuting constants does not turn any state description into another; each state description has its own structure description. Table 3 displays the resulting 𝔪*-values for ℒ₃.
Again, each state description of ℒ₁ has a synonym in ℒ₃, so every proposition expressible in ℒ₁ is expressible in ℒ₃. And now we have

𝔠*((Fo ∧ So) ∨ (¬Fo ∧ ¬So), Fo) = 1/2 = 𝔠*((Fo ∧ ¬So) ∨ (¬Fo ∧ So), Fo)    (6)

(Fo ∧ So) ∨ (¬Fo ∧ ¬So) expresses the same proposition as Gb did in language ℒ₁, while (Fo ∧ ¬So) ∨ (¬Fo ∧ So) is a synonym for ¬Gb. In ℒ₃, the favoring relations we found earlier disappear. ℒ₂ and ℒ₃ reveal that the facts 𝔠* relies upon—facts about which propositions are expressed by sentences that share structure descriptions with others—are artifacts of language choice.
3. Alternative Approaches to Favoring
Evidential favoring is a relation among propositions—evidence and hypotheses. With only a few exceptions (which we’ll discuss later), favoring relations among propositions should turn out the same regardless of which language the propositions are expressed in. Yet Carnap’s formal analyses of confirmation yield different favoring judgments for the same propositions when those propositions are expressed in different languages.
Once the possibility of a language like ℒ₃—a language in which all the objects are referred to via a single n-tuple name—has come up, it may seem like any formal Carnap-style theory of favoring is doomed. But some features of evidence and hypotheses are invariant among all the languages we have considered, even ℒ₃. Consider again the situation in which our evidence e is that emerald a is green, our first hypothesis h₁ is that emerald b is green, and our second hypothesis h₂ is that emerald b is not. In each of the three languages we have seen, that evidence and those hypotheses are each expressed by a disjunction of two state descriptions. For example, in ℒ₃ their disjunctive normal forms are:

e: (Fo ∧ So) ∨ (Fo ∧ ¬So)
h₁: (Fo ∧ So) ∨ (¬Fo ∧ ¬So)
h₂: (Fo ∧ ¬So) ∨ (¬Fo ∧ So)

Notice also that each hypothesis shares one state description with the evidence but no state descriptions with the other hypothesis. That remains true across ℒ₁, ℒ₂, and ℒ₃.
We could imagine a formal confirmation theory that works with these facts about state-description counts and state-description sharing. For instance, we can invent a “Proportional Theory” that counts what proportion of its state descriptions a hypothesis shares with the evidence, then favors a hypothesis that has a higher proportion of shared state descriptions over one that has a lower proportion. In the example at hand each hypothesis would have a shared proportion of 1/2, so neither h₁ nor h₂ would be favored by e over the other. So the Proportional Theory fails to yield intuitively plausible favoring judgments about learning from experience—but it can still serve as our toy example of a theory that gives consistent favoring results across the transition from ℒ₁ to ℒ₂ and ℒ₃. It appears at least possible that with a bit of work something in the Proportional Theory’s neighborhood could yield plausible favoring results.
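The Proportional Theory’s verdict is a one-line computation. In this quick Python sketch, the index convention for the four state descriptions is illustrative; what matters is only which state descriptions are shared:

```python
from fractions import Fraction

def shared_proportion(h, e):
    """Proportion of h's state descriptions that also appear in e's DNF."""
    return Fraction(len(h & e), len(h))

# Evidence and hypotheses as sets of state-description indices; in each
# of the three languages, each relatum is a disjunction of two state
# descriptions, and each hypothesis shares exactly one with the evidence.
e, h1, h2 = {0, 1}, {0, 3}, {1, 2}

print(shared_proportion(h1, e))  # 1/2
print(shared_proportion(h2, e))  # 1/2 — no favoring either way
```

Because the computation consumes only counts of shared state descriptions, it returns the same verdict however the underlying propositions are re-described, so long as those counts survive the translation.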
But this appearance is illusory. In Section 5 and Section 6 I will describe a general proof that rules out all formal theories of evidential favoring. Starting with very general conditions on what any formal favoring theory would have to achieve, I will show that even if a theory yields consistent indications of favoring for a set of evidence and hypotheses across ℒ₁ through ℒ₃, those indications disappear when the relata are expressed in yet another language. After describing the proof, I explain in Section 7 what we can learn from it about the underlying nature of evidential favoring.
Possessing such a general proof is also important for historical reasons. Despite the positive results 𝔠* yields for simple cases of enumerative induction, Carnap was dissatisfied with that confirmation function’s inability to properly model what he called “arguments by analogy”. So Carnap produced and refined a number of successors to 𝔠*. Meanwhile, Jaynes ([7] and [8]) developed a rival confirmation approach based on maximizing entropy in probability distributions. Other authors, such as Maher [9], have since tried to develop further formal theories of evidential favoring. While presenting Carnap’s first steps is useful for illustrative purposes, there’s no need to work through these further proposals, because they all run into the same problem Carnap did: language dependence. The result I present in the next few sections reveals that this is not a coincidence—any formal theory of favoring meeting very general conditions will have language-dependence problems.
4. General Conditions on Evidential Favoring
What, in general, do we know about the evidential favoring relation? It is a relation among propositions, but we must express those propositions as sentences in a language to work with them. So if h₁, h₂, and e represent two hypotheses and a body of evidence in interpreted first-order language ℒ (with a finite number of constants and predicates, and no quantifiers or identity symbol), we will write f(h₁, h₂, e) when the evidence represented by e favors the hypothesis represented by h₁ over the hypothesis represented by h₂. Unlike Carnap, we will not assume that f has anything to do with probabilities or numerical functions, though what we do assume about f will be compatible with that possibility. For example, we will assume that f is antisymmetric relative to a given e—that is, we cannot have both f(h₁, h₂, e) and f(h₂, h₁, e).
We might also think that f should obtain in some clear cases involving entailment relations. We saw one such case in Equation (2): the evidential proposition represented in ℒ₁ by Ga ∧ Gb favors the hypothesis represented by Gb over the hypothesis represented by ¬Gb. But if we want a theory of f to detect these entailment-based favorings, we cannot expect the theory to yield correct results for every conceivable language, because some languages will hide the relevant entailments. For example, a language with a single constant (like the ordered-pair constant in ℒ₃) might express the evidence and pair of hypotheses from Equation (2) with three atomic sentences (say, Ao, Bo, and Co). In that language there would be no way to formally recover the entailment facts in virtue of which the favoring holds.
This example shows that we cannot demand invariance of a confirmation theory’s results over every language capable of expressing the evidence and hypotheses of interest. As we go along, we will consider which kinds of language-independence we want and which are inessential. We will judge this by asking what it would reveal about the underlying evidential favoring relation if a correct theory of that relation were invariant across a particular language type. Since it seems plausible that some evidential favorings arise from entailments, we will require a formal theory to detect evidential favoring only when propositions are represented in faithful languages. While faithfulness is defined precisely in [4], the key condition is that the state descriptions of a faithful language express a set of mutually exclusive, exhaustive propositions. Because of this, two sentences x and y in a faithful language will have x ⊨ y just in case the proposition represented by x entails the proposition represented by y. A faithful language captures in its syntax all the entailment relations among the propositions it represents.
Unfaithful languages go wrong by failing to capture entailment relations among the propositions they represent. But a language may also go wrong by failing to represent some propositions entirely. When we work with scientific hypotheses, a body of evidence may favor one hypothesis over another because it reveals the truth of a prediction made by the former but not the latter. A language that faithfully represents the entailment relations among evidence and hypotheses but lacks a sentence representing that prediction may leave formal theories incapable of detecting the favoring relation that obtains.
We will call a language adequate for a particular set of evidence and hypotheses if it contains sentences representing not only those three propositions but also any other propositions necessary for detecting favoring relations among the three. I have no precise characterization of the conditions a language must meet to be adequate for a particular set of evidence and hypotheses. Luckily, all we need for our result is that the concern for adequacy is a concern about representational paucity. We will suppose that for any evidence and two hypotheses, there is a set of languages adequate for those three relata. We will require formal theories of evidential favoring to yield correct results when applied to adequate languages. And we will assume that if one language is adequate for a set of evidence and hypotheses, and a second language contains synonyms for every sentence of the first, then the second language is adequate for those relata as well.
To this point we have required that the f relation be antisymmetric for a given e and that formal favoring theories detect the relation’s presence when its relata are represented in a language that is adequate for them and faithful. Although I believe various favorings based on entailments (such as the one represented in Equation (2)) hold, we will be able to prove our result without assuming there are any such favorings in the extension of f. We will, however, assume that the extension of f contains something besides entailment-based favorings. This follows Hume’s point in [13, Book I] that if favoring underwrites the inductive inferences that get us through our days, it cannot be restricted to cases in which the evidence either entails or refutes one of the hypotheses.
To be precise: we will say that h₁, h₂, and e in faithful language ℒ are logically independent just in case any conjunction obtainable by inserting a non-negative number of negation symbols into “h₁ ∧ h₂ ∧ e” represents a non-empty (non-contradictory) proposition. If there exists at least one faithful ℒ with logically independent h₁, h₂, and e such that f(h₁, h₂, e), we will say that f is substantive. I take it Hume showed us evidential favoring is substantive.
Finally, we need a condition capturing what it is to say that a theory of the evidential favoring relation is “formal”. Typically, formality means that a theory operates on the structure of sentences without noticing which particular items play which roles in that structure. For example, suppose a theory of evidential favoring said that in ℒ₁, Ga favored Gb over ¬Gb, but Gb did not favor Ga over ¬Ga. These two triples are structurally identical; such a theory would be differentiating between them strictly on the grounds that a appeared in the evidence in the first case while b appeared in the evidence in the second. A formal theory treats constants in a language as interchangeable. It also treats predicates as interchangeable, which is the condition that will play a central role in our proof. We will require f to treat predicate permutations identically, by which we mean that for any language ℒ that is faithful and adequate for h₁, h₂, and e and any permutation π of the predicates of ℒ, f(h₁, h₂, e) entails f(π(h₁), π(h₂), π(e)). (Where π(x) is the sentence that results from replacing each predicate occurrence in x with its image under the permutation π.)
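The permutation operation itself is purely syntactic, which a short sketch makes vivid (the representation of sentences and the predicate names below are my own choices, not the paper’s):

```python
# A state description is a frozenset of signed atoms (predicate, constant,
# truth-value); a sentence in DNF is a frozenset of state descriptions.

def permute(sentence, perm):
    """Replace each predicate occurrence with its image under perm."""
    return frozenset(
        frozenset((perm.get(p, p), c, v) for p, c, v in sd)
        for sd in sentence
    )

Ga_and_Gb = frozenset({frozenset({("G", "a", True), ("G", "b", True)})})
perm = {"G": "H", "H": "G"}  # a permutation swapping predicates G and H

# Applying the permutation rewrites Ga & Gb as Ha & Hb; applying it
# twice recovers the original sentence.
print(permute(Ga_and_Gb, perm) ==
      frozenset({frozenset({("H", "a", True), ("H", "b", True)})}))  # True
print(permute(permute(Ga_and_Gb, perm), perm) == Ga_and_Gb)          # True
```

Note that the operation knows nothing about what the predicates mean; that blindness to interpretation is exactly what the identical-treatment condition exploits.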
5. First Stage of the Proof
With these notions in place, our general result can be stated simply:
General Result: If the evidential favoring relation is antisymmetric and substantive, it does not treat predicate permutations identically.
For the full details of the proof, I refer the reader to [4]. Here I will simply explain how it works, starting with an overview of the proof strategy: Suppose for reductio that evidential favoring is substantive and antisymmetric and treats predicate permutations identically. By f’s substantivity there exist an h₁, h₂, and e in faithful, adequate language ℒ such that these three relata are logically independent and f(h₁, h₂, e). We will construct another faithful, adequate language ℒ* with h₁*, h₂*, and e* representing the same propositions as h₁, h₂, and e respectively. Since f concerns a relation among the propositions expressed by sentences, we will have f(h₁*, h₂*, e*). Moreover, ℒ* will be constructed so as to make available a predicate permutation π such that π(h₁*) = h₂*, π(h₂*) = h₁*, and π(e*) = e*. Since f treats predicate permutations identically, f(h₂*, h₁*, e*). But that violates f’s antisymmetry, yielding a contradiction.
The difficult part of this proof is demonstrating that the required ℒ*, h₁*, h₂*, e*, and π can be constructed in the general case. That generalization proceeds in two stages. The first stage begins by noting that while ℒ might contain multi-place predicates and any (finite) number of constants, given any faithful, adequate ℒ we can always find another language that is faithful and adequate for the propositions expressed by h₁, h₂, and e but contains a single constant and only single-place predicates. We do this in much the same way we moved from language ℒ₁ to language ℒ₃ earlier: we make the new language’s constant represent a tuple of the objects represented by the constants of ℒ, then make the new language’s predicates represent properties of objects at particular places in the tuple (“the first object in the tuple is green”) or relations among those objects (“the objects in the tuple match with respect to greenness”). Since whenever there is a faithful, adequate language representing h₁, h₂, and e there is also a faithful, adequate language representing the same propositions with only one constant and single-place predicates, we will assume without loss of generality that our original language ℒ has only one constant, a, and only single-place predicates.
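The reduction to a single constant can be sketched mechanically; the names o, G1, and G2 below are illustrative, not the paper’s:

```python
from itertools import product

# Sketch of the reduction: each atom built from a predicate and a constant
# of the old language becomes its own one-place predicate of a new language
# whose single constant o names the tuple of the old objects.

def tupled_atoms(predicates, constants):
    """One new single-place predicate per (predicate, constant) atom."""
    return [f"{p}{i}" for p in predicates for i, _ in enumerate(constants, 1)]

atoms = tupled_atoms(["G"], ["a", "b"])
print(atoms)  # ['G1', 'G2']

# The old and new languages induce the same number of state descriptions,
# so the translation loses no distinctions among full states of the world:
old_sds = list(product([True, False], repeat=2))           # over Ga, Gb
new_sds = list(product([True, False], repeat=len(atoms)))  # over G1o, G2o
print(len(old_sds) == len(new_sds))  # True
```

The state descriptions correspond one to one, which is why faithfulness and adequacy survive the move.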
Having made this assumption, let’s fix a case in our minds by imagining that besides its one constant a, ℒ has five predicates, P₁ through P₅—it does not matter which particular properties they represent. We can then refer to the 32 state descriptions of ℒ as s₁ through s₃₂. And to further fix a particular case, we can imagine specific disjunctive normal form equivalents for h₁, h₂, and e, chosen so that these three relata are logically independent as required.
We will now construct language ℒ* with one constant, a, representing the same tuple as it represents in ℒ. Like ℒ, ℒ* will have five predicates—they will be Q₁, Q₂, Q₃, Q₄, and Q₅. Like ℒ, ℒ* will be an interpreted language; the key to our proof will be how we assign meanings to the sentences of ℒ*. ℒ* will be faithful, and will be capable of expressing all the propositions expressed by sentences of ℒ. So, for instance, ℒ* will contain an h₁*, h₂*, and e* that are synonyms for h₁, h₂, and e respectively. Because ℒ* expresses every proposition expressible in ℒ, and because we have assumed ℒ is adequate for h₁, h₂, and e, ℒ* will be adequate for h₁*, h₂*, and e*. But ℒ* is designed so that a predicate permutation π that maps each of Q₁, Q₂, and Q₃ to itself and interchanges Q₄ with Q₅ will map h₁* to h₂*, h₂* to h₁*, and e* to itself.
To construct ℒ*, we begin with the disjunctive normal form equivalents of h₁, h₂, and e in ℒ. These sentences have some state descriptions in common and some distinct. A state description that appears, for instance, in the disjunctive normal forms of h₁ and h₂ but not of e will be referred to as an “h₁h₂-sd”. (A state description that does not appear in any of our three disjunctive normal forms will be a “∅-sd”.) The basic strategy for setting up ℒ* will be to make each state description of ℒ the synonym of a different state description of ℒ*. (Once meanings are assigned to the state descriptions of ℒ*, the meanings of other ℒ* sentences will be built up by interpreting logical connectives in the usual way.) This means that an h₁h₂-sd of ℒ, for instance, will have a synonym state description in ℒ* occurring in the disjunctive normal forms of h₁* and h₂* but not e*. We will label that state description of ℒ* an “h₁*h₂*-sd”.
We will achieve the permutations we want by carefully selecting which propositions are expressed by which state descriptions in ℒ*. For example, a proposition expressed by an h₁-sd in ℒ will be expressed by an h₁*-sd in ℒ*. But we get to select which ℒ* state description plays that role, so we will select one that is mapped by π to an h₂*-sd. More generally, we will assign state descriptions of ℒ* to propositions so that π maps state-description types to each other as described in Table 4. (The rows of Table 4 have been numbered for reference later.)
Notice how these mappings have been set up. Once we know which state descriptions express which propositions, we can determine the disjunctive normal form equivalent of h₁*. When π is applied to that disjunctive normal form equivalent, it will replace each h₁*-sd with an h₂*-sd and each h₁*e*-sd with an h₂*e*-sd. In other words, it will replace each state description that appears in h₁* but not h₂* with a state description that appears in h₂* but not h₁*. As a result, applying π converts h₁* into h₂*. Similarly, if the mappings in Table 4 hold, π will convert h₂* to h₁* while leaving e* unchanged.
How do we assign state descriptions to propositions so as to achieve the mappings described in Table 4? For the specific example we have been working with, the state-description assignments in Table 5 will do the job. Here I have indicated state descriptions schematically, leaving out the “a”s and replacing negations with overbars. The table indicates, for each ℒ state description appearing in h₁, h₂, or e, which ℒ* state description expresses the same proposition. The positions in Table 5 match the positions in Table 4: where Table 4 shows an h₁*-sd, Table 5 assigns the synonym of an h₁-sd; where it shows an h₂*-sd, the synonym of an h₂-sd; where it shows an e*-sd, the synonym of an e-sd; etc. (Table 5 does not assign state descriptions for ∅-sds because those assignments turn out not to matter, as long as each ∅-sd receives an ℒ* equivalent that has not been assigned to any other state description of ℒ.) Assignments in the rows of Table 5 ensure that the mappings in the same-numbered rows of Table 4 hold. In row (i), for example, when π swaps Q₄ and Q₅, the h₁*-sd in that row is exchanged with the h₂*-sd it is paired with.
The assignments in Table 5 employ a system that can be extended to the general case. First, using guidance from Table 4, we have divided the ℒ state descriptions appearing in h₁, h₂, and/or e into pairs and singletons, depending on whether π will map the relevant state description’s ℒ* synonym to another state description or to itself. The pairs and singletons have been aligned on the individual rows. Each row is assigned a unique binary code using the predicates Q₁ through Q₃: row (i) gets one code, row (ii) gets another, etc. We then determine the Q₄s and Q₅s. If one state description is to be mapped onto another (and vice versa) by π, the first member of the pair affirms Q₄ and denies Q₅, while the description it will be mapped onto denies Q₄ and affirms Q₅. The singletons, meanwhile, affirm both Q₄ and Q₅ so that π will map them to themselves.
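The coding scheme can be sketched in a few lines. Here I assume, purely for concreteness, that the constructed language’s five predicates are Q1 through Q5, with Q4 and Q5 the pair interchanged by the permutation:

```python
def row_state_descriptions(code, paired):
    """State descriptions for one row, as tuples of five truth-values
    (Q1, Q2, Q3, Q4, Q5): the row's binary code on Q1-Q3, then the
    Q4/Q5 pattern. Paired rows get two descriptions differing only on
    Q4/Q5; singleton rows get one description affirming both."""
    if paired:
        return [code + (True, False), code + (False, True)]
    return [code + (True, True)]

def pi(sd):
    """The predicate permutation: swap Q4 and Q5."""
    q1, q2, q3, q4, q5 = sd
    return (q1, q2, q3, q5, q4)

pair = row_state_descriptions((True, False, False), paired=True)
singleton = row_state_descriptions((False, True, False), paired=False)[0]

print(pi(pair[0]) == pair[1])      # True: pi exchanges the members of a pair
print(pi(singleton) == singleton)  # True: pi maps singletons to themselves
```

Because each row’s code on Q1–Q3 is unique and untouched by the swap, the permutation never moves a state description out of its row; within a row it does exactly the pairing the proof requires.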
With all this work done, π maps h₁* to h₂* and vice versa while leaving e* intact. And this was our goal: we started with f(h₁, h₂, e), so we have f(h₁*, h₂*, e*). We also have h₁*, h₂*, and e* in faithful, adequate language ℒ*. So by the identical treatment of predicate permutations, f(h₂*, h₁*, e*). Together with f’s antisymmetry this yields our contradiction.
6. Second Stage of the Proof
Hopefully I have described the strategy for this single case so that the reader can generalize to similar cases. Unfortunately, though, there is a class of cases to which the strategy (as so far described) will not generalize. Suppose our h₁, h₂, and e in faithful, adequate ℒ (with five predicates) have disjunctive normal form equivalents exactly as before, except that h₂’s disjunctive normal form now contains one extra disjunct.
If we pursue the strategy of the previous section, each of ℒ’s state descriptions will have a state-description synonym in ℒ*. But even if we are clever and set things up so that each h₁-sd’s synonym is mapped by π to the synonym of some h₂-sd, there will be nothing left in h₁* for π to map the synonym of the extra h₂-sd onto. So π will not map h₂* onto h₁* as desired.
The problem is that in our new case the numbers of each type of state description do not line up in the right way. Looking at Table 4, we can see that in order for our mapping scheme in ℒ* to work, we need the following equalities met:

the number of h₁*-sds = the number of h₂*-sds
the number of h₁*e*-sds = the number of h₂*e*-sds

(Notice from Table 4 that the number of e*-, h₁*h₂*-, h₁*h₂*e*-, and ∅-sds is irrelevant, as they are mapped by π to themselves.) As our strategy stands, the state descriptions of ℒ match up one-to-one with state descriptions of ℒ*, so unstarred versions of those equalities must be met as well. But the first (unstarred) equality is not met by the h₁ and h₂ under consideration; in this case there are more h₂-sds than h₁-sds.
And this case is important, because it represents a favoring instance under the Proportional Theory discussed in Section 3. Recall that under the Proportional Theory, one hypothesis is favored over another if a greater proportion of its state descriptions is shared with the evidence. In the case at hand, h₁ shares half the state descriptions in its disjunctive normal form with e, while h₂ shares a smaller proportion. So according to the Proportional Theory, f(h₁, h₂, e). Moreover, every case in which the Proportional Theory indicates that one hypothesis is favored over another by evidence will violate the unstarred version of at least one of the equalities above. So for our proof to rule out the Proportional Theory as a viable theory of evidential favoring, it must apply to cases in which the unstarred version of at least one of these equalities fails.
In the present case there is one more h₂-sd than there are h₁-sds, but for our permutation mapping in ℒ* to work out we need the number of h₂*-sds to match the number of h₁*-sds. To make this happen, we must break the one-to-one mapping between propositions expressed by state descriptions of ℒ and propositions expressed by state descriptions of ℒ*. (Breaking the one-to-one mapping allows the starred equalities to hold when their unstarred analogs do not.) In particular, if the proposition expressed by some h₁-sd—a member of h₁’s disjunctive normal form but not h₂’s—were expressed by the disjunction of two state descriptions in ℒ*, the first of these state descriptions could be mapped by π to the synonym of one h₂-sd while the other could be mapped to the synonym of another. “Splitting” an h₁-sd into two state descriptions in ℒ* would increase the number of h₁*-sds by one while leaving the number of h₂*-sds unchanged. This would bring the number of h₁*-sds and h₂*-sds into the required alignment.
The trick is to take the proposition expressed by that h₁-sd in ℒ and express it as the disjunction of two state descriptions in ℒ*, while leaving ℒ* adequate and faithful and matching all the other state descriptions that participate in h₁, h₂, or e to state descriptions of ℒ* one-to-one (so as not to alter the count of any sd-type besides the h₁*-sds). We will achieve this using a technique I call “explode and gather”. This technique involves introducing two intermediary languages, ℒ⁺ and ℒ⁺⁺, through which we move from ℒ to ℒ*. Instead of assigning the state descriptions of ℒ synonyms directly in ℒ*, we will assign them synonyms in ℒ⁺. Sentences in ℒ⁺ will in turn receive synonyms in ℒ⁺⁺, which will finally receive synonyms in ℒ*. The net result will be the representation in ℒ* of every proposition expressed by a state description of ℒ, but the number of h₁*-sds in ℒ* will be one greater than the number of h₁-sds in ℒ.
To get a rough idea of how “explode and gather” works, imagine we had a language whose one constant p referred to a pig and whose one predicate N indicated that the pig was north of the barn. Our h₁ might be Np and our h₂ might be ¬Np. We would then introduce a second language that contained the constant p, the predicate N, and a predicate W indicating that the pig was west of the barn. This is the “explosion” phase; while each of our hypotheses was represented by a single state description in the original language, each is now represented by a disjunction of two state descriptions in the new one. (Np, for instance, has become (Np ∧ Wp) ∨ (Np ∧ ¬Wp).) Notice that just like the state descriptions of the original language, the state descriptions of the exploded language express a set of mutually exclusive, exhaustive propositions; the exploded language is faithful just like the original.
Now we “gather” with a new language. This third language has three state descriptions. One is a synonym for Np & Wp, one is a synonym for Np & ∼Wp, and the last is a synonym for ∼Np. The third language cannot express every proposition expressible in the second, but it can express every proposition expressible in the first, so it inherits its adequacy from that language. Also, the state descriptions of the third language express a set of mutually exclusive, exhaustive propositions, so it will be faithful as well. And in the third language, the synonym of ∼Np is a single state description while the synonym of Np is a disjunction of two. The third language has one more state description than the first, and the disjunct count of Np has gone from one to two while the disjunct count of ∼Np has stayed at one.
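The pig example can be mirrored with a small computation. The sketch below is a toy model in my own notation (the predicate letters N and W and the set-of-worlds representation are illustrative assumptions, not Carnap’s formalism): each proposition is modeled as a set of possible worlds, and we count the state-description disjuncts in each hypothesis’s disjunctive normal form before and after “explode and gather”.

```python
from itertools import product

# Four possible worlds, one per truth-value assignment to the two
# underlying properties: N ("the pig is north of the barn") and
# W ("the pig is west of the barn").
worlds = [dict(zip("NW", vals)) for vals in product([True, False], repeat=2)]

def sds(predicates):
    """Model each state description of a language as the set of world
    indices at which it is true (i.e., the proposition it expresses)."""
    out = []
    for signs in product([True, False], repeat=len(predicates)):
        out.append(frozenset(i for i, w in enumerate(worlds)
                             if all(w[p] == s for p, s in zip(predicates, signs))))
    return out

def dnf_count(prop, language_sds):
    """Number of disjuncts in prop's disjunctive normal form: the
    (non-empty) state descriptions that entail prop."""
    return sum(1 for sd in language_sds if sd and sd <= prop)

h1 = frozenset(i for i, w in enumerate(worlds) if w["N"])       # Np
h2 = frozenset(i for i, w in enumerate(worlds) if not w["N"])   # ~Np

L_orig = sds(["N"])        # original language: one predicate
L_expl = sds(["N", "W"])   # "exploded" language: N and W

print(dnf_count(h1, L_orig), dnf_count(h2, L_orig))  # 1 1
print(dnf_count(h1, L_expl), dnf_count(h2, L_expl))  # 2 2

# "Gather": a three-state-description language whose sds express
# Np & Wp, Np & ~Wp, and ~Np respectively.
npwp  = frozenset(i for i, w in enumerate(worlds) if w["N"] and w["W"])
npnwp = frozenset(i for i, w in enumerate(worlds) if w["N"] and not w["W"])
L_gath = [npwp, npnwp, h2]

print(dnf_count(h1, L_gath), dnf_count(h2, L_gath))  # 2 1
```

The final line captures the point of the example: after gathering, the same proposition Np is expressed by two state descriptions while ∼Np is expressed by one, even though every language involved partitions the same space of possibilities.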
This example, while hopefully suggestive, cannot actually be carried through in detail—for one thing, a standard first-order language cannot contain exactly three state descriptions. But now that we have the broad idea down, we can implement the technical details of “explode and gather” for the example we have been considering.
We begin by constructing L′. L′ has the same single constant, a, as L, representing the same tuple as in L. L′ contains every predicate in L, representing the same properties as they represent in L. But L′ has two additional predicates, representing properties not already represented by any of the predicates of L. (Their introduction is analogous to the introduction of the west-of-the-barn predicate in the pig example.)^19 Notice that under this construction each state description of L will have a synonym in L′. But the L′ synonym of an L state description will not be a state description of L′. Instead, it will be a disjunction of four state descriptions: one that looks just like the original but also affirms both new predicates of a, another that looks just like the original but affirms the first new predicate of a while denying the second, etc. Thus we have “exploded” each state description of L into multiple state descriptions of L′.
Now we “gather” the state descriptions of L′ into the language L′′. L′′ has the same constant as L and L′, representing the same tuple. But L′′ has only one more predicate than L. Instead of saying what the predicates of L′′ represent, I will describe how we assign L′′ state descriptions as synonyms of sentences in L′. First, assign distinct L′′ state descriptions as synonyms for the L′ sentences expressing state descriptions of L—except for the L′ sentence expressing S and the state descriptions of L′ that compose it. (Recall that S is the state description we’re trying to “split” in two.) S, like all the state descriptions of L, has been exploded into four “parts” in L′. Take one of those parts and give it a synonym that is a state description in L′′. Then take the other three parts and give their disjunction another state description synonym in L′′.^20
The resulting L′′ will not be able to express every proposition expressible in L′. But it will be able to express every proposition expressible in L, because each L state description has a synonym in L′′. And since L is adequate for h1, h2, and e, L′′ will be adequate for the propositions expressed by those relata as well.^21 Moreover, each state description of L appearing in h1, h2, or e has a state description synonym in L′′—except for S. The synonym of S is a disjunction of two state descriptions of L′′. Since they compose the synonym of S, and S is a disjunct of h1, both of these state descriptions will be disjuncts of h1’s expression in L′′. So while there was only one h1-sd in L, there will be two h1-sds in L′′. Thus the disjunct-type counts in L′′ will satisfy our earlier equalities (with primes taking the place of stars). That means we can construct L′′′ from L′′ with a one-to-one mapping between state descriptions, cleverly assigning meanings to state descriptions of L′′′ as described in Section 5. L′′′ will be adequate and faithful, and the permutation that exchanges the relevant predicates of L′′′ while leaving the others intact will effect exactly the mappings described in Table 4. So our proof will go through as before.
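The role the matched counts play in the permutation step can be seen in miniature. The sketch below is a toy model in my own notation (the predicates P0, P1, P2 and the sign-tuple encoding are illustrative assumptions): a predicate permutation induces a bijection on state descriptions, so it can carry one hypothesis’s disjunctive normal form onto another’s only when the two forms contain equally many disjuncts.

```python
from itertools import product

# State descriptions of a one-constant, three-predicate language,
# modeled as sign tuples: position i gives the sign of predicate Pi.
sds = list(product([True, False], repeat=3))

def swap01(sd):
    """Map on state descriptions induced by the predicate permutation
    exchanging P0 and P1 while leaving P2 intact."""
    return (sd[1], sd[0], sd[2])

# The induced map merely permutes the state descriptions (a bijection).
assert sorted(map(swap01, sds)) == sorted(sds)

# h1 = "P0 a" and h2 = "P1 a": each DNF collects the state
# descriptions affirming the relevant predicate; both have four.
dnf_h1 = sorted(sd for sd in sds if sd[0])
dnf_h2 = sorted(sd for sd in sds if sd[1])
assert len(dnf_h1) == len(dnf_h2) == 4

# Because the counts match, swap01 carries h1's DNF exactly onto
# h2's -- the alignment the "explode and gather" construction secures.
assert sorted(map(swap01, dnf_h1)) == dnf_h2
```

Since a bijection preserves cardinality, no predicate permutation can map a one-disjunct DNF onto a two-disjunct DNF; that is why the counts must first be brought into balance.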
Let’s take a step back and assess. The Proportional Theory and theories like it work by counting the number of disjuncts in hypotheses’ disjunctive normal forms and the number of disjuncts those hypotheses share with the evidence. Yet these count facts are artifacts of the language in which we choose to express our evidence and hypotheses. In this section I have worked through a particular example in which “explode and gather” takes us from one faithful, adequate language to another faithful, adequate language in which the disjunct counts of sentences expressing the same propositions have changed. No matter what logically independent evidence and hypotheses we are presented with, the explode and gather technique can be applied (multiple times if necessary) to bring disjunct counts into the balance we need to apply the proof maneuver from Section 5. Accurate theories of evidential favoring yield the same favoring judgments across all faithful, adequate languages, and if they are formal they do so while treating predicate permutations identically. But given logically independent relata we will always be able to construct a faithful, adequate language in which the hypotheses are predicate permutation variants, so that a formal theory will detect no favoring between them. This means that a formal theory will never find that a body of evidence differentially favors logically independent hypotheses. So if it can be captured by a formal theory, evidential favoring is not substantive.
One final note about the proof: One might wonder why h1, h2, and e have to be logically independent for our argument to go through. The “explode and gather” technique increases the number of h1-sds (say) by “splitting” an existing h1-sd into two h1-sds. This works only if there is an h1-sd to begin with. If h1, h2, and e were arranged so that there were some h2-sds but no h1-sds, we would not be able to split h1-sds to make their number equal the h2-sds’. Logical independence of h1, h2, and e guarantees that there is at least one state description of each type, avoiding situations that would derail the proof. But once we have seen that proof, it is clear that the logical independence requirement could be relaxed. For example, we could run the argument for an h1, h2, and e with no h1-sds as long as there were no h2-sds either. More generally, the relata we work with have to make some sd-types available, and if they make one sd-type on a row of Table 4 available they have to make any other types on that row available as well. This means that we could, for instance, accommodate positions on which evidential favoring only ever occurs between mutually exclusive hypotheses.^22 In a faithful language two mutually exclusive hypotheses h1 and h2 have no state descriptions that are disjuncts of both (whether or not those state descriptions are also disjuncts of e). But eliminating those rows from Table 4 would not disrupt our mapping scheme. So while I will continue to assume logical independence among the relata for simplicity’s sake, it is significant that this requirement could be relaxed.
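The role of logical independence can also be illustrated with a toy count. In the sketch below (my own notation: an “sd-type” records which of the relata a state description is a disjunct of; the atomic sentences A, B, C are illustrative assumptions), logically independent relata populate every sd-type, while mutually exclusive hypotheses empty only the types that require a state description to fall under both hypotheses at once.

```python
from itertools import product
from collections import Counter

# Worlds settle three independent atomic sentences A, B, C; in a
# faithful language each world corresponds to one state description.
worlds = list(product([True, False], repeat=3))

def sd_types(h1, h2, e):
    """Count state descriptions by type: the pattern of which of
    h1, h2, and e each state description is a disjunct of."""
    return Counter((w in h1, w in h2, w in e) for w in worlds)

# Logically independent relata (h1 = A, h2 = B, e = C):
h1 = {w for w in worlds if w[0]}
h2 = {w for w in worlds if w[1]}
e  = {w for w in worlds if w[2]}
assert len(sd_types(h1, h2, e)) == 8   # every sd-type is represented

# Mutually exclusive hypotheses (h1' = A & not-B, h2' = not-A & B):
# the "both hypotheses" types vanish, but the paired types that the
# permutation must exchange remain balanced.
h1x = {w for w in worlds if w[0] and not w[1]}
h2x = {w for w in worlds if not w[0] and w[1]}
types = sd_types(h1x, h2x, e)
assert all(not (in1 and in2) for (in1, in2, _) in types)
assert types[(True, False, True)] == types[(False, True, True)]
assert types[(True, False, False)] == types[(False, True, False)]
```

The first assertion reflects the guarantee logical independence provides; the later ones reflect why mutual exclusivity merely deletes rows rather than unbalancing them.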
7. The Favoring Relation Itself
Even if the logical independence requirement were relaxed, our general result would still have shortcomings. For example, it applies only to evidence and hypotheses adequately expressible in a first-order language with no quantifiers or identity symbol. In [4, Appendix B] I argue that this is not as serious a limitation as it might seem. Certainly most of the evidential favoring on which we rely in daily life, scientific research, etc. involves more complicated logical structures than this. But notice that all we need to get our proof going is one instance of logically independent evidence and hypotheses that are first-order expressible. Moreover, these syntactical limitations apply only to the evidence and hypotheses, not to the processes used by theories that detect evidential favoring. Even if an entropy calculation or something more mathematically complex is used to determine the extension of f, our result will apply as long as at least one triple in that extension consists of sentences as simple as “This emerald is green” and “The next one will be too.”
Our result may also seem aimed at a project most philosophers have abandoned; very few people still maintain that evidential favoring can be ferreted out by formal means. So let’s stop thinking about formal theories of favoring, and start thinking about the evidential favoring relation itself. Most of the conditions we imposed in Section 4 concerned that relation directly; the only one motivated by the prospect of formal theorizing was identical treatment of predicate permutations. But the identical treatment of predicate permutations is significant even when we set formal theories aside. If f fails to treat predicate permutations identically, there exist hypotheses and evidence in adequate, faithful languages whose favoring relations disappear when their predicates are swapped around. Since predicates represent properties, a favoring relation that treats predicate permutations non-identically behaves differently towards propositions involving some properties than it does towards otherwise-identical propositions involving different properties. A favoring relation that does not treat predicate permutations identically plays favorites among properties. Our general result says that if evidential favoring is antisymmetric and substantive, it must play favorites.
The standard lesson drawn from language dependence results such as Goodman’s “grue” problem is that evidential favoring privileges some properties over others. Goodman himself [6, Lecture IV] tried to distinguish “projectible” properties from nonprojectible ones; later philosophers such as Quine [17] attempted to identify “natural properties” that play a special role in evidential favoring. If there are such special properties, favoring theories need not yield consistent results across faithful, adequate languages; such theories (formal or not) may restrict their attention to languages whose predicates represent the privileged properties. If greenness, for example, is a natural property, then Carnap’s 𝔠 function should be applied only to a language whose predicates represent greenness and its fellow natural properties—not to its grue-style rivals—and the intuitive confirmation result of Equation (4) stands.
This standard response leaves an epistemic problem: How can agents determine which properties are natural? The obvious response is that natural properties are revealed by evidence from the natural world. But here the generality of our result kicks in. Imagine some multi-stage process agents could apply: first, the process would use an agent’s empirical evidence to determine a list of natural properties. Next, the process would sort languages whose predicates represented natural properties from those whose predicates did not. Finally, the process would work within the set of preferred languages to determine which hypotheses were favored over others by the agent’s evidence. Now consider a relation f′ that captures the net effect of this entire process: f′ holds among h1, e, and h2 just in case the natural property list generated by the evidence represented by e yields a favoring relation on which e favors the hypothesis represented by h1 over the hypothesis represented by h2.
Now f′ is just a relation, so the process that takes agents from the inputted e to an ultimate favoring judgment between h1 and h2 need not proceed in the sequence just described. The key point is that however it is generated, f′ should be antisymmetric and had better be substantive, so that favoring relations may obtain among hypotheses neither entailed nor refuted by the evidence. And so (substituting f′ for f) our general result applies: f′ will fail to treat predicate permutations identically. That is, antecedent to the introduction of a particular body of evidence e, f′ will already prefer some properties over others. Anyone who tries to work out the natural properties from a body of empirical evidence will need a preferred property list before that evidence is even consulted.
Apparently substantive favoring among hypotheses arises from something more than just evidence; it requires an extra-evidential element to select among properties. As far as the evidence and hypotheses themselves are concerned—considered alone as propositions, without any further information or influences—any two logically independent hypotheses are related to a given body of evidence in the same way. The appearance of asymmetry among these propositions, the sense that the evidence has more in common with one hypothesis or favors it over the other, is an artifact of the language in which the propositions are expressed. And the propositions themselves cannot tell you which language is best.^23
This leaves a few options in the theory of evidential favoring. First, one can be an externalist: One can insist that there are facts in the world about which properties are natural, or projectible—it is just that those facts cannot be discerned by agents from their evidence. While there are favoring relations among hypotheses and evidence, agents cannot ever know that they have got those relations right. [4] presents arguments against this option; I will simply say here that I find favoring facts inaccessible to agents highly unattractive. Second, one can hold that the preferred property list need not be discerned from empirical evidence because it can be determined a priori. Besides requiring a very strong view of our a priori faculties, this response contradicts the positive theories philosophers have offered of what makes projectible properties projectible. The standard theory (from [18], among others) is that projectible properties recur regularly because they play a particular role in the natural laws of the universe, but such laws must be gleaned from empirical data.^24
The remaining option maintains that the element privileging some properties over others comes from agents, and so is accessible to them and allows them to determine when favoring holds. This “subjectivist” option denies that there is a three-place evidential favoring relation (among two hypotheses and a body of evidence) at all; it relativizes favoring to a fourth relatum that is a feature of subjects. Perhaps agents grow up speaking a language whose predicates express certain properties; perhaps agents evolved to think using certain categories; perhaps for some reason agents have a prior disposition to project some properties more readily than others. Wherever their preferred property lists come from, subjects with different lists may not be able to adjudicate disputes between those lists (and the favoring relations that attend them) by citing evidence or a priori considerations. While this approach avoids the problems of the other options—it permits substantive evidential favoring, posits no unrealistic a priori faculties, and allows agents access to favoring facts—it may require us to radically rethink what we are doing when we ask which of two hypotheses is favored by our evidence.