#### **2. What Is Information?**

The term "information" is used with a wide variety of meanings [10,26,27]. There is the Shannon notion of information, which is meant to measure an amount of information and is quite divorced from semantics. There is also an algorithmic notion of information that captures a notion of complexity and originates in the work of Solomonoff, Kolmogorov, and Chaitin [26]; there is a related notion of entropy as a minimum description length [28]. Furthermore, in the general context of the thermodynamics of computation, it is said that "information is physical" because systems "carry" or "contain" information about their own physical state [29–31] (see also [32,33]).

Here, we follow a different path [3,4]. We seek an epistemic notion of information that is closer to the everyday colloquial use of the term—roughly, information is what we request when we ask a question. In a Bayesian framework, this requires an explicit account of the relation between information and the beliefs of ideally rational agents. We emphasize that our concern here is with *idealized rational agents*. Our subject is not the psychology of actual humans who often change their beliefs by processes that are neither fully rational nor fully conscious. We adopt a Bayesian interpretation of probability as a degree of credibility: the degree to which we *ought* to believe that a proposition is true if only we were ideally rational. For a discussion of a decision theory that might be relevant to the economics and psychology of partially rational agents see [34–36]. An entropic framework for modelling economies that bypasses all issues of bounded rationality is described in [37].

It is implicit in the recognition that most of our beliefs are held on the basis of incomplete information that not all probability assignments are equally good; some beliefs are preferable to others in the very pragmatic sense that they enhance our chances to successfully navigate this world. Thus, a theory of probability demands a theory of updating probabilities in order to improve our beliefs.
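The updating of probabilities mentioned above can be made concrete with a minimal sketch of Bayes' rule. The two-hypothesis coin problem below is purely illustrative and not taken from the text: an agent entertains a fair coin and a biased coin as hypotheses, observes one head, and revises its beliefs so that the posterior is proportional to prior times likelihood.

```python
from fractions import Fraction

# Illustrative prior beliefs over two hypotheses about a coin:
# "fair" has P(heads) = 1/2, "biased" has P(heads) = 3/4.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

def bayes_update(prior, likelihood):
    """One step of Bayes' rule: posterior is proportional to
    prior times likelihood, normalized to sum to one."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# After observing a single head, belief shifts toward the biased coin:
posterior = bayes_update(prior, likelihood_heads)
# posterior == {"fair": 2/5, "biased": 3/5}
```

Exact rational arithmetic (`Fraction`) is used here only so the revised beliefs can be read off without rounding; the point is that the new evidence, not whim, drives the change from prior to posterior.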

We are now ready to address the question: What, after all, is "information"? The answer is pragmatic. *Information is what information does.* Information is defined by its effects: (a) it restricts our options as to what we are honestly and rationally allowed to believe; and (b) it induces us to update from prior beliefs to posterior beliefs. This, we propose, is a defining characteristic of information:

#### *Information is that which induces a change from one state of rational belief to another.*

One aspect of this notion is that for a rational agent, the identification of what constitutes information—as opposed to mere noise—already involves a judgment, an evaluation. Another aspect is that the notion that information is directly related to changing our minds does not involve any reference to *amounts* of information, but it nevertheless allows precise quantitative calculations. Indeed, constraints on the acceptable posterior probabilities are precisely the kind of information that the method of maximum entropy is designed to handle. In short,

#### *Information constrains probability distributions. The constraints are the information.*

To the extent that the probabilities are Bayesian, this definition captures the Bayesian notion that information is directly related to changing our minds, that it is the driving force behind the process of learning. It also incorporates an important feature of rationality: being rational means accepting that "not everything goes", and that our beliefs must be constrained in very specific ways. However, the indiscriminate acceptance of any arbitrary constraint does not qualify as rational behavior. To be rational, an agent must exercise some judgment before accepting a particular piece of information as a reliable basis for the revision of its beliefs, which raises questions about what judgments might be considered sound. Furthermore, there is no implication that the information must *be true*; only that we *accept it as true*. False information is information too, at least to the extent that we are prepared to accept it and allow it to affect our beliefs.

The paramount virtue of the definition above is that it is useful; it allows precise quantitative calculations. The constraints that constitute information can take a wide variety of forms. They can be expressed in terms of expected values, they can specify the functional form of a distribution, or be imposed through various geometrical relations. Examples are given in Section 5 and in [38].
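For the case of constraints expressed as expected values, the maximum-entropy machinery can be sketched in a few lines. The example below is illustrative rather than drawn from the paper: a die whose mean roll is constrained to be 4.5 (a version of the well-known dice problem). The maximum-entropy distribution has the exponential form p_i ∝ exp(λ i), and the Lagrange multiplier λ is found numerically; here a simple bisection works because the constrained mean increases monotonically with λ.

```python
import numpy as np

def maxent_die(target_mean, faces=np.arange(1, 7), tol=1e-12):
    """Maximum-entropy distribution over die faces subject to a
    constraint on the expected value.  The solution has the form
    p_i ~ exp(lam * i); the multiplier lam is found by bisection,
    since the mean is monotonically increasing in lam."""
    def mean(lam):
        w = np.exp(lam * (faces - faces.max()))  # shift for stability
        return (faces * w).sum() / w.sum()
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = np.exp(lam * (faces - faces.max()))
    return w / w.sum()

# The constraint E[x] = 4.5 is the information; it singles out one
# distribution among the many that a rational agent might entertain.
p = maxent_die(4.5)
```

The constraint itself is the only problem-specific input; the entropy functional and the solver are the general-purpose part, in line with the universality criterion discussed below.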

Concerning the act of updating, it may be worthwhile to point out an analogy with dynamics. In Newtonian mechanics, the state of motion of a system is described in terms of momentum, and the change from one state to another is said to be "caused" by an applied force or impulse. Bayesian inference is analogous in that a state of belief is described in terms of probabilities, and the change from one state to another is "caused" by information. Just as a force is that which induces a change from one state of motion to another, so *information is that which induces a change from one state of belief to another*. Updating is a form of dynamics. In [39], the analogy is taken seriously: the logic is reversed and quantum mechanics is derived as an example of the entropic updating of probabilities.

#### **3. The Pragmatic Design of Entropic Inference**

Once we have decided, as a result of the confrontation of new information with old beliefs, that our beliefs require revision, the problem becomes one of deciding how precisely this ought to be done. First, we identify some general features of the kind of belief revision that one might count as rational. Then, we design a method—a systematic procedure—that implements those features. To the extent that the method performs as desired, we can claim success. The point is not that success derives from our method having achieved some intimate connection to the inner wheels of reality; success simply means that the method seems to be working.

The one obvious requirement is that the updated probabilities ought to agree with the newly acquired information. Unfortunately, this requirement, while necessary, is not sufficiently restrictive: we can update in many ways that preserve both internal consistency and consistency with the new information. Additional criteria are needed. What rules would an ideally rational agent choose?

#### *3.1. General Criteria*

The rules are motivated by the same pragmatic criteria that motivate the design of probability theory itself [8]—universality, consistency, and practical utility. However, this is admittedly too vague; we must be very specific about the precise way in which the criteria are implemented.

#### 3.1.1. Universality

In principle, different systems and different situations could require different problem-specific induction methods. However, in order to be useful in practice, the method we seek must be of *universal* applicability. Otherwise, it would fail us when most needed, for we would not know which method to choose when not much is known about the system. To put it in different words, what we want to design is a general-purpose method that captures what all the other problem-specific methods might have in common. The idea is that the peculiarities of a particular problem will be captured by the specific constraints that describe the information that is relevant to the problem at hand.

The analogy with mechanics can be found here as well. The possibility of a science of mechanics hinges on identifying a law of motion of universal applicability (e.g., the Schrödinger equation), while the specifics of each system are introduced through initial conditions and the choice of potentials or forces. Here, we shall design an entropy of universal applicability, while the specifics of each problem are introduced through prior probabilities and the choice of constraints.

#### 3.1.2. Parsimony

To specify the updating, we adopt a very conservative criterion that recognizes the value of information: what has been laboriously learned in the past is valuable and should not be disregarded unless rendered obsolete by new information. The only aspects of one's beliefs that should be updated are those for which new evidence has been supplied. Thus, we adopt the following.
