What Did Bayes Really Say? — Problem and Definitions

© 19 June 2022 by Michael A. Kohn



Problem

Comment: Bayes’s statement of the problem is clear: What can we infer about the probability of a binary event by observing the number of times it does and doesn’t happen?

Original Text

Given the number of times in which an unknown event has happened and failed: Required the chance that the probability of its happening in a single trial lies somewhere between any two degrees of probability that can be named.

Modern Equivalent

Given: n independent binary trials, each with unknown probability of success \theta, result in k successes and n - k failures.

Required: the probability that \theta lies in the interval between \theta_{1} and \theta_{2}.

Comment: As we shall see, in Bayes’s “billiards” example, the prior distribution of \theta is the uniform distribution between 0 and 1; \theta \sim Unif(0,1).
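
To connect the statement of the problem to something computable, here is a minimal sketch in Python. It assumes the standard result (the destination of Bayes's essay) that a Unif(0,1) prior on \theta together with k successes and n - k failures yields a Beta(k+1, n-k+1) posterior; the function name and the example numbers are mine.

```python
# Minimal sketch of the modern answer to Bayes's problem, assuming a
# Unif(0,1) prior on theta, which makes the posterior Beta(k+1, n-k+1).
from scipy.stats import beta

def prob_theta_between(k, n, theta1, theta2):
    """P(theta1 < theta < theta2 | k successes and n - k failures, uniform prior)."""
    posterior = beta(k + 1, n - k + 1)
    return posterior.cdf(theta2) - posterior.cdf(theta1)

# Example: 7 successes and 3 failures; the chance that theta lies
# between 0.5 and 0.9 comes out to roughly 0.87.
print(prob_theta_between(k=7, n=10, theta1=0.5, theta2=0.9))
```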

Definitions

Comment: After presenting the problem, Bayes starts off with 7 definitions. Today’s textbooks, except for Jaynes (see references), use the notation of set theory. A \cap B means that events A and B both occur; A \cup B means that at least one of A or B occurs. They introduce a sample space S of all possible outcomes and a probability function P that takes an event A \subseteq S as input and returns P(A), a real number between 0 and 1, as output.

Bayes preceded the use of set notation for probability definitions by more than 150 years. As we will see in his Definition 5, he defines the probability of an event as the ratio of its expected value to the value realized if it occurs. But he covers the same points about probability as do today’s textbooks.

Original Text

Definition 1. Several events are inconsistent when, if one of them happens, none of the rest can.

Modern Equivalent

We now use “disjoint” instead of “inconsistent”. Saying that events are disjoint means that they are mutually exclusive: in set notation, A \cap B = \emptyset.

Original Text

2. Two events are contrary when one, or other of them must; and both together cannot happen.

Modern Equivalent

We now use “complementary” instead of “contrary”. The complement of event A is Not(A), which I will denote A^{c}. In set notation, A \cap A^{c} = \emptyset and A \cup A^{c} = S.

Original Text

3. An event is said to fail, when it cannot happen; or, which comes to the same thing, when its contrary has happened.

4. An event is said to be determined when it has either happened or failed.

Comment: Definitions 3 and 4 do not require translation or elaboration.

Original Text

5. The probability of any event is the ratio between the value at which an expectation depending on the happening of the event ought to be computed, and the value of the thing expected upon its happening.

Modern Equivalent

The probability of an event is the ratio of its expected value to the value realized if the event occurs.

Comment: Later, Bayes will talk about receiving N if event A occurs. Definition 5 says that, if the expected value of event A is E(A), then P(A) = E(A)/N. He will denote E(A) with \textbf{a} and therefore P(A) = \textbf{a}/N. While Bayes defines probability as the ratio of expected value to amount received, essentially all others define expected value as probability times amount: E(A) = P(A)\times N.

When Bayes talks about receiving a value N, he means utility, not monetary value. (See Endnote #2.) Awkwardly, he assumes all events result in receiving N. If events A, B, and C all result in N and have expected values \textbf{a}, \textbf{b}, and \textbf{c}, respectively, then P(A) = \textbf{a}/N, P(B) = \textbf{b}/N, and P(C) = \textbf{c}/N.
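
As a hypothetical illustration (my numbers, not an example from the essay): suppose a claim that pays N = 60 units of utility if event A happens is fairly valued at \textbf{a} = 20 units. Definition 5 then gives P(A) = \textbf{a}/N = 20/60 = 1/3, while the modern convention runs the same arithmetic in the other direction: E(A) = P(A) \times N = (1/3) \times 60 = 20.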

Bayes’s definition is related to indicator random variables and what Blitzstein (p. 164) calls the “fundamental bridge” between probability and expectation. If indicator random variable I_A = 1 when event A occurs and I_A = 0 when A fails to occur, then E(I_A) = P(A). For Bayes, this isn’t a bridge between probability and expectation, it’s the definition of probability. You can see this, without loss of generality, by setting his N = 1. In short, Definition 5 defines the probability of an event as the expected value of its indicator variable.
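
To make the bridge concrete, here is a small illustrative simulation (my example, not one from Bayes or Blitzstein): averaging the indicator I_A over many trials estimates E(I_A), which is exactly P(A).

```python
# Illustrative simulation of the "fundamental bridge": the sample mean of the
# indicator I_A estimates E(I_A), which equals P(A). Here A is the event that
# a fair six-sided die shows 5 or 6, so P(A) = 1/3.
import random

random.seed(1)
trials = 100_000
indicator_sum = sum(1 for _ in range(trials) if random.randint(1, 6) >= 5)
print(indicator_sum / trials)  # sample mean of I_A; close to 1/3
```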

Price points this out in his introductory letter.

[Bayes] has also made an apology for the peculiar definition he has given of the word chance or probability. His design herein was to cut off all dispute about the meaning of the word, which in common language is used in different senses by persons of different opinions, and according as it is applied to past or future facts. But whatever different senses it may have, all (he observes) will allow that an expectation depending on the truth of any past fact, or the happening of any future event, ought to be estimated so much the more valuable as the fact is more likely to be true, or the event more likely to happen. Instead therefore, of the proper sense of the word probability, he has given that which all will allow to be its proper measure in every case where the word is used.

Price mentions “the truth of any past fact” in addition to “the happening of any future event”. So probabilities apply to the truth of propositions as well as to the occurrence of events. In his 1921 Treatise on Probability, J.M. Keynes said, “…it will be more than a verbal improvement to discuss the truth and the probability of propositions instead of the occurrence and the probability of events.”

Original Text

6. By chance I mean the same as probability.

7. Events are independent when the happening of any one of them does neither increase nor abate the probability of the rest.

Comment: Definitions 6 and 7 do not require translation or elaboration. On to the propositions…

(next) Propositions 1 – 7
