What Did Bayes Really Say? — Problem and Definitions

© 19 June 2022 by Michael A. Kohn

Problem

Comment: Bayes’s statement of the problem is clear: What can we infer about the probability of a binary event by observing the number of times it does and doesn’t happen?

Original Text

Given the number of times in which an unknown event has happened and failed: Required the chance that the probability of its happening in a single trial lies somewhere between any two degrees of probability that can be named.

Modern Equivalent

Given: n independent binary trials, each with unknown probability of success \theta, result in k successes and n - k failures.

Required: the probability that \theta lies in the interval between \theta_{1} and \theta_{2}.

Comment: As we shall see, in Bayes’s “billiards” example, the prior distribution of \theta is the uniform distribution between 0 and 1; \theta \sim Unif(0,1).
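The required probability can be computed directly: with a Unif(0,1) prior, the posterior of \theta after k successes in n trials is Beta(k+1, n-k+1). The sketch below (the function name `post_prob` and the trapezoidal integration are my own choices, not anything from Bayes's essay) integrates the unnormalized posterior density numerically:

```python
def post_prob(n, k, lo, hi, steps=10_000):
    """P(lo < theta < hi | k successes in n trials), uniform prior.

    The posterior is Beta(k+1, n-k+1); here we integrate its
    unnormalized density theta^k (1-theta)^(n-k) by the trapezoidal
    rule -- a sketch, not a production routine.
    """
    def dens(t):
        return t**k * (1 - t)**(n - k)

    def integral(a, b):
        h = (b - a) / steps
        s = 0.5 * (dens(a) + dens(b)) + sum(dens(a + i * h) for i in range(1, steps))
        return s * h

    return integral(lo, hi) / integral(0.0, 1.0)
```

For example, after a single trial that succeeded (n = 1, k = 1), the posterior density is 2\theta, so the probability that \theta exceeds 1/2 is 0.75; with no trials at all, the prior is returned unchanged.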

Definitions

Comment: After presenting the problem, Bayes starts off with 7 definitions. Today’s textbooks, except for Jaynes (see references), use the notation of set theory. A \cap B means that events A and B both occur; A \cup B means that at least one of A and B occurs. They introduce a sample space S of all possible events and a probability function P that takes an event A \subseteq S as input and returns P(A), a real number between 0 and 1, as output.

Bayes preceded the use of set notation for probability definitions by more than 150 years. As we will see in his Definition 5, he defines the probability of an event as the ratio of its expected value to the value realized if it occurs. But he covers the same points about probability as do today’s textbooks.
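The set-theoretic setup can be made concrete with a finite sample space. This is an illustrative sketch of the modern textbook conventions (a fair die, with P defined as the fraction of outcomes in the event), not anything that appears in Bayes's essay:

```python
from fractions import Fraction

# Sample space for one roll of a fair die.
S = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability function: maps an event (a subset of S) to [0, 1]."""
    assert event <= S, "an event must be a subset of the sample space"
    return Fraction(len(event), len(S))

A = {2, 4, 6}   # "the roll is even"
B = {4, 5, 6}   # "the roll is greater than 3"

P(A & B)   # A ∩ B: both occur -> {4, 6}, probability 1/3
P(A | B)   # A ∪ B: at least one occurs -> {2, 4, 5, 6}, probability 2/3
```

Python's `&` and `|` set operators line up conveniently with \cap and \cup.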

Original Text

Definition 1. Several events are inconsistent when, if one of them happens, none of the rest can.

Modern Equivalent

We now use “disjoint” instead of “inconsistent”. Saying that events are disjoint means that they are mutually exclusive.

Original Text

2. Two events are contrary when one, or other of them must; and both together cannot happen.

Modern Equivalent

We now use “complementary” instead of “contrary”. The complement of event A is Not(A), which I will denote A^{c}.

Original Text

3. An event is said to fail, when it cannot happen; or, which comes to the same thing, when its contrary has happened.

4. An event is said to be determined when it has either happened or failed.

Comment: Definitions 3 and 4 do not require translation or elaboration.

Original Text

5. The probability of any event is the ratio between the value at which an expectation depending on the happening of the event ought to be computed, and the value of the thing expected upon its happening.

Modern Equivalent

The probability of an event is the ratio of its expected value to the value realized if the event occurs.

Comment: Later, Bayes will talk about receiving N if event A occurs. Definition 5 says that, if the expected value of event A is E(A), then P(A) = E(A)/N. He will denote E(A) with \textbf{a} and therefore P(A) = \textbf{a}/N. While Bayes defines probability as the ratio of expected value to amount received, essentially all others define expected value as probability times amount: E(A) = P(A)\times N.

When Bayes talks about receiving a value N, he means utility, not monetary value. (See Endnote #2.) Awkwardly, he assumes all events result in receiving N. If events A, B, and C all result in N and have expected values \textbf{a}, \textbf{b}, and \textbf{c}, respectively, then P(A) = \textbf{a}/N, P(B) = \textbf{b}/N, and P(C) = \textbf{c}/N.

Bayes’s definition is related to indicator random variables and what Blitzstein (p. 164) calls the “fundamental bridge” between probability and expectation. If indicator random variable I_A = 1 when event A occurs and I_A = 0 when A fails to occur, then E(I_A) = P(A). For Bayes, this isn’t a bridge between probability and expectation, it’s the definition of probability. You can see this, without loss of generality, by setting his N = 1. In short, Definition 5 defines the probability of an event as the expected value of its indicator variable.
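The fundamental bridge is easy to check by simulation. In this sketch (the function name and parameters are my own, for illustration), the sample mean of the indicator I_A approaches P(A):

```python
import random

def indicator_mean(p, trials=100_000, seed=0):
    """Estimate E(I_A) by simulation when P(A) = p.

    I_A = 1 when A occurs and 0 when it fails, so the sample mean
    of I_A should approach P(A): the 'fundamental bridge'.
    """
    rng = random.Random(seed)
    return sum(1 if rng.random() < p else 0 for _ in range(trials)) / trials
```

With p = 0.3 and 100,000 trials, the estimate lands within about 0.005 of 0.3, as the law of large numbers predicts.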

Original Text

6. By chance I mean the same as probability.

7. Events are independent when the happening of any one of them does neither increase nor abate the probability of the rest.

Comment: Definitions 6 and 7 do not require translation or elaboration. On to the propositions…
