

Bayes' Theorem
Outside the field of religion there are not very many prophets. In general, any additional information one may be able to obtain about uncertain states of nature in the real world cannot be assumed to be 100% accurate. Nature is free to do as she pleases irrespective of human attempts to box her in with equations (models).
Consider Goldie Lockes' predicament. She already has a probability distribution describing her (and Cactus's) beliefs about market demand for ACME's next-generation roadrunner traps. She knows, however, that additional information can be obtained (by, say, market research) that could induce her to revise her prior beliefs. This means that the original or prior distribution, which was derived subjectively, would change. For instance, she believes that the probability of a medium market demand is 0.6. Were she to conduct a market survey that pointed to a medium demand, it would reinforce her beliefs and lead her to revise the probability upwards. The question is, by how much? Surely, that depends on the degree of reliability of the market survey. The more reliable the information, the greater her confidence in it and therefore the greater its effect in modifying her prior beliefs.
In olden times, the mighty ones of the earth (the decision makers) made use of their judgment, experience and intuition, and somehow came up with a solution to whatever problem they were dabbling in at the time. Then they commanded their troops to follow them into battle. History shows that, on average, half of these decisions proved to be wrong. It's much easier to just flip a coin. Fortunately for us moderns (and postmoderns, too), the Rev. Thomas Bayes showed how to make inferences rationally in his "Essay Towards Solving a Problem in the Doctrine of Chances" (1763), published posthumously by the Royal Society. This was the beginning of humankind's incursion into the third epistemologically valid approach to acquiring knowledge. (Bayes derived a special case of the theorem. Laplace independently rediscovered it in generalized form.)
The gist of Bayes' theorem is easy to grasp. If one has encoded one's prior beliefs about a hypothesis with a probability distribution, and one has access to additional information that could support or contradict the hypothesis and whose degree of reliability is known or is estimable, then one can revise one's prior distribution to reflect the resulting beliefs one should have, given the additional information, about one's original hypothesis. The revised distribution is known as the posterior (or a posteriori), while the reliability data are called likelihoods (or likelihood function). The prior distribution is known as the prior (or a priori), naturally. Graphically:

Bayes' Theorem follows logically from the definition of conditional probability: P(A|B) = P(A ∩ B) / P(B).
(Logical reasoning, which includes mathematics, is the second epistemologically valid approach to acquiring knowledge.)
Let's suppose there is an event E whose prior probability is known subjectively (works fine for objective probabilities, too). If you'd rather view this concretely, let E be the event «it will rain today when I go out to lunch». Let I be the indicator event (the additional information) «the weather forecast calls for rain at lunch». Clearly, a forecast of rain does not guarantee that it will rain. Moreover, I itself is an uncertain event because, a priori, you have no assurance what the forecast will be. It could call for sunny skies. This state of affairs can be depicted schematically with a Venn diagram:

Now, the prior of E is P(E), a marginal (not conditional) probability. Note that P(E) is simply the area of circle E divided by the total area U (the universe of possible events, defined to be 1). Thus, 0 ≤ P(E) ≤ 1. We would like to know the probability of E given a forecast I, that is, P( E | I ). By the definition of conditional probability:
P( E | I ) = P( E ∩ I ) / P( I )
In other words, if forecast I is taken as given, then the probability of E changes to the area of the gray convex-lens shape E ∩ I divided by the total area of circle I, for I is the given condition. In other words, the "background" event space is no longer U but I.
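The area argument above can be checked with a few lines of Python. This is only a sketch: the four joint probabilities are hypothetical numbers chosen for illustration (they are not from the text), and the names JOINT, marginal_forecast and conditional are made up for this example.

```python
# Hypothetical joint probabilities for the weather (E = rain) and the
# forecast (I = forecast of rain). The four cells partition the universe U,
# so they sum to 1. These numbers are assumptions, not data from the text.
JOINT = {
    ("rain", "forecast rain"): 0.24,  # the lens-shaped overlap of E and I
    ("rain", "forecast dry"): 0.06,
    ("dry", "forecast rain"): 0.14,
    ("dry", "forecast dry"): 0.56,
}


def marginal_forecast(joint, forecast):
    """P(I): the whole area of circle I, summed over both weather outcomes."""
    return sum(p for (_, f), p in joint.items() if f == forecast)


def conditional(joint, weather, forecast):
    """P(E | I): the overlap of E and I divided by the area of I."""
    p_i = marginal_forecast(joint, forecast)
    if p_i == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return joint[(weather, forecast)] / p_i


p_i = marginal_forecast(JOINT, "forecast rain")
p_e_given_i = conditional(JOINT, "rain", "forecast rain")
print(f"P(I)     = {p_i:.2f}")          # 0.38
print(f"P(E | I) = {p_e_given_i:.4f}")  # 0.6316
```

With these made-up numbers the prior P(E) is 0.24 + 0.06 = 0.30, and taking the forecast as given raises it to about 0.63, because the background space has shrunk from U to I.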
Note that by rearranging terms, we can restate the conditional probability definition as:
P( E ∩ I ) = P( E | I ) · P( I )
Everybody knew that. Bayes' great insight was to see this:
P( E ∩ I ) = P( I | E ) · P( E )
which follows by symmetry if E and I are both uncertain events, which a priori they are. That is to say, since both E and I are uncertain events and the intersection of those events is one and the same thing (the joint event), the probability of that joint event must come out the same whether we factor it by conditioning on E or by conditioning on I. Consequently, substituting Bayes' insight for the joint probability in the definition of conditional probability, we obtain:
P( E | I ) = P( I | E ) · P( E ) / P( I )
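As a quick numerical check, here is a minimal Python sketch of this form of the rule. The function name bayes_posterior and all three input values are hypothetical, picked only so the arithmetic is easy to follow.

```python
def bayes_posterior(p_i_given_e, p_e, p_i):
    """P(E | I) = P(I | E) * P(E) / P(I)."""
    if not 0 < p_i <= 1:
        raise ValueError("P(I) must be in (0, 1] to condition on I")
    return p_i_given_e * p_e / p_i


# Assumed inputs (illustrative only, not from the text).
p_e = 0.30          # prior probability of rain
p_i_given_e = 0.80  # chance the forecast calls for rain when it does rain
p_i = 0.38          # overall chance that the forecast calls for rain

p_e_given_i = bayes_posterior(p_i_given_e, p_e, p_i)

# The joint probability is the same whichever event we condition on.
joint_via_e = p_i_given_e * p_e  # P(I | E) * P(E)
joint_via_i = p_e_given_i * p_i  # P(E | I) * P(I)

print(f"P(E | I)             = {p_e_given_i:.4f}")  # 0.6316
print(f"joint, factored on E = {joint_via_e:.2f}")  # 0.24
print(f"joint, factored on I = {joint_via_i:.2f}")  # 0.24
```

Both factorizations give the same joint probability, which is all the symmetry argument claims.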
Now look at the depiction of I shown under the Venn diagram above: I = ( E ∩ I ) + ( Ē ∩ I ), where Ē is the complement of E. Taking probabilities and expanding the terms as before:
P( E | I ) = P( I | E ) · P( E ) / [ P( I | E ) · P( E ) + P( I | Ē ) · P( Ē ) ]
which is obvious. It is also Bayes' Theorem. Note that the posterior boils down to a simple proportion: the probability of getting the forecast right (the numerator, which is just P( E ∩ I )) divided by the probabilities of getting it right plus getting it wrong (the denominator, which is just P( I )). Bayes simply measures the hit rate.
Checking out the formula we see that P( E ) and P( Ē ) are the priors and P( I | E ) and P( I | Ē ) are likelihoods. If you've got the latter, you've got the posteriors and you're in business, kid.
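To see how the reliability of the forecast drives the revision, here is a small Python sketch of the two-outcome case, where P(I) is built from the two priors and the two likelihoods. The function name posterior_binary and every number below are assumptions made for illustration.

```python
def posterior_binary(prior, p_i_given_e, p_i_given_not_e):
    """P(E | I) for an event E and its complement.

    prior            -- P(E); the complement gets 1 - prior
    p_i_given_e      -- likelihood P(I | E)
    p_i_given_not_e  -- likelihood P(I | not E)
    """
    right = p_i_given_e * prior            # forecast of rain, and it rains
    wrong = p_i_given_not_e * (1 - prior)  # forecast of rain, but it stays dry
    if right + wrong == 0:
        raise ValueError("indicator I has probability zero under this model")
    return right / (right + wrong)


prior_rain = 0.30  # hypothetical prior

# A fairly reliable forecast (hypothetical likelihoods).
fairly_reliable = posterior_binary(prior_rain, 0.80, 0.20)

# A worthless forecast: equally likely to say "rain" either way.
worthless = posterior_binary(prior_rain, 0.50, 0.50)

# A perfect forecast: says "rain" only when it will rain.
perfect = posterior_binary(prior_rain, 1.00, 0.00)

print(f"fairly reliable: {fairly_reliable:.4f}")  # 0.6316
print(f"worthless:       {worthless:.4f}")        # 0.3000
print(f"perfect:         {perfect:.4f}")          # 1.0000
```

A worthless indicator leaves the prior untouched, a perfect one settles the matter, and anything in between moves the prior part of the way.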
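In general, if the event set E comprises several mutually exclusive and exhaustive outcomes Ej ( j = 1, 2, ... , n ), as opposed to just E and its complement Ē, then the indicator I can be wrong in its prediction in (n - 1) ways. Bayes' formula is then:
P( Ex | I ) = P( I | Ex ) · P( Ex ) / [ P( I | E1 ) · P( E1 ) + P( I | E2 ) · P( E2 ) + ... + P( I | En ) · P( En ) ]
where Ex is the outcome of interest.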
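The general case can be sketched in Python with Goldie's market-demand problem. Only the 0.6 prior for medium demand comes from the text; the priors for low and high demand, the survey likelihoods, and the function name posterior are assumptions invented for this example.

```python
def posterior(priors, likelihoods):
    """Revise a prior distribution given one observed indicator I.

    priors      -- {outcome: P(Ej)}, which should sum to 1
    likelihoods -- {outcome: P(I | Ej)} for the indicator actually observed
    """
    joint = {k: likelihoods[k] * priors[k] for k in priors}
    p_i = sum(joint.values())  # the denominator: total probability of I
    if p_i == 0:
        raise ValueError("indicator I has probability zero under this model")
    return {k: v / p_i for k, v in joint.items()}


# Prior beliefs about demand. Medium = 0.6 is from the text;
# the other two values are hypothetical.
priors = {"low": 0.2, "medium": 0.6, "high": 0.2}

# Hypothetical reliability data: the probability that the survey points
# to "medium" under each true level of demand. These are likelihoods for
# a fixed indicator, so they need not sum to 1.
survey_says_medium = {"low": 0.2, "medium": 0.7, "high": 0.2}

revised = posterior(priors, survey_says_medium)
for outcome, p in revised.items():
    print(f"P({outcome} | survey says medium) = {p:.2f}")
# low 0.08, medium 0.84, high 0.08
```

Under these assumed likelihoods a survey pointing to medium demand lifts Goldie's belief in medium demand from 0.6 to 0.84. A less reliable survey (likelihoods closer together) would lift it by less.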
By the way, experience is the first epistemologically valid approach to acquiring knowledge. Notice that what Bayes' Theorem does is to allow decision makers to combine, in a logically consistent manner, their prior knowledge (as coded in the prior distribution) with factual (empirical) information (the likelihoods) so as to enhance their knowledge about a problem characterized by uncertainty. That is why Bayes' Theorem is the only other epistemologically valid approach to acquiring knowledge beyond what is possible by either experience or logical reasoning by themselves. Bayes is serious business.
Recommended reference: An Introduction to Bayesian Inference and Decision, 2nd Ed by Robert L. Winkler
Terms
Likelihoods – conditional probabilities expressing the degree of reliability of the additional information.
Posterior (a posteriori) – the revised (conditional) probability distribution derived from Bayes' Theorem.
Prior (a priori) – the original probability distribution before Bayesian revision.



