8.1.5: Probability
As we have seen, a strong inductive argument is one in which the truth of the premises makes the conclusion highly probable. The distinction between strong inductive arguments and valid (deductive) arguments is that whereas the premises of strong inductive arguments make their conclusions highly probable, the premises of valid arguments make their conclusions certain. We can think of probability as how likely it is that something is (or will be) true, given a particular body of evidence. Using numbers between 0 and 1, we can express probabilities numerically. For example, if I have a full deck of cards and pick one at random, what is the probability that the card I pick is a queen? Since there are 52 cards in the deck, and only four of them are queens, the probability of picking a queen is 4/52, or .077. That is, I have about a 7.7% chance of picking a queen at random. In comparison, my chances of picking any “face” card would be much higher. There are three face cards in each suit and four different suits, which means there are 12 face cards total. So, 12/52 = .23 or 23%. In any case, the important thing here is that probabilities can be expressed numerically.
In using a numerical scheme to represent probabilities, we take 0 to represent an impossible event (such as a contradiction) and 1 to represent an event that is certain (such as a tautology).
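The card probabilities above can be checked by simply counting cards. Here is a minimal Python sketch that does so; the names `DECK` and `probability` are my own choices for illustration, and the sketch assumes every card is equally likely to be drawn.

```python
from fractions import Fraction

# Build a standard 52-card deck as (rank, suit) pairs.
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["spades", "clubs", "diamonds", "hearts"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]


def probability(event, outcomes):
    """Fraction of equally likely outcomes for which event(outcome) is true."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


p_queen = probability(lambda card: card[0] == "Q", DECK)
p_face = probability(lambda card: card[0] in {"J", "Q", "K"}, DECK)

print(p_queen, round(float(p_queen), 3))  # 1/13 0.077
print(p_face, round(float(p_face), 2))    # 3/13 0.23
```

An event that no card satisfies gets probability 0 and an event that every card satisfies gets probability 1, which matches the two endpoints of the numerical scheme.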
Probability is important to understand because it provides the basis for formal methods of evaluating inductive arguments. While there is no universally agreed upon method of evaluating inductive arguments in the way there is with deductive arguments, there are some basic laws of probability that it is important to keep in mind. As we will see in the next few sections, although these laws of probability are seemingly simple, we misapply them all the time.
We can think of the rules of probability in terms of some of the truth functional operators introduced in chapter 2: the probability of conjunctions, the probability of negations, and the probability of disjunctions. The probability of a conjunction is the probability that two independent events will both occur. For example, what is the probability that you randomly draw a queen and then (after returning it to the pile and reshuffling the deck) you draw another queen? Since we are asking for the probability that these two events both occur, this is a matter of calculating the probability of a joint occurrence. In the following, “a” and “b” will refer to independent events, and the locution “P(a)” stands for “the probability of a.” Here is how we calculate the probability of conjunctions:
P(a and b) = P(a) × P(b)
So, to apply this to my example of drawing two queens, we have to multiply the probability of drawing one queen, “P(a)” by the probability of drawing yet another queen, “P(b).” Since we have already calculated the probability of drawing a queen at .077, the math is quite simple:
.077 × .077 = .0059
That is, there is a less than 1% chance (.59% to be precise) of drawing two queens in this scenario. So, obviously, you’d not be wise to place a bet on that happening! Let’s try another example where we have to calculate the probability of a conjunction. Suppose I want to know the probability that both my father and mother will die of brain cancer. (Macabre, I know.) I’d have to know the probability of dying of brain cancer, which is about 5/100,000. That is, 5 out of every 100,000 people die of brain cancer. That is a very small number: .00005. But the chance of both of them dying of brain cancer is going to be an even smaller number:
.00005 × .00005 = .0000000025
That is a chance of about 2.5 in a billion, or 1 in 400 million. So not very likely. Let’s consider a final example with more manageable numbers. Suppose I wanted to know the probability of rolling a 12 when rolling two six-sided dice. Since the only way to roll a 12 is to roll a 6 on each die, I can compute the probability of rolling a 6 and then the independent probability of rolling another 6 on the other die. The probability of rolling a six on one die is just 1/6 = .167. Thus, .167 × .167 = .028. So you have a 2.8% chance of rolling a 12. We could have also calculated this using fractions instead of decimals:
1/6 × 1/6 = 1/36
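The conjunction rule can be sketched in a few lines of Python. The helper name `p_and` is hypothetical (my own, for illustration), and the sketch assumes, as the rule itself does, that the events are independent. Working with exact fractions avoids the small rounding differences that show up when multiplying decimals.

```python
from fractions import Fraction


def p_and(*probabilities):
    """Probability that several independent events all occur (multiply them)."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


p_queen = Fraction(4, 52)  # one queen from a full deck
p_six = Fraction(1, 6)     # a six on one fair die

# The first queen is returned and the deck reshuffled, so the draws are independent.
two_queens = p_and(p_queen, p_queen)
double_six = p_and(p_six, p_six)

print(two_queens, round(float(two_queens), 4))  # 1/169 0.0059
print(double_six, round(float(double_six), 3))  # 1/36 0.028
```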
Calculating the probability of negations is simply a matter of subtracting the probability that some event, say event a, will occur from 1. The result is the probability that event a will not occur:
P(not-a) = 1 – P(a)
For example, suppose I am playing Monopoly and want to determine the probability that I do not roll a 12 (since if I roll a 12 I will land on Boardwalk, which my opponent owns with hotels). Since we have already determined that the probability of rolling a 12 is .028, we can calculate the probability of not rolling a 12 thus:
1 – .028 = .972
Thus, I have a 97.2% chance of not rolling a 12. So it is highly likely that I won’t (thank goodness).
Here’s another example. What are the chances that my daughter doesn’t get into Harvard? Since the acceptance rate at Harvard is about 6% (or .06), I simply subtract that from 1, which yields .94, or 94%. So my daughter has a 94% chance of not getting into Harvard.
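Both negation examples can be reproduced with a short Python sketch. The helper name `p_not` is my own, and the 6% acceptance rate is just the rough figure used in the text, not an official statistic.

```python
from fractions import Fraction


def p_not(p):
    """Probability that an event with probability p does not occur."""
    return 1 - p


p_twelve = Fraction(1, 36)       # rolling a 12 with two fair dice
p_accepted = Fraction(6, 100)    # rough acceptance rate assumed in the text

p_no_twelve = p_not(p_twelve)
p_rejected = p_not(p_accepted)

print(p_no_twelve, round(float(p_no_twelve), 3))  # 35/36 0.972
print(p_rejected, float(p_rejected))              # 47/50 0.94
```

Notice that an event and its negation always sum to 1: either I roll a 12 or I don’t.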
We should pause here to make some comments about probability. The probability of an event occurring is relative to some reference class. So, for example, the probability of getting osteoporosis is much higher if you are a woman over 50 (16%) than if you are a man over 50 (4%). So if you want accurate data concerning probability, you have to take into account all the relevant factors. In the case of osteoporosis, that means knowing whether you are a woman or a man and are over or under 50. The same kind of point applies to my example of getting into Harvard. Here’s an anecdote that will illustrate the point. Some years ago, I agreed to be a part of an interviewing process for candidates for the “presidential scholarship” at the college at which I was teaching at the time. The interviewees were high school students and we could have calculated the probability that any one of them would win the scholarship simply by noting the number of scholarships available and the number of applicants for them. But after having interviewed the candidates I was given to interview, it was very clear to me that one of them easily outshined all the rest. Thus, given the new information I had, it would have been silly for me to assign the same, generic probability to this student winning the award. This student was extremely well-spoken, well-put-together, and answered even my hardest questions (with which other candidates struggled) with an ease and confidence that stunned me. On top of all of that, she was a Hispanic woman, which I knew would only help her in the process (since colleges value diversity in their student population). I recommended her highly for the scholarship, but I also knew that she would end up at a much better institution (and probably with one of their most competitive scholarships). Some time later, I was wondering where she did end up going to college, so I did a quick search on her name and, sure enough, she was a freshman at Harvard. No surprise to me. 
The point of the story is that although we could have said that this woman’s chances of not getting into Harvard are about 94%, this would neglect all the other things about her which in fact drastically increase her chances of getting into Harvard (and thus drastically decrease her chances of not getting in). So our assessments of probability are only as good as the information we use to assess them. If we were omniscient (i.e., all-knowing), then arguably we could know every detail and would be able to predict with 100% accuracy any event. Since we aren’t, we have to rely on the best information we do have and use that information to determine the chances that an event will occur.
Calculating the probability of disjunctions is a matter of figuring out the probability that either one event or another will occur. When the two events are mutually exclusive (that is, they cannot both occur), we simply add the probabilities of the two events together:
P(a or b) = P(a) + P(b)
For example, suppose I wanted to calculate the probability of drawing randomly from a shuffled deck either a spade or a club. Since there are four suits (spades, clubs, diamonds, hearts), each with an equal number of cards, the probability of drawing a spade is 1⁄4 or .25. Likewise the probability of drawing a club is .25. Since no card is both a spade and a club, the probability of drawing either a spade or a club is:
.25 + .25 = .50
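We can check the spade-or-club result by counting cards directly. The Python sketch below (with my own illustrative names `DECK` and `probability`) also shows, as a caution, what happens if we add probabilities for two events that can both occur, such as drawing a queen or a spade: the queen of spades gets counted twice, so the simple sum overshoots the true value.

```python
from fractions import Fraction

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["spades", "clubs", "diamonds", "hearts"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]


def probability(event, outcomes):
    """Fraction of equally likely outcomes for which event(outcome) is true."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


p_spade = probability(lambda c: c[1] == "spades", DECK)
p_club = probability(lambda c: c[1] == "clubs", DECK)
p_queen = probability(lambda c: c[0] == "Q", DECK)

# No card is both a spade and a club, so adding matches direct counting.
p_spade_or_club = probability(lambda c: c[1] in {"spades", "clubs"}, DECK)
print(p_spade + p_club, p_spade_or_club)    # 1/2 1/2

# A card can be both a queen and a spade, so adding overcounts.
p_queen_or_spade = probability(lambda c: c[0] == "Q" or c[1] == "spades", DECK)
print(p_queen + p_spade, p_queen_or_spade)  # 17/52 4/13
```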
So you have a 50% chance of drawing either a spade or a club. Sometimes events are not independent. For example, suppose you wanted to know the probability of drawing 5 clubs from the deck (which in poker is called a “flush”). This time you are holding on to the cards after you draw them rather than returning them to the deck. The probability of drawing the first club is simply 13/52 (or 1⁄4). However, each of the remaining four draws will be affected by the previous draws. If one were to successfully draw all clubs, then after the first draw there would be only 51 cards left, 12 of which were clubs; after the second draw, there would be only 50 cards left, 11 of which were clubs, and so on, like this:
13/52 × 12/51 × 11/50 × 10/49 × 9/48 = 33/66,640
As you can see, we’ve had to determine the probability of a conjunction, since we want card 1 and card 2 and card 3 etc. to all be clubs. That is a conjunction of different events. As you can also see, the probability of drawing such a hand is extremely low—about .0005 or .05%. A flush is indeed a rare hand. But suppose we wanted to know, not the chances of drawing a flush in a specific suit, but just the chances of drawing a flush in any suit. In that case, we’d have to calculate the probability of a disjunction of drawing either a flush in clubs or a flush in spades or a flush in diamonds or a flush in hearts. Recall that in order to calculate a disjunction we must add together the probabilities:
.0005 + .0005 + .0005 + .0005 = .002
So the probability of drawing a flush in any suit is still only about .2% or one fifth of one percent—i.e., very low.
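The flush calculation can be sketched in Python as a running product in which both the number of clubs and the number of cards shrink by one after each draw. The function name `p_all_same_suit` and its parameters are hypothetical, chosen for illustration. As a cross-check, the sketch compares the result with a direct count of five-card hands using `math.comb` (available in Python 3.8 and later).

```python
from fractions import Fraction
from math import comb


def p_all_same_suit(cards_drawn=5, suit_size=13, deck_size=52):
    """Probability that every card drawn, without replacement, is from one given suit."""
    p = Fraction(1)
    for i in range(cards_drawn):
        # After i successful draws, i cards of the suit (and of the deck) are gone.
        p *= Fraction(suit_size - i, deck_size - i)
    return p


p_club_flush = p_all_same_suit()
print(p_club_flush)  # 33/66640

# Cross-check: all-club hands divided by all possible five-card hands.
assert p_club_flush == Fraction(comb(13, 5), comb(52, 5))

# Flushes in different suits cannot happen together, so we may add them.
p_any_flush = 4 * p_club_flush
print(p_any_flush, round(float(p_any_flush), 4))  # 33/16660 0.002
```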
Let’s examine another example before closing this section on probability. Suppose we want to know the chances of flipping at least 1 head in 6 flips of a fair coin. You might reason as follows: There is a 50% chance I flip heads on the first flip, a 50% chance on the second, etc. Since I want to know the chance of flipping at least one head, then perhaps I should simply calculate the probability of the disjunction like this:
.5 + .5 + .5 + .5 + .5 + .5 = 3 (or 300%)
However, this cannot be right, because the probability of any event is between 0 and 1 (including 0 and 1 for events that are impossible and absolutely certain). This way of calculating the probability leaves us with an event that is three times more than certain. And nothing is more than 100% certain; 100% certainty is the limit. So something is wrong with the calculation: simply adding the probabilities does not work here, because flipping heads on one flip does not rule out flipping heads on another. Another way of seeing that something must be wrong with the calculation is that it isn’t impossible that I flip 6 tails in a row (and thus no heads). Since that is a real possibility (however improbable), it cannot be 100% certain that I flip at least one head. Here is the way to think about this problem. What is the probability that I flip all tails? That is simply the probability of the conjunction of 6 events, each of which has the probability of .5 (or 50%):
.5 × .5 × .5 × .5 × .5 × .5 = .0156 (or about 1.6%)
Then we simply use the rule for calculating the probability of a negation, since we want to know the chances that we don’t flip 6 tails in a row (i.e., we flip at least one head):
1 – .0156 = .9844
So the probability of flipping at least one head in 6 flips of the coin is about 98.4%. (It would be exactly the same probability of flipping at least one tail in 6 flips.)
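One way to convince yourself of this result is to list every possible sequence of 6 flips and count the ones that contain at least one head. The Python sketch below does that and compares the count with the negation rule; it assumes a fair coin, so that all 64 sequences are equally likely. The exact value is 63/64, or about .984.

```python
from fractions import Fraction
from itertools import product

FLIPS = 6

# Every possible sequence of heads (H) and tails (T): 2**6 = 64 of them.
sequences = list(product("HT", repeat=FLIPS))
with_a_head = sum(1 for seq in sequences if "H" in seq)

p_by_counting = Fraction(with_a_head, len(sequences))

# Negation rule: 1 minus the probability of six tails in a row.
p_all_tails = Fraction(1, 2) ** FLIPS
p_by_rule = 1 - p_all_tails

print(len(sequences), with_a_head)    # 64 63
print(p_by_counting, p_by_rule)       # 63/64 63/64
print(float(p_by_rule))               # 0.984375
```

Only one of the 64 sequences (all tails) fails to contain a head, which is why the two approaches agree.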
Exercise
Use the three different rules of calculating probabilities (conjunctions, negations, disjunctions) to calculate the following probabilities, all of which relate to fair, six-sided dice.
1. What is the probability of rolling a five on one throw of one die?
2. What is the probability of not rolling a five on one throw of one die?
3. What is the probability of rolling a five on your first throw and another five on the second throw of that die?
4. If you roll two dice at one time, what are the chances that both dice will come up twos?
5. If you roll two dice at one time, what are the chances that one or the other (or both) of the dice will come up a two?
6. If you roll two dice at once, what are the chances that at most one of the dice will come up a two?
7. If you roll two dice at once, what are the chances that at least one of the dice will come up a four?
8. If you roll two dice at once, what are the chances that there will be no fours?
9. If you roll two dice at once, what are the chances of rolling double fives?
10. If you roll two dice at once, what are the chances of rolling doubles (of any number)?