What is Probability?

Community Article Published January 8, 2024

I am back with another blog post that is centered around exploring and communicating complex topics. This time the topic of choice is Probability.

Back in my undergraduate days, I was never given the intuition behind probability. I think Chris Piech does more than justice to this topic in his course CS-109. Please feel free to skip this blog post and head over to the freely available YouTube lectures.

Before we begin our journey into probability, two terms that need to be highlighted are:

1. Sample Space

The sample space is the set of all possible outcomes of an experiment. Consider the roll of a single die. The sample space is the set of all possible outcomes: {1, 2, 3, 4, 5, 6}.

2. Event Space

The event space is a subset of the sample space. It contains exactly those outcomes in the sample space that satisfy some event we care about. If we're interested in rolling an even number, the event space is {2, 4, 6}.
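As a minimal sketch, the two terms map naturally onto Python sets. The names `sample_space` and `event_space` are illustrative choices for this post, not part of any library.

```python
# Sample space for one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# Event space: the outcomes that satisfy "the roll is even".
event_space = {outcome for outcome in sample_space if outcome % 2 == 0}

# An event space is always a subset of the sample space.
is_subset = event_space <= sample_space

print(sorted(event_space))  # [2, 4, 6]
print(is_subset)            # True
```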

Now that we have the terms out of our way, we are ready to understand what probability really is.

Probability is a number between 0 and 1 that we ascribe some meaning to.

Take a moment and think about this. It does not present a concrete idea; it is just a loosely typed (if you may) explanation. The pivotal idea here is how we ascribe a meaning to probability.

Imagine repeating an experiment a very large number of times and counting how often the event you care about occurs. Dividing that count by the total number of trials gives a value between 0 and 1, and in the limit of infinitely many trials that ratio is the probability of the event.

$$P(E) = \lim_{n \to \infty} \frac{n(E)}{n}$$

Here, $n(E)$ represents the number of times the event $E$ occurs, and $n$ is the total number of experiments. As we repeat the experiment more and more, the ratio of these numbers approaches the true probability of the event.
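A small simulation can make this limit feel less abstract. The sketch below assumes a fair six-sided die and the event "the roll is even"; the function name `estimate_probability` and the fixed seed are illustrative choices, not anything prescribed by the definition.

```python
import random


def estimate_probability(num_trials, seed=0):
    """Estimate P(even roll) as n(E) / n over num_trials simulated die rolls."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    event_count = 0            # n(E): number of trials in which the event occurred
    for _ in range(num_trials):
        roll = rng.randint(1, 6)  # assumes a fair die: each face equally likely
        if roll % 2 == 0:
            event_count += 1
    return event_count / num_trials


# The estimate should settle near 0.5 as the number of trials grows.
for n in (10, 1_000, 100_000):
    print(n, estimate_probability(n))
```

With only 10 trials the ratio can be far from 0.5, while with 100,000 trials it should sit very close to it. We can never run infinitely many trials, so a simulation only ever approximates the limit.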

This intuition of probability does not talk about the certainty of an event occurring, but about the uncertainty the universe holds for us. Probability can be thought of as a language for communicating the uncertainty of an event (too philosophical?).

Axioms of Probability

Having understood the basics of probability, we now move into the world of analytical probability. Much of analytical probability builds on the following two axioms and one identity.

Axiom 1: $0 \leq P(E) \leq 1$

Axiom 2: $P(S) = 1$

Identity 3: $P(E^{c}) = 1 - P(E)$
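These three statements can be checked on a concrete case. The sketch below assumes a fair six-sided die, so that probability reduces to counting outcomes; the helper `probability` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # S: one roll of a fair die (assumed)


def probability(event):
    """P(E) for equally likely outcomes: |E| / |S|."""
    return Fraction(len(event & sample_space), len(sample_space))


even = {2, 4, 6}                  # E
complement = sample_space - even  # E^c: every outcome that is not in E

p_even = probability(even)
p_sample_space = probability(sample_space)
p_complement = probability(complement)

print(0 <= p_even <= 1)            # Axiom 1 -> True
print(p_sample_space == 1)         # Axiom 2 -> True
print(p_complement == 1 - p_even)  # Identity 3 -> True
```

Using `Fraction` instead of floats keeps the comparisons exact.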

Equally Likely Events

When we're talking about probability, it's crucial to understand how we define our sample space - that's the set of all possible outcomes. In some cases, like a fair coin toss, all outcomes in the sample space are equally likely. But in many other situations, this isn't the case, and assuming otherwise can lead to incorrect conclusions.

Take a lottery, for example. We might think the sample space consists of just two outcomes: winning or losing. But it's misleading to say the probability of winning is 50%, because not all outcomes are equally likely. Winning a lottery is much less likely than losing.
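To put numbers on this, here is a sketch of a made-up lottery. The ticket count `NUM_TICKETS` and the winning number `WINNING_TICKET` are hypothetical values chosen purely for illustration. The point is that the equally likely outcomes are the individual tickets, not the labels "win" and "lose".

```python
from fractions import Fraction

NUM_TICKETS = 1_000  # hypothetical: size of the lottery
WINNING_TICKET = 42  # hypothetical: the single winning ticket number

# Naive view: two labels, wrongly assumed to be equally likely.
naive_p_win = Fraction(1, 2)

# Better view: every ticket is equally likely to be the one we hold.
tickets = range(1, NUM_TICKETS + 1)
winning_outcomes = [ticket for ticket in tickets if ticket == WINNING_TICKET]
p_win = Fraction(len(winning_outcomes), NUM_TICKETS)

print(naive_p_win, p_win)  # 1/2 1/1000
```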

Now, let's consider a scenario where the outcomes are indeed equally likely: rolling two dice.

The Sample Space of Rolling Two Dice

When we roll two dice, each die has 6 faces, so there are 36 possible combinations (6 faces on the first die times 6 faces on the second die). These combinations are all equally likely. Our sample space, therefore, consists of pairs representing the outcome on each die, like so:

[(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
 (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
 (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
 (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
 (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
 (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)]
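The same list can be generated rather than typed out. This is a minimal sketch using `itertools.product` from the Python standard library; the variable names are illustrative.

```python
from itertools import product

faces = range(1, 7)  # faces 1 through 6

# Ordered pairs (first die, second die). Order matters because the dice are distinct.
sample_space = list(product(faces, repeat=2))

print(len(sample_space))  # 36
print(sample_space[:3])   # [(1, 1), (1, 2), (1, 3)]
```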

Calculating Probability for a Specific Event

Suppose we're interested in the event where the sum of the two dice equals 7. The combinations that satisfy this are:

(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)

There are 6 outcomes that result in a sum of 7. Since there are 36 possible outcomes in total, the probability of this event is 6/36, or 1/6.
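The count above can be reproduced by filtering the sample space, as in this sketch. It assumes two fair, distinct dice, so every ordered pair is equally likely and the probability is simply the size of the event space divided by the size of the sample space.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two distinct dice.
sample_space = list(product(range(1, 7), repeat=2))

# Event space: the pairs whose faces add up to 7.
event_space = [pair for pair in sample_space if sum(pair) == 7]

probability = Fraction(len(event_space), len(sample_space))

print(event_space)  # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(probability)  # 1/6
```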

Common Mistakes in Setting Up the Sample Space

Understanding the structure of the sample space is key to calculating probabilities correctly. Let's look at some common errors:

  1. Considering Only the Sums: If we only look at the possible sums (2 to 12), our sample space would be [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], with the event space being just [7]. Calculating probability this way gives us 1/11, which is incorrect. This error happens because each sum is not equally likely; for example, there are more ways to roll a 7 than a 2.

  2. Assuming the Dice Are Indistinct: If we treat the dice as indistinct, our sample space shrinks to the 21 unordered pairs:

{{1, 1}, {1, 2}, {1, 3}, {1, 4}, {1, 5}, {1, 6},
         {2, 2}, {2, 3}, {2, 4}, {2, 5}, {2, 6},
                 {3, 3}, {3, 4}, {3, 5}, {3, 6},
                         {4, 4}, {4, 5}, {4, 6},
                                 {5, 5}, {5, 6},
                                         {6, 6}}

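In this case, the event space is {1, 6}, {2, 5}, {3, 4}. The probability then appears to be 3/21, or 1/7, which is wrong. The mistake is treating these 21 unordered pairs as equally likely. A pair of different faces such as {1, 2} can occur in two ways, (1, 2) and (2, 1), while a double such as {1, 1} can occur in only one, so the unordered pairs do not all carry the same probability.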

  3. Correct Approach - Distinct Dice: The accurate way is to consider each die as distinct, leading to the correct sample space and probability, as we initially discussed.
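The sketch below shows why the first two setups fail, by counting how many ordered outcomes sit behind each sum and each unordered pair. It assumes two fair dice; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 36 equally likely ordered outcomes for two distinct dice.
ordered_pairs = list(product(range(1, 7), repeat=2))

# Mistake 1: sums are not equally likely.
ways_per_sum = Counter(a + b for a, b in ordered_pairs)
print(ways_per_sum[2], ways_per_sum[7])  # 1 6

# Mistake 2: unordered pairs are not equally likely either.
ways_per_unordered = Counter(tuple(sorted(pair)) for pair in ordered_pairs)
print(len(ways_per_unordered))                                 # 21
print(ways_per_unordered[(1, 1)], ways_per_unordered[(1, 2)])  # 1 2

# Weighting each unordered pair by its number of ordered outcomes recovers 1/6.
ways_to_seven = sum(
    ways for pair, ways in ways_per_unordered.items() if sum(pair) == 7
)
p_seven = Fraction(ways_to_seven, len(ordered_pairs))
print(p_seven)  # 1/6
```

The smaller sample spaces are not useless, but they only give correct answers once each outcome is weighted by how many equally likely ordered outcomes it represents.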

Key Takeaway

In probability, understanding and correctly defining the sample space is crucial. When outcomes are not equally likely, as in a lottery, assumptions about the sample space can lead to incorrect probabilities. In contrast, for two distinct dice, each outcome is equally likely, allowing for a straightforward calculation.

Acknowledgement

I would like to thank Sohini Sengupta for an initial review of the blog post.