Introduction
In the previous post, we learned about Kolmogorov's axioms — the three foundational rules of probability. But how do we actually use these axioms?
That's where properties of probability come in. These are rules and formulas derived directly from the axioms. They tell us:
- How to handle opposite events
- How to combine probabilities of different events
- How to reason about conditional probabilities
- When and how to multiply probabilities
In this post, we'll explore the major properties of probability, prove them from the axioms, and see how to apply them in practice.
Property 1: Probability of the Complement
The complement of event A (denoted A' or A^c) is the event that A does NOT occur.
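Complement Rule
P(A') = 1 - P(A)
This follows because A and A' are mutually exclusive and together cover the whole sample space, so P(A) + P(A') = 1.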
Example
P(no rain) = 1 - P(rain) = 1 - 0.3 = 0.7
Property 2: Probability of the Impossible Event
Impossible Event
The probability of an impossible event (the empty set ∅) is zero:
P(∅) = 0
Example
P(rolling a 7) = 0 (impossible on a 6-sided die)
Property 3: Bounded Probability
Probability Bounds
Probabilities are always between 0 and 1 (inclusive). For every event A:
0 ≤ P(A) ≤ 1
Property 4: Monotonicity
Monotonicity
If event A is a subset of event B (all outcomes in A are in B), denoted A ⊆ B, then:
P(A) ≤ P(B)
Example
A = Drawing a heart
B = Drawing a red card
Since all hearts are red: A ⊆ B
Therefore: P(heart) ≤ P(red)
Verification: P(heart) = 13/52, P(red) = 26/52
13/52 ≤ 26/52 ✓
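Properties 1 through 4 can be checked by brute-force counting on a small sample space. The sketch below is only an illustration, assuming a standard 52-card deck in which every card is equally likely; the helper name prob and the card encoding are made up for this example.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

# Sample space: all 52 (rank, suit) pairs, assumed equally likely.
DECK = set(product(RANKS, SUITS))


def prob(event):
    """P(event) for an event given as a subset of DECK."""
    return Fraction(len(event), len(DECK))


hearts = {card for card in DECK if card[1] == "hearts"}
red = {card for card in DECK if card[1] in ("hearts", "diamonds")}

p_hearts = prob(hearts)
p_not_hearts = prob(DECK - hearts)  # complement of "heart"
p_empty = prob(set())               # impossible event
p_red = prob(red)

print(p_not_hearts == 1 - p_hearts)           # complement rule
print(p_empty == 0)                           # impossible event
print(all(0 <= p <= 1 for p in (p_hearts, p_red)))  # bounds
print(hearts <= red and p_hearts <= p_red)    # monotonicity
```

Using Fraction instead of floats keeps every comparison exact, so each check prints True without any rounding tolerance.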
Property 5: Addition Rule (General Form)
General Addition Rule
For any two events A and B:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Why the "-P(A ∩ B)"?
When we add P(A) + P(B), we count the overlap (A ∩ B) twice. So we subtract it once to correct for this double-counting.
Special Case: Mutually Exclusive Events
When A and B cannot both occur (mutually exclusive), P(A ∩ B) = 0, so:
P(A ∪ B) = P(A) + P(B)
Example
A = Drawing a King (4 cards)
B = Drawing a Heart (13 cards)
A ∩ B = Drawing a King of Hearts (1 card)
P(King ∪ Heart) = P(King) + P(Heart) - P(King of Hearts)
= 4/52 + 13/52 - 1/52
= 16/52
≈ 30.77%
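The King-or-Heart calculation can be double-checked by counting cards directly. This is a rough sketch under the assumption of a uniform 52-card deck; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = set(product(RANKS, SUITS))  # 52 equally likely cards (assumption)


def prob(event):
    """P(event) for an event given as a subset of DECK."""
    return Fraction(len(event), len(DECK))


kings = {card for card in DECK if card[0] == "K"}
hearts = {card for card in DECK if card[1] == "hearts"}

# Left side: count the union directly.
direct = prob(kings | hearts)
# Right side: add the two probabilities, then remove the double-counted overlap.
by_rule = prob(kings) + prob(hearts) - prob(kings & hearts)

print(direct, by_rule)  # 4/13 4/13 (that is, 16/52)
```

Dropping the subtraction gives 17/52, because the King of Hearts would be counted once as a King and once as a Heart.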
Property 6: Inclusion-Exclusion Principle
Three Events
For three events A, B, and C:
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(A ∩ C) - P(B ∩ C) + P(A ∩ B ∩ C)
Intuition
We add all individual probabilities, subtract pairwise overlaps (to avoid double-counting), then add back the triple overlap (which we subtracted too many times).
Example
Working with head counts in a group of 100 people:
|C ∪ T ∪ J| = 50 + 40 + 30 - 20 - 15 - 10 + 5 = 80
So 80 out of 100 people drink at least one beverage, and P(C ∪ T ∪ J) = 80/100 = 0.8.
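The same add, subtract, add-back pattern extends to any number of events. The example above gives only totals, so the sketch below uses a different, hypothetical sample space (the integers 1 to 30, with events "divisible by 2", "divisible by 3", and "divisible by 5") and compares the formula with a direct count. The function name union_prob is an assumption for illustration.

```python
from fractions import Fraction
from itertools import combinations


def union_prob(events, space):
    """P(union of events) by inclusion-exclusion on a finite uniform space."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        # Odd-sized overlaps are added, even-sized overlaps are subtracted.
        sign = 1 if k % 2 == 1 else -1
        for group in combinations(events, k):
            overlap = set.intersection(*group)
            total += sign * Fraction(len(overlap), len(space))
    return total


# Hypothetical example: pick an integer from 1..30 uniformly at random.
space = set(range(1, 31))
a = {n for n in space if n % 2 == 0}
b = {n for n in space if n % 3 == 0}
c = {n for n in space if n % 5 == 0}

via_formula = union_prob([a, b, c], space)
direct = Fraction(len(a | b | c), len(space))

print(via_formula, direct)  # 11/15 11/15
```

Written out, the count is 15 + 10 + 6 - 5 - 3 - 2 + 1 = 22 of the 30 integers.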
Property 7: Conditional Probability
Conditional Probability Definition
The probability of A given that B has occurred (defined when P(B) > 0):
P(A|B) = P(A ∩ B) / P(B)
Interpretation
We're restricting the sample space to only outcomes where B occurred, then asking: what fraction of those also satisfy A?
Example
Draw two cards from a standard deck without replacement:
P(2nd Ace | 1st Ace) = (3 remaining aces) / (51 remaining cards)
= 3/51 ≈ 5.88%
Without knowing the first card was an Ace, it would be 4/52 ≈ 7.69%. The information changed the probability!
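The "restrict the sample space" reading of conditional probability can be made concrete by listing every ordered two-card draw. This is a minimal sketch, assuming a standard deck and draws without replacement; it applies the ratio P(A ∩ B) / P(B) rather than the shortcut of counting remaining cards.

```python
from fractions import Fraction
from itertools import permutations, product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(RANKS, SUITS))

# All ordered (first card, second card) draws without replacement: 52 * 51.
draws = list(permutations(deck, 2))

first_ace = [d for d in draws if d[0][0] == "A"]       # event B
both_aces = [d for d in first_ace if d[1][0] == "A"]   # event A ∩ B
second_ace = [d for d in draws if d[1][0] == "A"]      # event A

p_b = Fraction(len(first_ace), len(draws))
p_a_and_b = Fraction(len(both_aces), len(draws))

p_a_given_b = p_a_and_b / p_b
p_a = Fraction(len(second_ace), len(draws))

print(p_a_given_b)  # 1/17, which equals 3/51
print(p_a)          # 1/13, which equals 4/52
```

The two printed values differ, which is exactly the effect of learning that the first card was an Ace.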
Property 8: Multiplication Rule
Joint Probability
The probability of both A and B occurring:
P(A ∩ B) = P(A) × P(B|A)
Or equivalently: P(A ∩ B) = P(B) × P(A|B)
Special Case: Independent Events
When A and B are independent (one doesn't affect the other), P(B|A) = P(B), so:
P(A ∩ B) = P(A) × P(B)
Example
P(heads on flip 1) = 1/2
P(heads on flip 2 | heads on flip 1) = 1/2 (independent!)
P(two heads) = 1/2 × 1/2 = 1/4 = 25%
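To contrast the dependent and independent cases, the sketch below applies the multiplication rule to two Aces drawn without replacement (using the 4/52 and 3/51 values from the conditional probability example) and to the two coin flips. The variable names are illustrative.

```python
from fractions import Fraction

# Dependent case: the second draw's probability depends on the first draw.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

# Independent case: the conditional probability equals the plain probability.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads

print(p_two_aces)   # 1/221
print(p_two_heads)  # 1/4
```

Multiplying 4/52 by 4/52 instead would give 1/169, which is the answer for drawing with replacement, not for two cards dealt from the same deck.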
Property 9: Law of Total Probability
Total Probability
If B₁, B₂, ..., Bₙ partition the sample space (mutually exclusive and exhaustive), then:
P(A) = P(A|B₁) × P(B₁) + P(A|B₂) × P(B₂) + ... + P(A|Bₙ) × P(Bₙ) = Σ P(A|Bᵢ) × P(Bᵢ)
What It Means
To find P(A), we consider all ways A can happen through each Bᵢ, weight by P(Bᵢ), and sum.
Example
Given P(disease) = 0.01, P(positive|disease) = 0.95, and P(positive|no disease) = 0.05:
P(positive) = P(positive|disease) × P(disease)
+ P(positive|no disease) × P(no disease)
= 0.95 × 0.01 + 0.05 × 0.99
= 0.0095 + 0.0495
= 0.059 = 5.9%
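The weighted sum can be written as a small helper. The sketch below is one possible way to do it, not a standard library function: total_probability is a hypothetical name, and the inputs are the 0.95, 0.05, 0.01, and 0.99 values from the example, stored as exact fractions.

```python
from fractions import Fraction


def total_probability(conditionals, priors):
    """Return the sum of P(A|B_i) * P(B_i) over a partition B_1..B_n."""
    if sum(priors) != 1:
        # A partition must be exhaustive, so its probabilities sum to 1.
        raise ValueError("priors must sum to 1 for a partition")
    return sum(c * p for c, p in zip(conditionals, priors))


p_positive = total_probability(
    conditionals=[Fraction(95, 100), Fraction(5, 100)],  # P(positive | B_i)
    priors=[Fraction(1, 100), Fraction(99, 100)],        # P(disease), P(no disease)
)

print(p_positive, float(p_positive))  # 59/1000 0.059
```

The check on the priors guards against the most common misuse of the rule: summing over cases that do not cover the whole sample space.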
Property 10: Bayes' Theorem
Bayes' Theorem
P(A|B) = P(B|A) × P(A) / P(B), provided P(B) > 0
Why It Matters
Bayes' Theorem lets us reverse conditional probabilities. If we know P(B|A), together with P(A) and P(B), we can find P(A|B). This is foundational for machine learning, medical diagnosis, and Bayesian inference.
Example
P(disease|positive) = P(positive|disease) × P(disease) / P(positive)
= 0.95 × 0.01 / 0.059
≈ 16.1%
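Only about a 16.1% chance! Expecting a much higher number is the base rate fallacy: even with a positive test,
it's unlikely the person has the disease when the disease is rare, because false positives among the many healthy people outnumber true positives among the few sick ones.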
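To see how strongly the result depends on the base rate, the calculation can be wrapped in a function and rerun with a different prior. This is a sketch under stated assumptions: the function name posterior and its parameter names are invented here, and the 10% prevalence in the second call is a hypothetical value, not one from the example.

```python
from fractions import Fraction


def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """P(disease | positive) via Bayes' theorem."""
    # Denominator: P(positive) from the law of total probability.
    evidence = p_pos_given_disease * prior + p_pos_given_healthy * (1 - prior)
    return p_pos_given_disease * prior / evidence


# Values from the example: 1% prevalence, 95% and 5% positive rates.
rare = posterior(Fraction(1, 100), Fraction(95, 100), Fraction(5, 100))

# Hypothetical comparison: same test, but 10% prevalence.
common = posterior(Fraction(10, 100), Fraction(95, 100), Fraction(5, 100))

print(rare, round(float(rare), 3))      # 19/118 0.161
print(common, round(float(common), 3))  # 19/28 0.679
```

The test is identical in both calls; only the prior changes, and the posterior moves from about 16% to about 68%.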
Property 11: Independence
Independence Definition
Events A and B are independent if (assuming P(B) > 0):
P(A|B) = P(A)
Equivalently: P(A ∩ B) = P(A) × P(B)
What It Means
Knowing that B occurred doesn't change the probability of A. The events don't influence each other.
Example
P(second roll is 3 | first roll is 3) = P(second roll is 3) = 1/6
The first roll doesn't affect the second.
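Independence can be tested mechanically by comparing P(A ∩ B) with P(A) × P(B). The sketch below assumes two fair six-sided dice and enumerates all 36 outcomes. The event "the sum is 4" is a hypothetical addition, included only to show a pair of events that fails the test.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) outcomes.
rolls = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event), where event is a predicate on a (first, second) roll."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))


def independent(a, b):
    """True if P(A ∩ B) equals P(A) * P(B) exactly."""
    return prob(lambda r: a(r) and b(r)) == prob(a) * prob(b)


def first_is_3(r):
    return r[0] == 3


def second_is_3(r):
    return r[1] == 3


def sum_is_4(r):
    return r[0] + r[1] == 4


print(independent(first_is_3, second_is_3))  # True
print(independent(first_is_3, sum_is_4))     # False
```

The second pair fails because a first roll of 3 makes a sum of 4 more likely (1/6 instead of 1/12), so knowing one event changes the probability of the other.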
Quick Reference: Summary of Properties
- Complement: P(A') = 1 - P(A)
- Impossible: P(∅) = 0
- Bounds: 0 ≤ P(A) ≤ 1
- Monotonicity: A ⊆ B → P(A) ≤ P(B)
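- Addition: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
- Inclusion-Exclusion: P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(A ∩ C) - P(B ∩ C) + P(A ∩ B ∩ C)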
- Conditional: P(A|B) = P(A ∩ B) / P(B)
- Multiplication: P(A ∩ B) = P(A) × P(B|A)
- Total Probability: P(A) = Σ P(A|Bᵢ) × P(Bᵢ)
- Bayes: P(A|B) = P(B|A) × P(A) / P(B)
- Independence: P(A ∩ B) = P(A) × P(B)
How These All Connect
These properties form a unified system derived from the three axioms:
- Properties 1-4 establish the basic structure (bounds, complements, ordering)
- Properties 5-6 tell us how to combine events with "or" (unions), whether or not the events overlap
- Properties 7-10 handle dependence and conditional information
- Property 11 identifies when events are independent
Conclusion
The properties of probability aren't arbitrary rules — they're logical consequences of Kolmogorov's axioms. Understanding where they come from gives you confidence in using them and helps you remember why they work the way they do.
These properties are the tools you'll use constantly in statistics, machine learning, and data science. Master them, and you'll understand the foundation of probabilistic reasoning.