International College of Digital Innovation, CMU
June 7, 2025
The axioms of probability are three fundamental rules that define how probabilities behave.
They were proposed by Andrey Kolmogorov in 1933 and form the foundation of modern probability theory.
Kolmogorov’s Axioms:
Let \(S\) be a sample space and let \(A\) be any event that is a subset of \(S\). The three axioms are as follows:
Sample space \(S\): all 52 unique cards
\[ S = \{\text{Ace of Hearts}, \text{2 of Hearts}, \ldots, \text{King of Spades}\} \]
Event \(A\): drawing a red card
\[ A = \{\text{all Hearts and Diamonds (26 cards)}\} \]
Axiom 1 Probability must be non-negative
\[ P(A) \geq 0 \]
For every event \(A\), the probability is never negative: it is either positive or zero.
Axiom 2 The probability of the sample space is 1
\[ P(S) = 1 \]
This means the probability of the event that covers all possible outcomes must equal 1.
Axiom 3 Additivity for mutually exclusive events
If \(A\) and \(B\) are mutually exclusive events, meaning they have no outcomes in common (i.e., \(A \cap B = \emptyset\)), then
\[
P(A \cup B) = P(A) + P(B)
\]
This means if two events cannot occur at the same time, the probability of either one occurring is the sum of their individual probabilities.
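The three axioms can be checked on the card example with a short Python sketch. It assumes every card is equally likely to be drawn, so that \(P(A)\) is the number of cards in \(A\) divided by 52; the event `B` (drawing a spade) and the helper `prob` are illustrative additions, not part of the original example.

```python
from fractions import Fraction
from itertools import product

RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King"]
SUITS = ["Hearts", "Diamonds", "Clubs", "Spades"]

# Sample space S: all 52 unique cards.
S = {f"{rank} of {suit}" for rank, suit in product(RANKS, SUITS)}

def prob(event):
    """P(event) under the assumption that all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))

# Event A: drawing a red card (all Hearts and Diamonds).
A = {card for card in S if card.endswith(("Hearts", "Diamonds"))}
# Event B (illustrative assumption): drawing a Spade, which cannot overlap with A.
B = {card for card in S if card.endswith("Spades")}

axiom_1 = prob(A) >= 0                          # non-negativity
axiom_2 = prob(S) == 1                          # the sample space has probability 1
axiom_3 = prob(A | B) == prob(A) + prob(B)      # additivity, since A and B are disjoint

print(prob(A), prob(B), prob(A | B))
print(axiom_1, axiom_2, axiom_3)
```

Using `Fraction` keeps the arithmetic exact, so the equality checks are not affected by floating-point rounding.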
From the three axioms, we can infer other important properties. For example:
The probability of an impossible event is zero
\[P(\emptyset) = 0\]
An impossible event has a probability of zero because it cannot occur.
Complement Rule
\[P(A^c) = 1 - P(A)\]
This means that if the probability of event \(A\) is \(P(A)\), then the probability of the event "not \(A\)" is \(1 - P(A)\).
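Both properties follow from the axioms in a few steps. One possible derivation, using only Axioms 2 and 3 and the fact that \(A\) and \(A^c\) are mutually exclusive with \(A \cup A^c = S\), is:
\[ \begin{aligned} S &= A \cup A^c, \qquad A \cap A^c = \emptyset \\ 1 = P(S) &= P(A) + P(A^c) \quad \Rightarrow \quad P(A^c) = 1 - P(A) \\ P(\emptyset) &= P(S^c) = 1 - P(S) = 0 \end{aligned} \]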
General Addition Rule of Probability
For any events \(A\) and \(B\):
\[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]
This formula applies even when the events may overlap.
Rolling a single die
Let \(S = \{1, 2, 3, 4, 5, 6\}\)
Using the addition rule:
\[
P(A \cup B) = P(A) + P(B) - P(A \cap B)
\]
\[ = \frac{1}{2} + \frac{1}{3} - \frac{1}{6} = \frac{2}{3} \approx 0.667 \]
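The calculation can be reproduced in Python. The events themselves are not stated in the text, so the sketch below assumes one pair that matches the three probabilities used above (one half, one third, one sixth): `A` is "an even number" and `B` is "a number greater than 4". It also assumes a fair die; `prob` is a hypothetical helper.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) for a fair die: favourable outcomes divided by all outcomes."""
    return Fraction(len(event & S), len(S))

# Assumed events (not given in the text), chosen to match the numbers above.
A = {2, 4, 6}   # even number, P(A) = 1/2
B = {5, 6}      # number greater than 4, P(B) = 1/3

direct = prob(A | B)                               # count the union directly
by_rule = prob(A) + prob(B) - prob(A & B)          # general addition rule

print(direct, by_rule)
```

Subtracting \(P(A \cap B)\) removes the outcome 6, which would otherwise be counted twice.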
A random variable is a variable that represents the outcome of a random experiment.
Its value is determined by chance or probability.
Random variables are commonly used in statistics and probability theory to describe probability distributions of data.
There are two main types of random variables:
Discrete Random Variable
Continuous Random Variable
A discrete random variable takes on a countable number of possible values
It is commonly used when outcomes can be counted, such as the number rolled on a die or the number of correct answers on a test
Examples
Rolling a die: Let \(X\) be the value shown on the die → \(X \in \{1, 2, 3, 4, 5, 6\}\)
Flipping a coin: Let \(Y\) be the number of heads when flipping a coin 3 times → \(Y \in \{0, 1, 2, 3\}\)
Number of customers per day: Let \(N\) be the number of customers arriving each day → \(N \in \{0, 1, 2, 3, 4, \ldots\}\)
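To see how a discrete random variable gets its probabilities, the coin example can be enumerated in Python. This sketch assumes a fair coin, so all eight sequences of three flips are equally likely; the name `pmf_Y` is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely sequences of three flips, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Y maps each sequence to its number of heads; count how often each value occurs.
counts = Counter(sequence.count("H") for sequence in outcomes)

# Probability of each value of Y under the fair-coin assumption.
pmf_Y = {y: Fraction(n, len(outcomes)) for y, n in sorted(counts.items())}

for y, p in pmf_Y.items():
    print(f"P(Y = {y}) = {p}")
```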
A continuous random variable can take on any value within a range of real numbers
It is used for measurable quantities such as weight, height, or time
Examples
Customer service time: Variable \(T\) may take values between 0 and 10 minutes
Temperature in a city: Variable \(Z\) may range from 25°C to 35°C
Investment return rate: Variable \(r \in (-100\%, \infty)\)
When the probabilities of a random variable can be written as a specific function, that function is called a probability distribution.
A probability distribution describes how likely each value (or range of values) of a random variable is.
Properties of a Discrete Probability Distribution
Let \(X\) be a random variable and \(P(X)\) be the probability of each possible value of \(X\). It must satisfy the following conditions:
\(0 \leq P(X) \leq 1\) for all values of \(X\)
\(\sum P(X) = 1\) (The total probability must sum to 1)
Rolling a Die
Let the random variable \(X\) represent the number shown on a single six-sided die (\(X = 1, 2, 3, 4, 5, 6\))
\[ P(X) = \begin{cases} \frac{1}{6}, & X = 1, 2, 3, 4, 5, 6 \\ 0, & \text{otherwise} \end{cases} \]
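The two conditions for a discrete distribution can be checked against this die example in Python. The function name `pmf_die` is a hypothetical choice for this sketch.

```python
from fractions import Fraction

def pmf_die(x):
    """P(X = x) for a single fair six-sided die; zero outside the six faces."""
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

faces = range(1, 7)

# Condition 1: every probability lies between 0 and 1.
all_valid = all(0 <= pmf_die(x) <= 1 for x in faces)

# Condition 2: the probabilities sum to 1.
total = sum(pmf_die(x) for x in faces)

print(all_valid, total, pmf_die(7))
```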
Important Discrete Probability Distributions
Bernoulli Distribution:
Used for events with only two possible outcomes, such as success/failure
Binomial Distribution:
Models multiple independent trials where each trial has two outcomes
Poisson Distribution:
Used to model the number of events occurring within a fixed interval of time or space
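The text does not give formulas for these three distributions, so the sketch below uses their standard textbook probability mass functions. The parameter names (`p` for the success probability, `n` for the number of trials, `lam` for the average number of events per interval) are assumptions introduced for illustration.

```python
from math import comb, exp, factorial

def bernoulli_pmf(k, p):
    """P(X = k) for one trial: k = 1 is success (probability p), k = 0 is failure."""
    if k not in (0, 1):
        return 0.0
    return p if k == 1 else 1.0 - p

def binomial_pmf(k, n, p):
    """P(X = k): exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k): exactly k events in an interval with average rate lam."""
    if k < 0:
        return 0.0
    return exp(-lam) * lam**k / factorial(k)

# Example: probability of exactly 2 heads in 3 fair coin flips.
print(binomial_pmf(2, 3, 0.5))
# Example: probability of no events when 2 are expected on average.
print(poisson_pmf(0, 2.0))
```

A binomial distribution with `n = 1` reduces to the Bernoulli distribution.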
Properties of a Continuous Probability Distribution
Let \(f(x)\) be a Probability Density Function (PDF). It must satisfy the following conditions:
\(f(x) \geq 0\) for all \(x\)
\(\int_{-\infty}^{\infty} f(x) \, dx = 1\)
The probability that \(X\) falls within the interval \(a \leq X \leq b\) is given by:
\[ \begin{aligned} P(a < X < b) &= P(a \leq X < b) \\ &= P(a < X \leq b) \\ &= P(a \leq X \leq b) = \int_{a}^{b} f(x) \, dx \end{aligned} \]
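As a numerical illustration of these conditions, the sketch below integrates a simple density with the midpoint rule. The density \(f(x) = 2x\) on \([0, 1]\) is a hypothetical example chosen for this sketch, as is the helper `integrate`.

```python
def pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The density is zero outside [0, 1], so integrating over [0, 1] covers the whole real line.
total = integrate(pdf, 0.0, 1.0)

# P(0.2 <= X <= 0.5) is the area under the density between 0.2 and 0.5.
p_interval = integrate(pdf, 0.2, 0.5)

print(total, p_interval)
```

Because a single point has zero area under the curve, including or excluding the endpoints \(a\) and \(b\) does not change the result.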
Normal Distribution
\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad \mu \in \mathbb{R},\ \sigma^2 > 0,\ x \in \mathbb{R} \]
\(\mu\) is the mean
\(\sigma^2\) is the variance
The curve is bell-shaped and symmetric about \(\mu\)
import { Inputs, Plot } from "@observablehq/inputs"
// Input for mean (mu)
viewof mu = Inputs.range([-5, 5], { step: 0.1, value: 0, label: "Mean (μ)" })
// Input for standard deviation (sigma)
viewof sigma = Inputs.range([0.1, 10], { step: 0.1, value: 1, label: "Standard Deviation (σ)" })
// Input for a1 and a2
viewof a1 = Inputs.number({ label: "a₁", value: 0 })
viewof a2 = Inputs.number({ label: "a₂: (a₂ > a₁)", value: 1 })
// Selection of probability type
viewof probType = Inputs.select(["P(x < a₁)", "P(x > a₁)", "P(a₁ < x < a₂)"], { label: "Choose Probability Type" })
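The three probability types offered by the selector can also be computed outside the interactive page. The following Python sketch uses `NormalDist` from the standard library; the function `normal_probability` and its plain-ASCII type labels are illustrative assumptions, with defaults that mirror the initial input values above.

```python
from statistics import NormalDist

def normal_probability(prob_type, mu=0.0, sigma=1.0, a1=0.0, a2=1.0):
    """Return the chosen probability for X ~ Normal(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    if prob_type == "P(x < a1)":
        return dist.cdf(a1)
    if prob_type == "P(x > a1)":
        return 1.0 - dist.cdf(a1)
    if prob_type == "P(a1 < x < a2)":
        if a2 <= a1:
            raise ValueError("a2 must be greater than a1")
        return dist.cdf(a2) - dist.cdf(a1)
    raise ValueError(f"unknown probability type: {prob_type}")

print(normal_probability("P(x < a1)"))
print(normal_probability("P(x > a1)", mu=1.0, sigma=2.0, a1=3.0))
print(normal_probability("P(a1 < x < a2)"))
```

Each case is a difference of values of the cumulative distribution function, which is the integral of the density \(f(x)\) up to the given point.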
Important Continuous Probability Distributions
Normal Distribution:
Commonly used in statistics
Uniform Distribution:
All values within a given interval have equal probability
Exponential Distribution:
Often used for modeling waiting times
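For the uniform and exponential distributions, interval probabilities have simple closed forms. The sketch below uses their standard cumulative distribution functions; the parameters (`a`, `b` for the interval ends, `rate` for the average number of events per unit time) and the numbers in the examples are assumptions made for illustration.

```python
from math import exp

def uniform_cdf(x, a, b):
    """P(X <= x) when X is uniform on the interval [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def exponential_cdf(x, rate):
    """P(X <= x) when X is exponential with the given rate (mean 1 / rate)."""
    return 1.0 - exp(-rate * x) if x > 0 else 0.0

# Assumed example: service time uniform between 0 and 10 minutes.
p_between_2_and_5 = uniform_cdf(5, 0, 10) - uniform_cdf(2, 0, 10)

# Assumed example: waiting time with a mean of 2 minutes (rate 0.5 per minute).
p_wait_over_2 = 1.0 - exponential_cdf(2, 0.5)

print(p_between_2_and_5, p_wait_over_2)
```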
Statistics is the science of collecting, analyzing, interpreting, and presenting data to support decision-making or to better understand phenomena.
There are two main branches of statistics:
1. Descriptive Statistics
Used to summarize and describe data, such as:
Mean
Median
Standard Deviation
Frequency Table
Various types of charts and graphs (covered in the previous chapter)
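The summaries listed above can be computed with Python's standard library. The data set `scores` is hypothetical and serves only to illustrate the calls.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical data: test scores of eight students.
scores = [7, 8, 5, 9, 8, 6, 8, 10]

average = mean(scores)          # arithmetic mean
middle = median(scores)         # middle value of the sorted data
spread = stdev(scores)          # sample standard deviation (divides by n - 1)
frequency = Counter(scores)     # frequency table: value -> number of occurrences

print(average, middle, round(spread, 3))
for value, count in sorted(frequency.items()):
    print(value, count)
```

`stdev` treats the data as a sample; `statistics.pstdev` would be the choice if the eight scores were the whole population.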
2. Inferential Statistics
Used to analyze data in order to draw conclusions or make predictions about a population, based on a sample.
Hypothesis Testing
Parameter Estimation
Regression Analysis
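As one small instance of parameter estimation, the sketch below computes an approximate 95% confidence interval for a population mean from a sample. It relies on a normal approximation, which is an assumption; for a sample this small, a t-distribution quantile is usually preferred. The data and the function name `mean_confidence_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(data)
    centre = mean(data)
    standard_error = stdev(data) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical sample: ten measurements of a product's weight in grams.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]

low, high = mean_confidence_interval(sample)
print(f"sample mean = {mean(sample):.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The interval describes the population mean, not individual observations: a larger sample shrinks the standard error and therefore the interval.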
Applications of Statistics
1. Business and Marketing
Analyze market trends and customer behavior
Forecast product sales using Time Series Analysis
Use A/B Testing to compare the effectiveness of advertisements or marketing campaigns
2. Economics and Finance
Analyze economic conditions, such as calculating inflation and unemployment rates
Assess risk and return in investment portfolios (Portfolio Analysis)
Use econometric models to study factors influencing the economy
3. Science and Engineering
Design experiments (Design of Experiments) to develop new products
Analyze data from experiments in physics, chemistry, and biology
Perform quality control using Statistical Quality Control (SQC)
4. Medicine and Public Health
Analyze the effects of drugs or vaccines using Biostatistics
Study disease risks through Epidemiological data analysis
Use Machine Learning and AI to analyze medical records and assist in diagnosis
5. Data Science and Artificial Intelligence (AI)
Analyze Big Data to gain insights for data-driven decision making
Use Machine Learning techniques to develop predictive models
Perform Text Mining and analyze social media data
6. Education and Research
Analyze students’ academic performance and evaluate the effectiveness of curricula
Use statistics to design research studies that yield reliable conclusions
Analyze experimental data to test scientific hypotheses