Probability and Statistics

Somsak Chanaim

International College of Digital Innovation, CMU

August 17, 2025

Axiom of Probability

Axioms of Probability are three fundamental rules that define the properties of probability.

They were proposed by Andrey Kolmogorov in 1933 and form the foundation of modern probability theory.

Kolmogorov’s Axioms:


Andrey N. Kolmogorov

Let \(S\) be a sample space and let \(A\) be any event that is a subset of \(S\). The three axioms are as follows:

Example 1: Tossing a fair coin

Sample space \(S\):
All possible outcomes \[ S = \{\text{Heads}, \text{Tails}\} \] Event \(A\): Getting a Head \[ A = \{\text{Heads}\} \]

Example 2: Rolling a six-sided die

Sample space \(S\): \[ S = \{1, 2, 3, 4, 5, 6\} \] Event \(A\): Rolling an even number \[ A = \{2, 4, 6\} \]

Example 3: Drawing a card from a standard 52-card deck


Sample space \(S\): All 52 unique cards \[ S = \{\text{Ace of Hearts}, \text{2 of Hearts},\\ \ldots, \text{King of Spades}\} \] Event \(A\): Drawing a red card

\[A = \{\text{all Hearts and Diamonds}\\ \text{(26 cards)}\}\]

Axiom 1 Probability must be non-negative

\[P(A) \geq 0\]

For every event \(A\), the probability must be non-negative: zero or positive, never negative.

Axiom 2 The probability of the sample space is 1

\[P(S) = 1\]

This means the probability of the event that covers all possible outcomes must equal 1.

Axiom 3 Additivity for mutually exclusive events

If \(A\) and \(B\) are mutually exclusive events, meaning they have no outcomes in common (i.e., \(A \cap B = \emptyset\)), then
\[ P(A \cup B) = P(A) + P(B) \]
This means if two events cannot occur at the same time, the probability of either one occurring is the sum of their individual probabilities.

Results Derived from the Axioms of Probability

From the three axioms, we can infer other important properties. For example:

The probability of an impossible event is zero

\[P(\emptyset) = 0\]

An impossible event has a probability of zero because it cannot occur.

Complement Rule

\[P(A^c) = 1 - P(A)\]

This means that if the probability of event \(A\) is \(P(A)\), then the probability of the complement event (not \(A\)) is \(1 - P(A)\).

General Addition Rule of Probability

For any events \(A\) and \(B\):

\[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]

This formula applies even when the events may overlap.

Example Applying the Axioms of Probability

Rolling a single die

Let \(S = \{1, 2, 3, 4, 5, 6\}\)

  • Event \(A\): rolling an odd number → \(A = \{1, 3, 5\}\)
  • Event \(B\): rolling a number greater than 4 → \(B = \{5, 6\}\)
  • \(P(A) = \frac{3}{6} = 0.5\), \(P(B) = \frac{2}{6} \approx 0.333\),
    \(P(A \cap B) = \frac{1}{6} \approx 0.167\)

Using the addition rule: \[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]

\[= \frac{3}{6} + \frac{2}{6} - \frac{1}{6} = \frac{4}{6} \approx 0.667\]
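The same computation can be cross-checked by counting outcomes with exact fractions; the short Python sketch below is illustrative only and not part of the course material:

```python
from fractions import Fraction

# Die-roll events from the example above:
# A = odd numbers, B = numbers greater than 4
S = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}
B = {5, 6}

def prob(event, space):
    """Probability under equally likely outcomes: |event| / |space|."""
    return Fraction(len(event), len(space))

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
p_union = prob(A, S) + prob(B, S) - prob(A & B, S)
print(p_union)                    # 2/3
print(p_union == prob(A | B, S))  # True: matches direct counting of A ∪ B
```

Working in fractions avoids the small discrepancies that appear when each probability is rounded to three decimals before adding.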

Random Variable

A random variable is a variable that represents the outcome of a random experiment.
Its value is determined by chance or probability.

Random variables are commonly used in statistics and probability theory to describe probability distributions of data.

There are two main types of random variables:

  1. Discrete Random Variable

  2. Continuous Random Variable

1. Discrete Random Variable

  • Takes on a countable number of possible values

  • Commonly used in events where outcomes can be counted, such as the number rolled on a die or the number of correct answers on a test

Examples

  • Rolling a die: Let \(X\) be the value shown on the die → \(X \in \{1, 2, 3, 4, 5, 6\}\)

  • Flipping a coin: Let \(Y\) be the number of heads when flipping a coin 3 times → \(Y \in \{0, 1, 2, 3\}\)

  • Number of customers per day: Let \(X\) be the number of customers arriving each day → \(X \in \{0, 1, 2, 3, \ldots\}\)

2. Continuous Random Variable

  • A variable that can take on any value within a range of real numbers

  • Used for measurable quantities such as weight, height, or time

Examples

  • Customer service time: Variable \(T\) may take values between 0 and 10 minutes

  • Temperature in a city: Variable \(Z\) may range from 25°C to 35°C

  • Investment return rate: Variable \(r \in (-100\%, \infty)\)

When we can define a specific functional form for the distribution, it is called a Probability Distribution

Probability Distribution

A probability distribution describes how often each value of a random variable is expected to occur or its likelihood.

Properties of a Discrete Probability Distribution

Let \(X\) be a random variable and \(P(X)\) be the probability of each possible value of \(X\). It must satisfy the following conditions:

  1. \(0 \leq P(X) \leq 1\) for all values of \(X\)

  2. \(\sum P(X) = 1\) (The total probability must sum to 1)

Example

Rolling a Die

Let the random variable \(X\) represent the number shown on a single six-sided die (\(X = 1, 2, 3, 4, 5, 6\))

\[P(X) = \begin{cases} \frac{1}{6}, & X = 1, 2, 3, 4, 5, 6 \\ 0, & \text{otherwise} \end{cases}\]

Question 1:

What is the probability that the number rolled is less than 4?

\[P(X < 4) = P(1) + P(2) + P(3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6} = 0.5\]

Question 2:

What is the probability that the number rolled is an even number?

Even numbers on a die: 2, 4, 6

\[ P(\text{even}) = P(2) + P(4) + P(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6} = 0.5 \]

Question 3:

What is the probability that the number rolled is greater than or equal to 5?

Numbers: 5, 6

\[P(X \geq 5) = P(5) + P(6) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3} \approx 0.333\]
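All three questions amount to summing the PMF over the outcomes of interest; a minimal Python sketch (illustrative only):

```python
from fractions import Fraction

# PMF of a fair six-sided die: P(X = x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def prob(predicate):
    """Sum the PMF over all outcomes satisfying the predicate."""
    return sum(p for x, p in pmf.items() if predicate(x))

print(prob(lambda x: x < 4))       # 1/2  (Question 1)
print(prob(lambda x: x % 2 == 0))  # 1/2  (Question 2)
print(prob(lambda x: x >= 5))      # 1/3  (Question 3)
```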

Important Discrete Probability Distributions

  1. Bernoulli Distribution:
    Used for events with only two possible outcomes, such as success/failure

  2. Binomial Distribution:
    Models multiple independent trials where each trial has two outcomes

  3. Poisson Distribution:
    Used to model the number of events occurring within a fixed interval of time or space

Bernoulli distribution

Definition:

The Bernoulli distribution is a discrete probability distribution for a random variable which has only two possible outcomes:

  • Success (usually coded as 1)

  • Failure (usually coded as 0)

It models the outcome of a single trial (or experiment) that can result in only one of two outcomes.

Mathematical Definition:

Let \(X \sim \text{Bernoulli}(p)\), where

  • \(X \in \{0, 1\}\)

  • \(p\) is the probability of success (i.e., \(P(X = 1) = p\))

  • \(1 - p\) is the probability of failure (i.e., \(P(X = 0) = 1 - p\))

  • \(0 \leq p \leq 1\)

Probability mass function (PMF):

\[ P(X = x) = p^x (1 - p)^{1 - x}, \quad \text{for } x \in \{0, 1\} \]

Properties:

  • Mean: \(\mathbb{E}[X] = p\)

  • Variance: \(\text{Var}(X) = p(1 - p)\)

Examples:

  • Tossing a coin (Heads = 1, Tails = 0)

  • Passing a test (Pass = 1, Fail = 0)

  • Clicking on an ad (Click = 1, No Click = 0)

  • Defective product in a factory (Defective = 1, Not defective = 0)

Why is it important?

  • It’s the building block for many other distributions, like the Binomial distribution, which models the number of successes in multiple independent Bernoulli trials.

  • It’s used in binary classification, machine learning, economics, quality control, and more.

Example

Let \(X \sim \text{Bernoulli}(p)\). We’ll use different values of \(p\) (probability of success) in each example.

Example 1: Tossing a fair coin

Head = 1 (success) and Tail = 0 (failure). So, \(p = 0.5\)

Questions: What is \(P(X = 1)\)?

Solution: \(P(X = 1) = p = 0.5\)


Questions: What is \(P(X = 0)\)?

Solution: \(P(X = 0) = 1 - p = 0.5\)


Questions: What is the expected value?

Solution: \(\mathbb{E}[X] = p = 0.5\)

Example 2: Quality control in a factory

A machine produces parts. Probability that a part is defective is 0.1.

Let \(X = 1\) if defective, \(X = 0\) if not.

Questions: What is the probability a part is defective?

Solution: \(P(X = 1) = p = 0.1\)

Questions: What is the variance of this distribution?

Solution: \(\text{Var}(X) = p(1 - p) = 0.1 \times 0.9 = 0.09\)

Example 3: Clicking on an online ad

Probability a user clicks on an ad is 0.25.

\(X = 1\) if clicked, \(X = 0\) if not

Questions: What is \(P(X = 1)\)?

Solution: \(P(X = 1) = p = 0.25\)

Questions: What is \(P(X = 0)\)?

Solution: \(P(X = 0) = 1 - p = 0.75\)

Questions: What is the standard deviation?

Solution: \(\text{SD}(X) = \sqrt{p(1 - p)} = \sqrt{0.25 \times 0.75} = \sqrt{0.1875} \approx 0.433\)
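The Bernoulli quantities used in these examples follow directly from the formulas above; below is a small Python sketch (the helper name `bernoulli_summary` is invented for illustration and is not part of the slides):

```python
import math

def bernoulli_summary(p):
    """PMF values, mean, variance, and SD of a Bernoulli(p) variable."""
    return {
        "P(X=1)": p,
        "P(X=0)": 1 - p,
        "mean": p,                        # E[X] = p
        "variance": p * (1 - p),          # Var(X) = p(1 - p)
        "sd": math.sqrt(p * (1 - p)),     # SD(X) = sqrt(p(1 - p))
    }

# Example 3: ad click with p = 0.25
s = bernoulli_summary(0.25)
print(s["P(X=0)"])        # 0.75
print(s["variance"])      # 0.1875
print(round(s["sd"], 3))  # 0.433
```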

Binomial distribution

The Binomial distribution is a discrete probability distribution that describes the number of successes in a fixed number of independent Bernoulli trials, where each trial has only two outcomes: success or failure.

Definition:

If a random variable \(X \sim \text{Binomial}(n, p)\), then:

  • \(n\): number of trials

  • \(p\): probability of success in each trial

  • \(X\): number of successes in \(n\) trials

  • \(X \in \{0, 1, 2, \ldots, n\}\)

Probability Mass Function (PMF):

\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \]

Where:

  • \(\binom{n}{k} = \dfrac{n!}{k!(n - k)!}\)

  • \(k\): number of successes

  • \(p\): probability of success

  • \((1 - p)\): probability of failure

Mean and Variance:

  • Mean (Expected value):

\[\mathbb{E}[X] = np\]

  • Variance:

\[\text{Var}(X) = np(1 - p)\]

Examples in Real Life:

Situation | Trial | Success
Tossing 10 coins | Each toss | Head
Surveying 20 people | Each person | Likes product
Quality control of 100 items | Each item | Not defective

When to Use Binomial Distribution:

  • The number of trials \(n\) is fixed

  • Each trial has two possible outcomes: success or failure

  • The probability of success \(p\) is the same in each trial

Example

Example 1: Tossing a fair coin 5 times

You toss a fair coin 5 times. What is the probability of getting exactly 3 heads?

Let \(X \sim \text{Binomial}(n = 5, p = 0.5)\)

Step-by-step:

  • \(n = 5\), \(k = 3\), \(p = 0.5\)

\[\begin{aligned} P(X = 3) &= \binom{5}{3}(0.5)^3(1 - 0.5)^{5 - 3}\\ &= \frac{5!}{3!2!}(0.5)^3(0.5)^2\\ &= 10 \times 0.125 \times 0.25 = 0.3125 \end{aligned} \]

Answer: \(P(X = 3) = 0.3125\)

Example 2: Defective products in a batch

A machine produces items with a 10% defect rate. If you check 8 items, what’s the probability exactly 2 are defective? Let \(X \sim \text{Binomial}(n = 8, p = 0.1)\)

Step-by-step:

  • \(n = 8\), \(k = 2\), \(p = 0.1\)

\[\begin{aligned} P(X = 2) &= \binom{8}{2}(0.1)^2(0.9)^6\\ &= \frac{8!}{2!6!}(0.01)(0.531441)\\ &= 28 \times 0.01 \times 0.531441 = 0.1488 \end{aligned}\]

Answer: \(P(X = 2) \approx 0.1488\)

Example 3: Online ad clicks

Each person who sees an ad has a 25% chance of clicking it. Out of 12 viewers, what’s the probability that exactly 4 click the ad?

Let \(X \sim \text{Binomial}(n = 12, p = 0.25)\)

Step-by-step:

  • \(n = 12\), \(k = 4\), \(p = 0.25\)

\[\begin{aligned} P(X = 4) &= \binom{12}{4}(0.25)^4(0.75)^8\\ &= 495 \times 0.00390625 \times 0.100113\\ &\approx 0.1936 \end{aligned}\]

Answer: \(P(X = 4) \approx 0.1936\)
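All three binomial examples reduce to evaluating the same PMF; a compact Python sketch using the standard library's `math.comb` (illustrative, not part of the slides):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The three worked examples above:
print(binom_pmf(3, 5, 0.5))              # 0.3125
print(round(binom_pmf(2, 8, 0.1), 4))    # 0.1488
print(round(binom_pmf(4, 12, 0.25), 4))  # 0.1936
```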

Calculating the Probability in a Binomial Distribution

Poisson distribution

The Poisson distribution is a discrete probability distribution that models the number of events occurring in a fixed interval of time or space, under the assumption that:

  1. Events occur independently

  2. The average rate of occurrence \(\lambda\) is constant

  3. Two events cannot occur at the exact same instant

Definition:

If a random variable \(X \sim \text{Poisson}(\lambda)\), then it describes the probability of observing exactly \(k\) events in a fixed interval.

  • \(\lambda\): average number of events per interval (e.g., per hour, per day, per km², etc.)

  • \(X\): number of observed events

  • \(X \in \{0, 1, 2, \ldots\}\)

Probability Mass Function (PMF): \[P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}\]

Where:

  • \(e \approx 2.71828\) (Euler’s number)

  • \(k\): number of events (0, 1, 2, …)

  • \(\lambda\): average event rate

Mean and Variance:

  • \(\mathbb{E}[X] = \lambda\)

  • \(\text{Var}(X) = \lambda\)

When to Use Poisson Distribution:

  • Counting rare events over time or space

  • Events are random and independent

  • The rate is stable over time

Real-life Examples:

Situation | Poisson variable
Calls arriving at a call center per hour | Number of calls
Typos per page in a book | Number of typos
Patients arriving at an ER per night | Number of patients
Emails received per day | Number of emails

Example

Example 1: Call center

A call center receives an average of 4 calls per hour. Find:

  1. \(P(X = 2)\): exactly 2 calls

\[P(X = 2) = \frac{e^{-4} \cdot 4^2}{2!} = \frac{e^{-4} \cdot 16}{2} = 8 \cdot e^{-4} \approx 8 \cdot 0.0183 = 0.1465\]

  2. \(P(X \leq 2)\): no more than 2 calls

\(P(X \leq 2) = P(0) + P(1) + P(2)\) \[\begin{aligned} P(0) &= \frac{e^{-4} \cdot 4^0}{0!} = e^{-4} = 0.0183 \\ P(1) &= \frac{e^{-4} \cdot 4^1}{1!} = 4 \cdot e^{-4} = 0.0733 \\ P(2) &= 0.1465 \ \text{(from part a)} \\ P(X \leq 2) &= 0.0183 + 0.0733 + 0.1465 = 0.2381 \end{aligned}\]

  3. \(P(X \geq 3)\): 3 or more calls

\(P(X \geq 3) = 1 - P(X \leq 2)\)

\[P(X \geq 3) = 1 - 0.2381 = 0.7619\]

Example 2: Hospital ER

An average of 3 patients arrive at the emergency room each night. Find:

  1. \(P(X = 5)\): exactly 5 patients

\[P(X = 5) = \frac{e^{-3} \cdot 3^5}{5!} = \frac{e^{-3} \cdot 243}{120} \approx 0.0498 \cdot 2.025 = 0.1008\]

  2. \(P(X \leq 5)\): at most 5 patients

\(P(X \leq 5) = \sum_{k=0}^{5} P(k)\) \[\begin{aligned} P(0) &= e^{-3} = 0.0498 \\ P(1) &= 3 \cdot e^{-3} = 0.1494 \\ P(2) &= \frac{9}{2} e^{-3} = 0.2240 \\ P(3) &= \frac{27}{6} e^{-3} = 0.2240 \\ P(4) &= \frac{81}{24} e^{-3} = 0.1680 \\ P(5) &= 0.1008 \\ P(X \leq 5) &= 0.0498 + 0.1494 + 0.2240\\ &~~~+ 0.2240 + 0.1680 + 0.1008 = 0.9160 \end{aligned}\]

  3. \(P(X \geq 2)\): at least 2 patients

\(P(X \geq 2) = 1 - P(0) - P(1)\) \[P(X \geq 2) = 1 - (0.0498 + 0.1494) = 1 - 0.1992 = 0.8008\]
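These Poisson calculations can be reproduced with a few lines of Python (an illustrative sketch, not part of the slides):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Example 1: call center with an average of 4 calls per hour
p_le_2 = sum(poisson_pmf(k, 4) for k in range(3))  # P(X <= 2)
print(round(poisson_pmf(2, 4), 4))  # 0.1465
print(round(p_le_2, 4))             # 0.2381
print(round(1 - p_le_2, 4))         # 0.7619  (complement rule)
```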

Calculating the Probability in a Poisson Distribution

Important Note

In Jamovi, you can install and use the external module called distrACTION to calculate probabilities for both Binomial and Poisson distributions.

distrACTION Module

Properties of a Continuous Probability Distribution

Let \(f(x)\) be a Probability Density Function (PDF). It must satisfy the following conditions:

  1. \(f(x) \geq 0\) for all \(x\)

  2. \(\int_{-\infty}^{\infty} f(x) \, dx = 1\)

The probability that \(X\) falls within the interval \(a \leq X \leq b\) is given by:

\[\begin{aligned} P(a < X < b) &= P(a \leq X < b) \\ &= P(a < X \leq b) \\ &= P(a \leq X \leq b) = \int_{a}^{b} f(x) \, dx \end{aligned}\]

Key Example

Normal Distribution

\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad \mu \in \mathbb{R},\ \sigma^2 > 0,\ x \in \mathbb{R} \]

  • \(\mu\) is the mean

  • \(\sigma^2\) is the variance

  • The shape is a bell curve (symmetrical)
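The normal density has no closed-form antiderivative, but its CDF can be written with the error function \(\operatorname{erf}\), which Python's standard library provides; a sketch (the helper names `normal_cdf` and `normal_prob` are invented for illustration):

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def normal_prob(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b): the integral of the density from a to b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# About 68.27% of the probability lies within one SD of the mean:
print(round(normal_prob(-1.0, 1.0), 4))  # 0.6827
```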

Calculating the Probability in a Normal Distribution

Important Continuous Probability Distributions

  1. Normal Distribution:
    Commonly used in statistics

  2. Uniform Distribution:
    All values within a given interval have equal probability

  3. Exponential Distribution:
    Often used for modeling waiting times

Statistics

Statistics is the science of collecting, analyzing, interpreting, and presenting data to support decision-making or to better understand phenomena.

There are two main branches of statistics

1. Descriptive Statistics

Used to summarize and describe data, such as:

  • Mean

  • Median

  • Standard Deviation

  • Variance

  • Pearson Correlation

  • Frequency Table

  • Various types of charts and graphs (Previous chapter)

Mean or Average

Definition:

The mean, also known as the average, is a measure of central tendency that represents the typical value in a set of numbers.

\[\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}\]

\[\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i\]

Where:

  • \(\bar{x}\) = mean
  • \(x_i\) = each individual value
  • \(n\) = total number of values

How to Use the Mean

  1. Summarizing Data

    • It gives a single value that represents the entire dataset.

    • Example: The average height of students in a class.

  2. Comparing Groups

    • You can compare the mean scores of two different classes or products.

  3. Used in Other Statistical Methods

    • The mean is the basis for calculating:

      • Variance and standard deviation

      • Z-scores

      • Regression analysis

      • Hypothesis testing

Example

Let’s say you have the following exam scores: 70, 80, 90, 85, 75

\[ \text{Mean} = \frac{70 + 80 + 90 + 85 + 75}{5} = \frac{400}{5} = 80 \]

So, the average score is 80.
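The calculation matches Python's built-in `statistics` module; a quick check (illustrative, not part of the slides):

```python
import statistics

scores = [70, 80, 90, 85, 75]
mean = sum(scores) / len(scores)  # definition: sum of values / number of values
print(mean)                       # 80.0
print(statistics.mean(scores))    # 80
```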

When Not to Use the Mean

  • Skewed data or outliers can distort the mean. In those cases, use the median (middle value) instead.

Median

Definition:

The median is the middle value of a dataset when the values are arranged in order. It divides the dataset into two equal halves.

  • If the number of data points is odd, the median is the middle number.

  • If the number is even, the median is the average of the two middle numbers.

How to Calculate the Median

  1. Sort the data from smallest to largest.

  2. Find the middle value:

    • If \(n\) is odd:

\[\text{Median} = x_{(\frac{n+1}{2})}\]

    • If \(n\) is even:

\[\text{Median} = \frac{x_{(n/2)} + x_{(n/2 + 1)}}{2}\]

How to Use the Median

  1. Measure Central Tendency

    • It shows the “typical” value, especially for skewed data.
  2. When Data Has Outliers

    • Unlike the mean, the median is not affected by extreme values.
  3. Descriptive Statistics

    • Used in summarizing income, house prices, ages, etc.

Example 1: Odd number of values

  • Data: 5, 7, 9

  • Sorted: 5, 7, 9

  • Median = 7 (the middle value)

Example 2: Even number of values

  • Data: 3, 5, 7, 9

  • Sorted: 3, 5, 7, 9

  • Median = (5 + 7) / 2 = 6
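Both cases agree with the standard library's `statistics.median` (an illustrative check):

```python
import statistics

odd_data = [5, 7, 9]      # odd count: the middle value
even_data = [3, 5, 7, 9]  # even count: average of the two middle values
print(statistics.median(odd_data))   # 7
print(statistics.median(even_data))  # 6.0
```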

Variance

Definition:

Variance measures how much the values in a dataset differ from the mean. It tells us the spread or dispersion of the data.

  • A small variance means the data points are close to the mean.

  • A large variance means the data points are spread out over a wider range.

Formula

For a sample: \[s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2\] For a population: \[\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2\]

Where:

  • \(x_i\) = each data point

  • \(\bar{x}\) = sample mean

  • \(\mu\) = population mean

  • \(n\), \(N\) = number of values in the sample or population

  • \(s^2\), \(\sigma^2\) = variance

Example

Data: 4, 6, 8

Mean = (4 + 6 + 8) / 3 = 6

Deviations: -2, 0, +2

Squared deviations: 4, 0, 4

Variance (sample) = \(\frac{4 + 0 + 4}{3 - 1} = \frac{8}{2} = 4\)

How to Use Variance

  1. Understand data spread

    • Helps measure how consistent or variable the data is.
  2. Compare variability

    • Useful when comparing the performance or risk of different datasets or investments.
  3. In statistics and machine learning, variance is used in:

    • Standard deviation (√variance)

    • ANOVA

    • Regression analysis

    • Risk models in finance (Volatility)

Units of Variance

  • The unit of variance is the square of the original unit (e.g., if values are in meters, variance is in meters²).

  • That’s why standard deviation (the square root of variance) is often preferred for interpretation.

Standard Deviation

Definition:

The standard deviation is a measure of how spread out the values in a dataset are from the mean. It is the square root of the variance.

  • A low standard deviation means the data points are close to the mean.

  • A high standard deviation means the data points are more spread out.

Formula

For a sample: \[ s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2} \] For a population: \[ \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2} \]

Where:

  • \(x_i\) = each data value

  • \(\bar{x}\) = sample mean

  • \(\mu\) = population mean

  • \(s\), \(\sigma\) = standard deviation

  • \(n\), \(N\) = number of data points

Example

Data: 4, 6, 8

Mean = 6

Sample variance = \(\frac{(4-6)^2 + (6-6)^2 + (8-6)^2}{3 - 1} = 4\)

Standard deviation = \(\sqrt{4} = 2\)
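The variance and standard deviation examples above can be verified with the `statistics` module, whose `variance` and `stdev` use the sample divisor \(n - 1\) (illustrative check):

```python
import statistics

data = [4, 6, 8]
print(statistics.variance(data))   # 4    sample variance (divisor n - 1)
print(statistics.stdev(data))      # 2.0  sample standard deviation
print(statistics.pvariance(data))  # population variance (divisor N) = 8/3
```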

How to Use Standard Deviation

  1. Describe variability

    • Shows how consistently data points are clustered around the mean.
  2. Compare consistency

    • Smaller standard deviation = more consistent results (e.g., test scores, product quality).
  3. In statistical analysis, used in:

    • Confidence intervals

    • Hypothesis testing (e.g., z-tests, t-tests)

    • Control charts in quality control

    • Risk assessment in finance

Units of Standard Deviation

  • Same unit as the original data (e.g., if data is in cm, standard deviation is in cm).

  • This makes it more interpretable than variance.

Pearson Correlation

Definition:

The Pearson correlation coefficient (denoted as \(r\)) measures the strength and direction of the linear relationship between two numerical variables.

  • Values range from –1 to +1.

Formula:

\[ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2} \cdot \sqrt{\sum (y_i - \bar{y})^2}} \]

Where:

  • \(x_i, y_i\) = values of the two variables

  • \(\bar{x}, \bar{y}\) = means of each variable

Example:

Imagine you collect data on students’ study time (hours) and exam scores:

Study Time (X) | Exam Score (Y)
1 | 50
2 | 60
3 | 70
4 | 80
5 | 90

The Pearson correlation would be +1, showing a perfect positive linear relationship.
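The coefficient for this data can be computed directly from the formula; a self-contained Python sketch (the function name `pearson_r` is invented for illustration):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # numerator
    sx = sqrt(sum((x - mx) ** 2 for x in xs))                # sqrt of sum of squares in x
    sy = sqrt(sum((y - my) ** 2 for y in ys))                # sqrt of sum of squares in y
    return cov / (sx * sy)

# Study time vs. exam score from the table above:
study_time = [1, 2, 3, 4, 5]
exam_score = [50, 60, 70, 80, 90]
print(round(pearson_r(study_time, exam_score), 6))  # 1.0
```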

Interpretation of \(r\):

\(r\) value | Interpretation
\(+1\) | Perfect positive linear correlation
\(0.7\) to \(0.9\) | Strong positive linear correlation
\(0.3\) to \(0.7\) | Moderate positive linear correlation
\(0\) | No linear correlation
\(-0.7\) to \(-0.3\) | Moderate negative linear correlation
\(-0.9\) to \(-0.7\) | Strong negative linear correlation
\(-1\) | Perfect negative linear correlation

How to Use Pearson Correlation

  1. Measure Relationships

    • To assess how strongly two variables are related (e.g., height and weight, income and spending).
  2. Feature Selection

    • In machine learning, to eliminate highly correlated predictors (multicollinearity).
  3. Hypothesis Testing

    • You can test whether the correlation is significantly different from zero using a t-test.

When Not to Use Pearson Correlation

  • When the relationship is nonlinear.

  • When the data is not normally distributed.

  • When variables are ordinal or categorical (use Spearman or Kendall’s correlation instead).

2. Inferential Statistics

Used to analyze data in order to draw conclusions or make predictions about a population, based on a sample.

  • Hypothesis Testing

  • Parameter Estimation

  • Regression Analysis

Applications of Statistics

1. Business and Marketing

  • Analyze market trends and customer behavior

  • Forecast product sales using Time Series Analysis

  • Use A/B Testing to compare the effectiveness of advertisements or marketing campaigns

2. Economics and Finance

  • Analyze economic conditions, such as calculating inflation and unemployment rates

  • Assess risk and return in investment portfolios (Portfolio Analysis)

  • Use econometric models to study factors influencing the economy

3. Science and Engineering

  • Design experiments (Design of Experiments) to develop new products

  • Analyze data from experiments in physics, chemistry, and biology

  • Perform quality control using Statistical Quality Control (SQC)

4. Medicine and Public Health

  • Analyze the effects of drugs or vaccines using Biostatistics

  • Study disease risks through Epidemiological data analysis

  • Use Machine Learning and AI to analyze medical records and assist in diagnosis

5. Data Science and Artificial Intelligence (AI)

  • Analyze Big Data to gain insights for data-driven decision making

  • Use Machine Learning techniques to develop predictive models

  • Perform Text Mining and analyze social media data

6. Education and Research

  • Analyze students’ academic performance and evaluate the effectiveness of curricula

  • Use statistics to design research studies that yield reliable conclusions

  • Analyze experimental data to test scientific hypotheses

Mean-Variance Criteria

This refers to a decision-making framework used in finance, economics, and statistics, especially in portfolio selection and investment analysis. It is based on the ideas introduced by Harry Markowitz in his Modern Portfolio Theory (MPT).

Definition:

The Mean-Variance Criteria evaluates and compares alternatives (such as portfolios, investment strategies, or decisions) based on two key factors:

  • Mean = expected return (reward)

  • Variance = risk (volatility of return)

We prefer a high average and a low variance.

Decision Rule (Mean-Variance Dominance):

Given two choices A and B:

  • A is preferred over B if:

    • \(\mu_A \geq \mu_B\) and
    • \(\sigma^2_A \leq \sigma^2_B\),
    • with at least one strict inequality.

This means that A has higher or equal return and lower or equal risk than B.

If neither option dominates the other, the mean-variance criterion alone cannot rank them.

Example

Option | Mean (Return) | Variance (Risk)
A | 8% | 4
B | 7% | 5
C | 9% | 6

  • A dominates B (higher return and lower risk) → eliminate B

  • A vs. C: A = safer, C = more profitable → Choice depends on risk tolerance
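The dominance rule is easy to encode; the sketch below (with a hypothetical helper `dominates` over (mean, variance) pairs) reproduces the comparison in the example:

```python
def dominates(a, b):
    """Mean-variance dominance: a = (mean, variance) dominates b if
    mean_a >= mean_b and var_a <= var_b, with at least one strict."""
    (ma, va), (mb, vb) = a, b
    return ma >= mb and va <= vb and (ma > mb or va < vb)

# Options from the example: (mean return, variance)
A, B, C = (0.08, 4), (0.07, 5), (0.09, 6)
print(dominates(A, B))  # True:  eliminate B
print(dominates(A, C))  # False
print(dominates(C, A))  # False: A vs. C is undecided by this criterion
```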

Mean Variance Criteria Example

Normalized and Standardized Data

What is Normalized Data?

Definition: Normalization is the process of rescaling data to fit within a specific range, often 0 to 1 (or sometimes -1 to 1).

It changes the scale of the data but does not change its shape.

Common Formula (Min–Max Scaling)

\[x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}\]

Where:

  • \(x\) = original value
  • \(x_{\min}\), \(x_{\max}\) = min and max of the dataset
  • \(x'\) = normalized value (between 0 and 1)

When to Use Normalization

  • When you want all features to have equal importance in models (e.g., k-NN, neural networks).

  • When data is not normally distributed.

  • When different features have different units/scales (e.g., height in cm, weight in kg).

What is Standardized Data?

Definition: Standardization transforms data to have:

  • Mean(\(\bar{x}\)) = 0

  • Standard deviation(SD or \(\sigma\)) = 1

This is also called Z-score normalization.

Formula

\[z = \frac{x - \mu}{\sigma}\]

Where:

  • \(\mu\) = mean of the data

  • \(\sigma\) = standard deviation of the data

  • \(z\) = standardized value

When to Use Standardization

  • When data is normally distributed (or close to it).

  • When algorithms assume data is centered (e.g., PCA, linear regression, logistic regression, SVM).
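Both transformations are one-liners over a list of values; a Python sketch (the function names are invented for illustration):

```python
import statistics

def min_max_normalize(xs):
    """Rescale to [0, 1]: x' = (x - min) / (max - min)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def standardize(xs):
    """Z-scores: z = (x - mean) / sd, using the population SD."""
    mu = statistics.mean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]

data = [2, 4, 6, 8]
print(min_max_normalize(data))  # [0.0, 0.333..., 0.666..., 1.0]
z = standardize(data)
# The standardized data has mean 0 and SD 1 (up to rounding):
print(round(statistics.mean(z), 6), round(statistics.pstdev(z), 6))
```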

Calculating the Normalized and Standardized Data
