Probability and Statistics

Somsak Chanaim

International College of Digital Innovation, CMU

June 7, 2025

Axioms of Probability

Axioms of Probability are three fundamental rules that define the properties of probability.

They were proposed by Andrey Kolmogorov in 1933 and form the foundation of modern probability theory.

Kolmogorov’s Axioms:


Andrey N. Kolmogorov

Let \(S\) be a sample space and let \(A\) be any event, that is, any subset of \(S\). The examples below illustrate sample spaces and events; the three axioms follow them.

Example 1: Tossing a fair coin

Sample space \(S\): All possible outcomes
\[ S = \{\text{Heads}, \text{Tails}\} \]
Event \(A\): Getting a Head
\[ A = \{\text{Heads}\} \]

Example 2: Rolling a six-sided die

Sample space \(S\):
\[ S = \{1, 2, 3, 4, 5, 6\} \]
Event \(A\): Rolling an even number
\[ A = \{2, 4, 6\} \]

Example 3: Drawing a card from a standard 52-card deck


Sample space \(S\): All 52 unique cards
\[ S = \{\text{Ace of Hearts}, \text{2 of Hearts}, \ldots, \text{King of Spades}\} \]
Event \(A\): Drawing a red card
\[ A = \{\text{all Hearts and Diamonds (26 cards)}\} \]
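The three examples above can be sketched with Python sets, where an event is simply a subset of its sample space. The variable names and the (rank, suit) encoding of cards are illustrative assumptions, not part of the original slides.

```python
from itertools import product

# Example 1: tossing a coin
coin_space = {"Heads", "Tails"}
coin_heads = {"Heads"}

# Example 2: rolling a six-sided die
die_space = set(range(1, 7))
die_even = {n for n in die_space if n % 2 == 0}

# Example 3: a standard deck, encoded here as (rank, suit) pairs (an assumed encoding)
ranks = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10", "Jack", "Queen", "King"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = set(product(ranks, suits))
red_cards = {(rank, suit) for rank, suit in deck if suit in ("Hearts", "Diamonds")}

# Every event is a subset of its sample space
all_subsets = coin_heads <= coin_space and die_even <= die_space and red_cards <= deck

print(len(deck), len(red_cards), all_subsets)
```

The `<=` operator on sets tests the subset relation, which is exactly the condition that \(A\) is an event in \(S\).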

Axiom 1 Probability must be non-negative

\[ P(A) \geq 0 \]

This holds for every event \(A\): a probability is never negative. It is either positive or zero.

Axiom 2 The probability of the sample space is 1

\[ P(S) = 1 \]

This means the probability of the event that covers all possible outcomes must equal 1.

Axiom 3 Additivity for mutually exclusive events

If \(A\) and \(B\) are mutually exclusive events, meaning they have no outcomes in common (i.e., \(A \cap B = \emptyset\)), then
\[ P(A \cup B) = P(A) + P(B) \]
This means if two events cannot occur at the same time, the probability of either one occurring is the sum of their individual probabilities.
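As a minimal sketch, the three axioms can be checked numerically for a fair six-sided die, assuming equally likely outcomes so that \(P(A) = |A| / |S|\). The helper `prob` and the events chosen are illustrative only.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space of a fair die

def prob(event):
    """P(event) under the assumption of equally likely outcomes."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(S))

A = {1, 2}
B = {5, 6}  # A and B share no outcomes, so they are mutually exclusive

axiom1 = prob(A) >= 0                       # non-negativity
axiom2 = prob(S) == 1                       # the sample space has probability 1
axiom3 = prob(A | B) == prob(A) + prob(B)   # additivity for disjoint events

print(axiom1, axiom2, axiom3)
```

Using `Fraction` keeps the arithmetic exact, so the equality checks are not affected by floating-point rounding.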

Results Derived from the Axioms of Probability

From the three axioms, we can infer other important properties. For example:

The probability of an impossible event is zero

\[P(\emptyset) = 0\]

An impossible event has a probability of zero because it cannot occur.
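One way to derive this from the axioms: \(S\) and \(\emptyset\) have no outcomes in common and \(S \cup \emptyset = S\), so Axiom 3 gives
\[ P(S) = P(S \cup \emptyset) = P(S) + P(\emptyset) \]
Subtracting \(P(S)\) from both sides leaves \(P(\emptyset) = 0\).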

Complement Rule

\[P(A^c) = 1 - P(A)\]

This means that if the probability of event \(A\) is \(P(A)\), then the probability of the complementary event \(A^c\) (not \(A\)) is \(1 - P(A)\).
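The complement rule follows from Axioms 2 and 3. The events \(A\) and \(A^c\) are mutually exclusive and \(A \cup A^c = S\), so
\[ 1 = P(S) = P(A \cup A^c) = P(A) + P(A^c) \]
Rearranging gives \(P(A^c) = 1 - P(A)\).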

General Addition Rule of Probability

For any events \(A\) and \(B\):

\[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]

This formula applies even when the events may overlap.

Example Applying the Axioms of Probability

Rolling a single die

Let \(S = \{1, 2, 3, 4, 5, 6\}\)

  • Event \(A\): rolling an odd number → \(A = \{1, 3, 5\}\)
  • Event \(B\): rolling a number greater than 4 → \(B = \{5, 6\}\)
  • \(P(A) = \frac{3}{6} = 0.5\), \(P(B) = \frac{2}{6} \approx 0.333\),
    \(P(A \cap B) = \frac{1}{6} \approx 0.167\), since \(A \cap B = \{5\}\)

Using the addition rule:
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

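\[ = \frac{3}{6} + \frac{2}{6} - \frac{1}{6} = \frac{4}{6} \approx 0.667 \]

This agrees with direct counting: \(A \cup B = \{1, 3, 5, 6\}\) contains 4 of the 6 outcomes.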
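The die example can be verified with a short script. This is a sketch that assumes equally likely outcomes; the helper `prob` is illustrative. It compares the addition rule with a direct count of \(A \cup B\).

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {n for n in S if n % 2 == 1}  # odd numbers: {1, 3, 5}
B = {n for n in S if n > 4}       # greater than 4: {5, 6}

def prob(event):
    """P(event) assuming every outcome of the die is equally likely."""
    return Fraction(len(event), len(S))

direct = prob(A | B)                          # count the union directly
by_rule = prob(A) + prob(B) - prob(A & B)     # general addition rule

print(direct, by_rule)  # both print as 2/3
```

Working with exact fractions avoids the rounding error that appears when 0.333 and 0.167 are added as decimals.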

Random Variable

A random variable assigns a numerical value to each outcome of a random experiment.
The value it takes is therefore determined by chance.

Random variables are commonly used in statistics and probability theory to describe probability distributions of data.

There are two main types of random variables:

  1. Discrete Random Variable

  2. Continuous Random Variable

1. Discrete Random Variable

  • Takes on a countable number of possible values

  • Commonly used in events where outcomes can be counted, such as the number rolled on a die or the number of correct answers on a test

Examples

  • Rolling a die: Let \(X\) be the value shown on the die → \(X \in \{1, 2, 3, 4, 5, 6\}\)

  • Flipping a coin: Let \(Y\) be the number of heads when flipping a coin 3 times → \(Y \in \{0, 1, 2, 3\}\)

  • Number of customers per day: Let \(N\) be the number of customers arriving each day → \(N \in \{0, 1, 2, 3, 4, \ldots\}\)
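To make the coin example concrete, the sketch below enumerates all outcomes of three flips and counts the heads in each, which yields the probability of every value of \(Y\). It assumes a fair coin, so all 8 sequences are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of three flips, e.g. ('H', 'T', 'H'); 2**3 = 8 in total
outcomes = list(product("HT", repeat=3))

# Y maps each outcome to its number of heads
heads_counts = Counter(seq.count("H") for seq in outcomes)

# Probability of each value of Y under equally likely outcomes
pmf_Y = {y: Fraction(c, len(outcomes)) for y, c in sorted(heads_counts.items())}

for y, p in pmf_Y.items():
    print(y, p)
```

The random variable here is the mapping from a sequence to its head count, and its possible values are exactly 0, 1, 2, and 3.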

2. Continuous Random Variable

  • A variable that can take on any value within a range of real numbers

  • Used for measurable quantities such as weight, height, or time

Examples

  • Customer service time: Variable \(T\) may take values between 0 and 10 minutes

  • Temperature in a city: Variable \(Z\) may range from 25°C to 35°C

  • Investment return rate: Variable \(r \in (-100\%, \infty)\)

A function that specifies how probability is assigned to the values of a random variable is called a probability distribution.

Probability Distribution

A probability distribution describes how likely each value, or range of values, of a random variable is.

Properties of a Discrete Probability Distribution

Let \(X\) be a discrete random variable and let \(P(X)\) denote the probability of each possible value of \(X\). These probabilities must satisfy the following conditions:

  1. \(0 \leq P(X) \leq 1\) for all values of \(X\)

  2. \(\sum P(X) = 1\) (The total probability must sum to 1)

Example

Rolling a Die

Let the random variable \(X\) represent the number shown on a single six-sided die (\(X = 1, 2, 3, 4, 5, 6\))

\[ P(X) = \begin{cases} \frac{1}{6}, & X = 1, 2, 3, 4, 5, 6 \\ 0, & \text{otherwise} \end{cases} \]
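The two conditions can be checked directly for the die distribution above. This is a minimal sketch; the function name `die_pmf` is an assumption made for illustration.

```python
from fractions import Fraction

def die_pmf(x):
    """Probability that a fair six-sided die shows x; zero for any other value."""
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

support = range(1, 7)

# Condition 1: every probability lies between 0 and 1
condition_1 = all(0 <= die_pmf(x) <= 1 for x in support)

# Condition 2: the probabilities sum to exactly 1
condition_2 = sum(die_pmf(x) for x in support) == 1

print(condition_1, condition_2)
```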

Important Discrete Probability Distributions

  1. Bernoulli Distribution:
    Models a single trial with only two possible outcomes, such as success/failure

  2. Binomial Distribution:
    Models the number of successes in a fixed number of independent trials, each with two outcomes and the same success probability

  3. Poisson Distribution:
    Used to model the number of events occurring within a fixed interval of time or space
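The three distributions listed above have standard probability mass functions, sketched below using only the Python standard library. The function names and the parameter names `p`, `n`, and `lam` are illustrative choices.

```python
from math import comb, exp, factorial

def bernoulli_pmf(k, p):
    """P(X = k) for a single trial with success probability p (k is 0 or 1)."""
    if k not in (0, 1):
        return 0.0
    return p if k == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(X = k): k successes in n independent trials, each with success probability p."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k): k events in an interval when the average number of events is lam."""
    if k < 0:
        return 0.0
    return exp(-lam) * lam**k / factorial(k)

# Example: probability of exactly 2 heads in 3 fair coin flips
p_two_heads = binomial_pmf(2, 3, 0.5)
print(p_two_heads)
```

With `n = 1` the binomial formula reduces to the Bernoulli case, which is one way to see how the first two distributions are related.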

Properties of a Continuous Probability Distribution

Let \(f(x)\) be a Probability Density Function (PDF). It must satisfy the following conditions:

  1. \(f(x) \geq 0\) for all \(x\)

  2. \(\int_{-\infty}^{\infty} f(x) \, dx = 1\)

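For a continuous random variable, the probability of any single value is zero, so including or excluding the endpoints does not change the probability that \(X\) falls between \(a\) and \(b\):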

\[ \begin{aligned} P(a < X < b) &= P(a \leq X < b) \\ &= P(a < X \leq b) \\ &= P(a \leq X \leq b) = \int_{a}^{b} f(x) \, dx \end{aligned} \]

Key Example

Normal Distribution

\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad \mu \in \mathbb{R},\ \sigma^2 > 0,\ x \in \mathbb{R} \]

  • \(\mu\) is the mean

  • \(\sigma^2\) is the variance

  • The shape is a symmetric bell curve centered at \(\mu\)
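As a rough numerical check, the sketch below integrates the normal density with a simple trapezoidal rule. It confirms that the total area is close to 1 and that the probability of falling within one standard deviation of the mean is about 0.683. The helper names, the integration range of \(\pm 10\), and the step count are assumptions; the closed-form comparison uses `math.erf`.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the composite trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x), written in terms of the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

total_area = integrate(normal_pdf, -10, 10)       # should be very close to 1
p_within_one_sigma = integrate(normal_pdf, -1, 1)
exact = normal_cdf(1) - normal_cdf(-1)

print(round(total_area, 6), round(p_within_one_sigma, 4), round(exact, 4))
```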

Important Continuous Probability Distributions

  1. Normal Distribution:
    Commonly used in statistics, for example to model measurement errors

  2. Uniform Distribution:
    All intervals of equal length within a given range are equally likely (constant density)

  3. Exponential Distribution:
    Often used for modeling waiting times
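For the uniform and exponential distributions, interval probabilities have simple closed forms: \((d - c)/(b - a)\) for a uniform variable on \([a, b]\), and \(1 - e^{-\lambda t}\) for an exponential variable with rate \(\lambda\). The sketch below applies them to hypothetical numbers: a service time uniform on 0 to 10 minutes and a waiting time with rate 0.5 per minute. Both values are assumptions chosen for illustration.

```python
from math import exp

def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniformly distributed on [a, b]."""
    lo, hi = max(a, c), min(b, d)       # clip the query interval to [a, b]
    return max(0.0, hi - lo) / (b - a)

def exponential_cdf(t, rate):
    """P(X <= t) for an exponential waiting time with the given rate."""
    return 1 - exp(-rate * t) if t > 0 else 0.0

# Hypothetical service time, uniform on 0..10 minutes: chance it lasts 2 to 5 minutes
p_service = uniform_prob(2, 5, 0, 10)

# Hypothetical waiting time with rate 0.5 per minute: chance of waiting at most 3 minutes
p_wait = exponential_cdf(3, rate=0.5)

print(p_service, round(p_wait, 4))
```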

Statistics

Statistics is the science of collecting, analyzing, interpreting, and presenting data to support decision-making or to better understand phenomena.

There are two main branches of statistics:

1. Descriptive Statistics

Used to summarize and describe data, such as:

  • Mean

  • Median

  • Standard Deviation

  • Frequency Table

  • Various types of charts and graphs (covered in the previous chapter)
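The summaries listed above can be computed with Python's standard `statistics` module. This is a brief sketch, and the list of scores is made-up data used only for illustration.

```python
import statistics
from collections import Counter

scores = [72, 85, 85, 90, 64, 78, 85, 90]  # made-up test scores

mean = statistics.mean(scores)
median = statistics.median(scores)
stdev = statistics.stdev(scores)   # sample standard deviation (divides by n - 1)
frequency = Counter(scores)        # frequency table: value -> count

print(mean, median, round(stdev, 2))
for value, count in sorted(frequency.items()):
    print(value, count)
```

`statistics.pstdev` would give the population standard deviation instead, which divides by \(n\) rather than \(n - 1\).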

2. Inferential Statistics

Used to analyze data in order to draw conclusions or make predictions about a population, based on a sample.

  • Hypothesis Testing

  • Parameter Estimation

  • Regression Analysis
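As one small illustration of parameter estimation, the sketch below builds an approximate 95% confidence interval for a population mean from a sample. It assumes the normal approximation \(\bar{x} \pm z \cdot s / \sqrt{n}\); for a sample this small a t-based interval would usually be preferred. The sample values are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]  # made-up measurements

n = len(sample)
x_bar = mean(sample)            # point estimate of the population mean
s = stdev(sample)               # sample standard deviation
z = NormalDist().inv_cdf(0.975) # about 1.96 for a two-sided 95% interval

margin = z * s / sqrt(n)
ci = (x_bar - margin, x_bar + margin)

print(round(x_bar, 3), tuple(round(v, 3) for v in ci))
```

The interval is centered on the sample mean, and its width shrinks as the sample size grows or the sample standard deviation falls.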

Applications of Statistics

1. Business and Marketing

  • Analyze market trends and customer behavior

  • Forecast product sales using Time Series Analysis

  • Use A/B Testing to compare the effectiveness of advertisements or marketing campaigns

2. Economics and Finance

  • Analyze economic conditions, such as calculating inflation and unemployment rates

  • Assess risk and return in investment portfolios (Portfolio Analysis)

  • Use econometric models to study factors influencing the economy

3. Science and Engineering

  • Design experiments (Design of Experiments) to develop new products

  • Analyze data from experiments in physics, chemistry, and biology

  • Perform quality control using Statistical Quality Control (SQC)

4. Medicine and Public Health

  • Analyze the effects of drugs or vaccines using Biostatistics

  • Study disease risks through Epidemiological data analysis

  • Use Machine Learning and AI to analyze medical records and assist in diagnosis

5. Data Science and Artificial Intelligence (AI)

  • Analyze Big Data to gain insights for data-driven decision making

  • Use Machine Learning techniques to develop predictive models

  • Perform Text Mining and analyze social media data

6. Education and Research

  • Analyze students’ academic performance and evaluate the effectiveness of curricula

  • Use statistics to design research studies that yield reliable conclusions

  • Analyze experimental data to test scientific hypotheses
