International College of Digital Innovation, CMU
September 13, 2025
Be able to identify and explain appropriate techniques for handling different types of datasets
Be able to create visualizations to show model performance
Gain knowledge of using software to apply those techniques
Demonstrate the ability to communicate results from applying selected learning methods to data
Supervised learning is a method in Machine Learning where the system learns from data that already contains answers or outcomes (labels).
It can be expressed as
\[y= f(x)+\varepsilon\]
where
y is the output, also called the dependent variable, target variable, or label
x is the input, also called the independent variable, feature, or attribute
\(f(\cdot)\) is the function that maps input to output
\(\varepsilon\) is the error term
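The relationship above can be illustrated with a small simulation. This is a minimal sketch, not a prescribed method: the "true" function `true_f`, the noise level, and the sample size are all assumptions chosen for illustration, and a single-feature least-squares fit stands in for the learning step.

```python
import random

def true_f(x):
    # Hypothetical "true" function f; in real problems f is unknown.
    return 2.0 * x + 1.0

def simulate(n, noise_sd, seed=0):
    # Generate labeled data y = f(x) + epsilon with Gaussian noise.
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    # Ordinary least squares for one feature: estimates intercept and slope.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

xs, ys = simulate(n=200, noise_sd=1.0)
intercept, slope = fit_line(xs, ys)
print(f"estimated f(x) = {intercept:.2f} + {slope:.2f} * x")
```

The estimates should land near the assumed intercept 1 and slope 2, but not exactly on them, because the error term adds noise to every label.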
1. Regression
Used to predict the value of a target variable that is numerical
Example algorithms: Linear Regression, Ridge Regression, Lasso, Support Vector Regression, Gradient Boosting
(Examples typically predict continuous outcomes such as prices, demand, temperature)
2. Classification
Used to predict the value of a target variable that is categorical, e.g., Yes/No, Group A/B/C
Example algorithms: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), Neural Networks
1. Regression
Sales Forecasting: Retail businesses can use Regression (e.g., Linear Regression) to forecast next month’s sales.
Input: Season, price, promotions, inventory level
Output: Future sales (amount)
\[\begin{aligned} \text{Sales}_{t+1} &= \beta_0 + \beta_1 \cdot \text{Season}_t + \beta_2 \cdot \text{Price}_t \\&~~~~+ \beta_3 \cdot \text{Promotion}_t + \beta_4 \cdot \text{Inventory}_t + \varepsilon_t \end{aligned}\]
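Where:
\(\beta_0\) is the intercept
\(\beta_1, \dots, \beta_4\) are the coefficients of Season, Price, Promotion, and Inventory
\(\varepsilon_t\) is the error term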
Example with numbers (for illustration):
\[ \begin{aligned} \text{Sales}_{t+1} &= 200 + 50 \cdot \text{Season}_t - 30 \cdot \text{Price}_t \\ &~~~~+ 80 \cdot \text{Promotion}_t + 0.5 \cdot \text{Inventory}_t + \varepsilon_t \end{aligned} \]
So if:
Season = 1 (holiday season)
Price = 20 (per unit)
Promotion = 1 (campaign active)
Inventory = 500 units
Then predicted sales would be:
\[\begin{aligned} &200 + 50(1) - 30(20) + 80(1) + 0.5(500) \\ &= 200 + 50 - 600 + 80 + 250 \\ &= -20 \end{aligned}\]
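(A negative sales figure cannot occur in practice. Here it signals that the price is too high for this strategy, and that the linear model is being pushed outside the range where its predictions are meaningful.)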
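The arithmetic of this illustration can be reproduced in a few lines of Python. This is a sketch using the illustrative coefficients from the example; the function name `predict_sales` and the lower price of 10 in the second call are assumptions added here to show how sensitive the prediction is to price.

```python
def predict_sales(season, price, promotion, inventory):
    # Illustrative coefficients from the example; the error term is omitted
    # because a prediction uses only the systematic part of the model.
    return 200 + 50 * season - 30 * price + 80 * promotion + 0.5 * inventory

# Values from the example: holiday season, price 20, campaign active, 500 units.
prediction = predict_sales(season=1, price=20, promotion=1, inventory=500)
print(prediction)  # -20.0

# Hypothetical lower price (not in the original example), other inputs unchanged.
cheaper = predict_sales(season=1, price=10, promotion=1, inventory=500)
print(cheaper)  # 280.0
```

Each unit of price removes 30 from the prediction, so halving the price from 20 to 10 raises predicted sales by 300.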
Another example
Price Prediction: Real estate businesses use Regression models such as Ridge Regression or Random Forest to predict house prices.
Input: House size, number of rooms, location, year built
Output: House price (numeric value)
\[\begin{aligned} \text{Price} &= \beta_0 + \beta_1 \cdot \text{Size} + \beta_2 \cdot \text{Rooms} + \beta_3 \cdot \text{Location}\\&~~~ + \beta_4 \cdot \text{YearBuilt} + \varepsilon \end{aligned}\]
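Where:
\(\beta_0\) is the intercept
\(\beta_1, \dots, \beta_4\) are the coefficients of Size, Rooms, Location, and YearBuilt
\(\varepsilon\) is the error term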
Example with numbers:
\[\begin{aligned} \text{Price} &= 50{,}000 + 200 \cdot \text{Size} + 15{,}000 \cdot \text{Rooms}\\&~~~~ + 80{,}000 \cdot \text{LocationIndex} + 500 \cdot \text{YearBuilt} + \varepsilon \end{aligned}\]
If:
Size = 120
Rooms = 3
LocationIndex = 2
YearBuilt = 2015
Then:
\[ \text{Price} = 50{,}000 + 200(120) + 15{,}000(3) + 80{,}000(2) + 500(2015) \]
\[ = 50{,}000 + 24{,}000 + 45{,}000 + 160{,}000 + 1{,}007{,}500 = 1{,}286{,}500 \]
Predicted house price = 1.29 million (approx.)
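The same calculation can be checked in code. This is a minimal sketch with the illustrative coefficients above; the function name `price_contributions` is an assumption, and the per-term breakdown is added only to make each term's share of the total visible.

```python
def price_contributions(size, rooms, location_index, year_built):
    # Each entry is one term of the illustrative equation (error term omitted).
    return {
        "intercept": 50_000,
        "size": 200 * size,
        "rooms": 15_000 * rooms,
        "location": 80_000 * location_index,
        "year_built": 500 * year_built,
    }

terms = price_contributions(size=120, rooms=3, location_index=2, year_built=2015)
price = sum(terms.values())

for name, value in terms.items():
    print(f"{name:>10}: {value:>10,}")
print(f"{'total':>10}: {price:>10,}")  # total: 1,286,500
```

The breakdown shows that the YearBuilt term (1,007,500) makes up most of the total, because the raw year is a large number; the coefficient on its own does not indicate how much a variable matters.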
2. Classification
Churn Prediction: Telecom companies or subscription-based services use Classification models (e.g., Random Forest, Logistic Regression) to detect which customers are likely to stop using the service, so that they can take marketing actions to retain them.
Input: Age, gender, usage history, complaints
Output: Churn (Yes/No)
\[ Pr(\text{Churn} = 1) \;=\; \frac{1}{1 + e^{-(\beta_0 + \beta_1 \cdot \text{Age} + \beta_2 \cdot \text{Gender} + \beta_3 \cdot \text{UsageHistory} + \beta_4 \cdot \text{Complaints})}} \]
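Where:
\(\beta_0\) is the intercept
\(\beta_1, \dots, \beta_4\) are the coefficients of Age, Gender, UsageHistory, and Complaints; each one shifts the log-odds of churn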
Example (illustrative coefficients):
\[ Pr(\text{Churn} = 1) = \frac{1}{1 + e^{-( -2.5 + 0.03 \cdot \text{Age} + 0.8 \cdot \text{Gender} + 1.2 \cdot \text{Complaints} - 0.05 \cdot \text{UsageHistory})}} \]
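To see how the logistic formula turns inputs into a probability, the sketch below evaluates it for one customer. The coefficients are the illustrative ones above. Everything else is an assumption made for this example: the customer's values, the encoding of Gender as 0/1, the units of UsageHistory, and the 0.5 cutoff used to convert the probability into a Yes/No label.

```python
import math

def churn_probability(age, gender, complaints, usage_history):
    # Linear score (log-odds) using the illustrative coefficients.
    z = -2.5 + 0.03 * age + 0.8 * gender + 1.2 * complaints - 0.05 * usage_history
    # The logistic function maps any real score into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical customer: age 40, gender coded as 1, 2 complaints, usage history 10.
p = churn_probability(age=40, gender=1, complaints=2, usage_history=10)

# Assumed decision rule: predict churn when the probability is at least 0.5.
label = "Yes" if p >= 0.5 else "No"
print(f"Pr(Churn = 1) = {p:.3f} -> Churn: {label}")
```

For this customer the score is \(-2.5 + 1.2 + 0.8 + 2.4 - 0.5 = 1.4\), which gives a probability of about 0.80, so the assumed rule labels the customer as likely to churn.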
If a transaction has Amount = 250, what will the model classify it as?
A1:
Amount >= 226.
A transaction has Amount = 180 and Location = Rural. What is the prediction?
A2:
A transaction has Amount = 120. What does the model predict?
A3:
If a transaction has Amount = 170, Location = Urban, what is the prediction?
A4:
If Amount = 90, what happens?
A5:
Advantages of Supervised Learning
High accuracy when training data is of good quality
Models can be easily adjusted to fit the problem
Disadvantages of Supervised Learning
Requires labeled data, which may be costly to collect
Model performance depends on data completeness
Orange provides a wide range of tools for tasks such as Classification, Regression, Clustering, PCA, and Text Mining.
Orange provides visualization tools that support interactive displays, such as Scatter Plot, Heatmap, Decision Tree, Network Graph.
These help users understand data and results more easily.