Hierarchical Clustering

Somsak Chanaim

International College of Digital Innovation, CMU

September 30, 2025

What is Hierarchical Clustering?

Hierarchical Clustering is a clustering technique used in Exploratory Data Analysis.
Its goal is to divide data into groups based on similarity.

Applications of Hierarchical Clustering in Business

Hierarchical Clustering can be applied in business to analyze and segment large datasets.

This helps companies better understand customer behavior, improve marketing strategies, and enhance overall operational efficiency.

1. Customer Segmentation

Objective: Segment customers based on behavior, interests, or other factors
to enable targeted marketing.

Real-world examples:

  • A retail company uses Hierarchical Clustering to divide customers into groups such as
    Loyal Customers, Occasional Buyers, and New Customers.
  • Banks apply this technique to segment customers by their risk levels in lending.

Benefits:

  • Design targeted advertising campaigns

  • Develop loyalty or reward programs for each customer group

  • Improve customer retention rates

2. Market Basket Analysis

Objective: Analyze which products are most frequently purchased together
to support sales strategy planning.

Real-world examples:

  • A supermarket applies Hierarchical Clustering to group items that are often bought together,
    e.g., customers who buy bread 🍞 often purchase peanut butter 🧈 as well.
  • Online stores use this information to recommend products through a Recommendation System.

Benefits:

  • Tailor promotions for specific customer groups

  • Optimize product placement within stores

  • Increase sales by suggesting related products

3. Product Categorization

Objective: Group similar products to support better inventory and product management.

Real-world examples:

  • Retail stores can use Hierarchical Clustering to categorize products into
    premium, regular, and budget items.
  • E-commerce platforms can organize products into categories, making it easier for users to search.

Benefits:

  • Improve menu structures in websites or applications

  • Plan inventory management more effectively

  • Develop suitable pricing strategies

4. Credit Risk Analysis

Objective: Segment customers based on their credit risk levels.

Real-world examples:

  • Financial institutions use Hierarchical Clustering to group customers by risk level,
    e.g., good repayment history 😊, medium risk 😐, and high risk 😨.
  • Credit card companies use this information to set appropriate credit limits for different customer groups.

Benefits:

  • Reduce the risk of lending
  • Adjust interest rates according to customer profiles
  • Prevent non-performing loans (NPL)

5. Employee Segmentation

Objective: Segment employees to design policies tailored to each group.

Real-world examples:

  • Companies can use Hierarchical Clustering to divide employees into
    🚀 High Performers 🌟, 🏢 General Staff 🙂, and 🌱 Employees needing further development 📚.
  • Human Resources (HR) departments can use this information to design training programs that fit the needs of each group.

Benefits:

  • Adjust bonus and benefits strategies
  • Develop career paths for different employee groups
  • Reduce employee turnover rates

6. Market Trend Analysis

Objective: Group market trends or customer segments to support the development of new products.

Real-world examples:

  • Technology companies can use Hierarchical Clustering to track customer trends,
    e.g., those who always adopt the latest smartphones vs. those who upgrade only when necessary.
  • Cosmetic companies can apply this technique to segment consumers,
    e.g., those who prefer organic products vs. those who prioritize budget-friendly items.

Benefits:

  • Help companies understand market trends and customer behavior
  • Enable the development of products that meet target customer needs
  • Increase competitiveness in the market

Interactive Hierarchical Clustering

How It Works

Hierarchical Clustering builds a nested, hierarchical structure of clusters,
summarized in a Dendrogram: a tree diagram that shows the relationships among the data points.

  1. Start by treating each data point as its own cluster (Singleton Cluster).

  2. Merge the closest clusters based on distance or similarity.

  3. Repeat until all data points are combined into a single cluster.
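As a rough sketch, the three steps above can be written directly in Python (single linkage with Euclidean distance; the five 2-D points are purely illustrative):

```python
import math

# Five illustrative 2-D points, each starting as its own cluster
points = {"A": (1, 1), "B": (1, 2), "C": (6, 6), "D": (8, 4), "E": (8, 7)}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cluster_dist(c1, c2):
    # Single linkage: distance between the closest pair of members
    return min(dist(points[a], points[b]) for a in c1 for b in c2)

clusters = [frozenset([k]) for k in points]           # step 1: singletons
merges = []
while len(clusters) > 1:                              # step 3: repeat
    # step 2: find the closest pair of clusters and merge them
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda ij: cluster_dist(clusters[ij[0]], clusters[ij[1]]),
    )
    merged = clusters[i] | clusters[j]
    merges.append(sorted(merged))
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

print(merges)  # merge history, from the first (closest) pair to the full dataset
```

Each entry in the merge history records one step of the dendrogram, ending when all points form a single cluster.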

Types of Hierarchical Clustering

Agglomerative Hierarchical Clustering (AHC)

  • Bottom-up approach

  • Start with each data point as its own cluster

  • Iteratively merge the closest clusters

  • Continue until only one cluster remains

Divisive Hierarchical Clustering (DHC)

  • Top-down approach

  • Start with all data points in a single cluster

  • Recursively split clusters into smaller ones

  • Continue until each data point is its own cluster

What is Linkage?

Linkage is a method for measuring the distance between clusters in Hierarchical Clustering,
which directly affects how data points are merged into clusters.

  • Single Linkage

  • Complete Linkage

  • Average Linkage

  • Ward’s Method

Types of Linkage Methods

1. Single Linkage (Nearest Neighbor)

  • Uses the minimum distance between two clusters

  • Suitable for data with chain-like or connected structures

  • May suffer from the “Chain Effect,” where clusters form long chains

2. Complete Linkage (Farthest Neighbor)

  • Uses the maximum distance between two clusters

  • Produces more compact clusters

  • Reduces the chance of the Chain Effect

3. Average Linkage (Unweighted Pair Group Method with Arithmetic Mean - UPGMA)

  • Uses the average distance between all points in the two clusters

  • A compromise between Single and Complete Linkage

  • Produces balanced results in terms of cluster size and distance

4. Weighted Linkage (WPGMA - Weighted Pair Group Method with Arithmetic Mean)

  • Similar to Average Linkage but gives weight to the size of the clusters being merged

5. Centroid Linkage (Unweighted Pair Group Method with Centroid - UPGMC)

  • Uses the distance between the centroids of two clusters

  • May cause issues with overlapping clusters

6. Ward’s Method

  • Merges, at each step, the pair of clusters that minimizes the increase in within-cluster variance
  • Often produces compact, well-balanced clusters
  • Widely used in business and data science applications

Choosing a Linkage Method

  • Single Linkage → Suitable when data tends to form connected or chain-like clusters

  • Complete Linkage → Use when compact clusters are desired

  • Average Linkage → Works well when clusters vary in size

  • Ward’s Method → Good for general use and minimizing within-cluster variance
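The differences can be seen by running each method on the same data, for example with scipy (a sketch; the points are illustrative):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

pts = np.array([[1, 1], [1, 2], [6, 6], [8, 4], [8, 7]], dtype=float)

for method in ["single", "complete", "average", "weighted", "centroid", "ward"]:
    Z = linkage(pts, method=method)    # Euclidean distance by default
    # The last row of Z is the final merge; column 2 holds its height
    print(f"{method:>9}: final merge height = {Z[-1, 2]:.2f}")
```

Single linkage yields the smallest final height (the nearest pair between the two groups), complete linkage the largest pairwise distance, while Ward's heights are variance-based rather than raw distances.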

Interactive Linkage

Example Workflow

1. Load the Data

Example dataset:

x  y  label
1  1  A
1  2  B
6  6  C
8  4  D
8  7  E
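The table above can be loaded as a small data frame, for example with pandas (a sketch; names are taken from the table):

```python
import pandas as pd

# The five labeled points from the example dataset
df = pd.DataFrame({"x": [1, 1, 6, 8, 8], "y": [1, 2, 6, 4, 7]},
                  index=["A", "B", "C", "D", "E"])
print(df)
```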

Scatter plot of the five data points (figure)

2. Rescale (Normalize or Standardize) if Necessary

Not required in this example.

3. Compute the Distance Matrix

Euclidean distances between the points:

      A     B     C     D     E
A  0.00  1.00  7.07  7.62  9.22
B  1.00  0.00  6.40  7.28  8.60
C  7.07  6.40  0.00  2.83  2.24
D  7.62  7.28  2.83  0.00  3.00
E  9.22  8.60  2.24  3.00  0.00
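A sketch of computing this matrix with scipy, assuming plain Euclidean distance (some tools rescale distances, so values may differ by a constant factor):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

pts = np.array([[1, 1], [1, 2], [6, 6], [8, 4], [8, 7]], dtype=float)

# pdist returns the condensed upper triangle; squareform expands it to a matrix
D = squareform(pdist(pts, metric="euclidean"))
print(np.round(D, 2))
```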

4. Perform Hierarchical Clustering

Many implementations use ‘complete’ linkage by default; other methods such as ‘single’ or ‘average’ can also be selected.

5. Plot the Dendrogram

A dendrogram is a tree-like diagram that shows the hierarchical structure of clusters.
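Steps 4 and 5 can be sketched with scipy (complete linkage assumed, matching the note above; with matplotlib installed, `dendrogram(Z, labels=labels)` draws the tree):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

pts = np.array([[1, 1], [1, 2], [6, 6], [8, 4], [8, 7]], dtype=float)
labels = ["A", "B", "C", "D", "E"]

Z = linkage(pts, method="complete")   # step 4: hierarchical clustering

# Step 5: inspect the dendrogram structure without drawing it
info = dendrogram(Z, labels=labels, no_plot=True)
print("leaf order:", info["ivl"])

# Cutting the tree into two clusters separates {A, B} from {C, D, E}
print("cluster labels:", fcluster(Z, t=2, criterion="maxclust"))
```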

Hierarchical Clustering Step by Step

Hierarchical Clustering with Orange

The workflow


Download dataset from Google Drive

Two-cluster dataset


From the workflow steps, double-click the Distance widget.

Choose Euclidean distance


Next, double-click the Hierarchical Clustering widget.

From the Linkage menu, you can choose among Single, Average, Weighted, Complete, and Ward.

You can select clusters in Hierarchical Clustering by adjusting the vertical cutoff line.

In this case, we know that this dataset has 2 groups (C1 and C2).

The result from Single linkage.

single linkage


The result from Average linkage.

average linkage


The result from Weighted linkage.

weighted linkage

The result from Complete linkage.

complete linkage


The result from Ward linkage.

ward linkage


You can see how each linkage method works in the slides above.

References

  • AJDA, “Hierarchical Clustering: A Simple Explanation”, Orange Data Mining Blog, https://orangedatamining.com/blog/hierarchical-clustering-a-simple-explanation/

  • Karypis, G., Han, E. H., & Kumar, V. (1999). Chameleon: Hierarchical clustering using dynamic modeling. Computer, 32(8), 68–75.

  • Murtagh, F., & Contreras, P. (2012). Algorithms for hierarchical clustering: An overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(1), 86–97.

  • Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Prentice-Hall.