Visualizing Data in R with ggplot2:
Bar Plot

Somsak Chanaim

International College of Digital Innovation, CMU

November 6, 2025

What is a bar plot?

A bar plot (or bar chart) is a type of graph that represents categorical data with rectangular bars.

Each bar’s length or height is proportional to the value it represents.

Bar plots are commonly used to compare the sizes of different categories, making it easy to visualize and interpret differences between groups.

Key Features of a Bar Plot:

  • Categories on One Axis: Typically, the categories are plotted along the x-axis (horizontal axis) in a vertical bar plot. In a horizontal bar plot, the categories are plotted along the y-axis.

  • Bar Length/Height: The length or height of each bar corresponds to the value of the category it represents. The higher or longer the bar, the greater the value.

  • Spacing Between Bars: Bars are usually separated by spaces to distinguish between different categories.

  • Customizable: Bar plots can be customized in various ways, including the color of the bars, the orientation (vertical or horizontal), whether the bars are stacked or grouped, and the use of additional elements like labels, legends, and grid lines.

Types of Bar Plots:

  1. Vertical Bar Plot: Bars are vertical, with categories on the x-axis and values on the y-axis.

  2. Horizontal Bar Plot: Bars are horizontal, with categories on the y-axis and values on the x-axis.

  3. Stacked Bar Plot: Bars are stacked on top of each other to show sub-group values within the same category.

  4. Grouped Bar Plot: Bars for different sub-groups are placed next to each other, making it easier to compare sub-group values within each category.

Uses of Bar Plots:

  • Comparing categories: Bar plots are ideal for comparing the size of different categories within a dataset.

  • Visualizing distributions: They can show the distribution of a categorical variable.

  • Highlighting trends: Bar plots can help highlight trends or patterns in categorical data over time or across different groups.

The geom_bar() function in ggplot2

This function can generate bar plots in two primary ways:

  • By counting the occurrences of each category in a dataset (the default behavior).

  • By plotting pre-summarized data where the height of the bars represents the values provided.

Explanation:

  • aes(x = class): Maps the class variable to the x-axis.

  • geom_bar(): Automatically counts the number of occurrences for each class.
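The default counting behavior described above can be sketched as follows. This is a minimal example that assumes the mpg dataset bundled with ggplot2 and R 4.1 or later for the native pipe; the object name p is only for illustration.

```r
library(ggplot2)

# geom_bar() uses stat = "count" by default, so only x needs to be mapped.
p <- mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar()

p
```

Each bar's height is the number of rows in mpg that belong to that class.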

Bar Plot with Pre-Summarized Data

If you already have summarized data (e.g., counts), you can use geom_bar(stat = "identity"), or equivalently geom_col().

Explanation

  • aes(x = group, y = count): Maps group to the x-axis and count to the y-axis.

  • geom_bar(stat = "identity"): Uses the provided count values directly.
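A minimal sketch of the pre-summarized case. The data frame df and its values are made up for illustration; only the column names group and count come from the explanation above.

```r
library(ggplot2)

# Hypothetical pre-summarized data: one row per group, counts already computed.
df <- data.frame(
  group = c("A", "B", "C"),
  count = c(10, 25, 15)
)

p <- df |>
  ggplot() +
  aes(x = group, y = count) +
  geom_bar(stat = "identity")  # bar heights are taken directly from count

p
```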

Stacked Bar Plot

A stacked bar plot shows the distribution of a second categorical variable within each bar.

Explanation

  • aes(fill = drv): Fills the bars based on the drv (drive type) variable.
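A stacked bar plot along these lines might look like the sketch below, which assumes the mpg dataset and relies on the default position ("stack") of geom_bar().

```r
library(ggplot2)

# Mapping drv to fill splits each class bar into one segment per drive type.
# No position argument is needed: geom_bar() stacks by default.
p <- mpg |>
  ggplot() +
  aes(x = class, fill = drv) +
  geom_bar()

p
```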

Grouped Bar Plot

To create grouped bars instead of stacked bars, use position = "dodge" or position = "dodge2".

Explanation:

  • position = "dodge": Places bars for each group side by side instead of stacking them.

  • position = "dodge2": Similar to "dodge", but adds a small amount of padding between the bars within each group.
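The two positions can be compared with a short sketch, again assuming the mpg dataset. The object names p_dodge and p_dodge2 are only for illustration.

```r
library(ggplot2)

# Side-by-side bars for each drive type within a class.
p_dodge <- mpg |>
  ggplot() +
  aes(x = class, fill = drv) +
  geom_bar(position = "dodge")

# Same data, with padding between the bars inside each class.
p_dodge2 <- mpg |>
  ggplot() +
  aes(x = class, fill = drv) +
  geom_bar(position = "dodge2")

p_dodge
p_dodge2
```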

Bar Plot with Custom Colors

You can customize the colors of the bars using scale_fill_manual() or other color scales.

Remark: In the named vector passed to scale_fill_manual(values = ...), the left-hand side of each pair is a level of the variable class, and the right-hand side is the color name assigned to it.
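A sketch of manual colors, assuming the mpg dataset. The specific color names are arbitrary choices for illustration.

```r
library(ggplot2)

# Named vector: class level (left) = color name (right).
class_colors <- c(
  "2seater"    = "brown",
  "compact"    = "steelblue",
  "midsize"    = "darkgreen",
  "minivan"    = "purple",
  "pickup"     = "orange",
  "subcompact" = "pink",
  "suv"        = "gray40"
)

p <- mpg |>
  ggplot() +
  aes(x = class, fill = class) +  # fill must be mapped for the scale to apply
  geom_bar() +
  scale_fill_manual(values = class_colors)

p
```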

Horizontal Bar Plot

To flip the axes and create a horizontal bar plot, use coord_flip().

Explanation:

  • coord_flip(): Swaps the x and y axes to create a horizontal bar plot.
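A minimal horizontal bar plot, assuming the mpg dataset:

```r
library(ggplot2)

# class is still the x aesthetic; coord_flip() only changes how it is drawn,
# so the categories end up on the vertical axis.
p <- mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar() +
  coord_flip()

p
```

Because axis labels follow the aesthetics rather than the screen position, labs(x = ...) labels the class axis even after flipping.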

Bar Plot with Facets

You can use facets to create multiple bar plots for different subsets of data.

Explanation:

  • facet_grid(.~ drv): Creates a separate plot for each level of drv.
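A faceted version can be sketched like this, assuming the mpg dataset:

```r
library(ggplot2)

# One panel per drive type, arranged in columns (drv is on the right of ~).
p <- mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar() +
  facet_grid(. ~ drv)

p
```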

Sorting the Bar Plot

To sort the bars, you can reorder the factor levels of the variable mapped to the x-axis. This can be done using the reorder() function within aes().

Explanation:

  • reorder(class, -table(class)[class]): Reorders the levels of class based on the frequency of each level in descending order.

  • The - sign before table(class)[class] sorts the bars in descending order.
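The reordering described above can be sketched as follows, assuming the mpg dataset. The labs() call is an addition so that the axis title does not show the full reorder() expression.

```r
library(ggplot2)

# table(class)[class] looks up, for every row, how often its class occurs.
# reorder() then sorts the levels by the (negated) frequency.
p <- mpg |>
  ggplot() +
  aes(x = reorder(class, -table(class)[class])) +
  geom_bar() +
  labs(x = "class")

p
```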

Sorting Bars by a Continuous Variable

If the y-axis of a bar plot represents a continuous variable, you can sort the bars based on that variable.

Explanation:

  • reorder(group, -count): Reorders the group variable based on count in descending order.

  • stat = "identity": Tells ggplot2 to use the provided value directly.
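A small sketch of sorting by a value column. The data frame df and its numbers are made up for illustration.

```r
library(ggplot2)

# Hypothetical pre-summarized data.
df <- data.frame(
  group = c("A", "B", "C", "D"),
  count = c(10, 25, 15, 30)
)

p <- df |>
  ggplot() +
  aes(x = reorder(group, -count), y = count) +  # largest count first
  geom_bar(stat = "identity") +
  labs(x = "group")

p
```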

Exercises

Exercise 1: Basic Bar Plot

  • Task: Create a bar plot using the mpg dataset to show the count of cars in each class.

  • Hint: Use geom_bar() with aes(x = class).

Solution

mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar() +
  labs(
    title = "Count of Cars by Class",
    x = "Class",
    y = "Count"
  )

Exercise 2: Bar Plot with Custom Colors

  • Task: Modify the bar plot from Exercise 1 to fill the bars with a custom color, such as steelblue.

  • Hint: Use fill = "steelblue" inside geom_bar().

Solution

mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar(fill = "steelblue") +
  labs(
    title = "Count of Cars by Class",
    x = "Class",
    y = "Count"
  )

Exercise 3: Grouped Bar Plot

  • Task: Create a grouped bar plot showing the count of cars by class and drv (drive type).

  • Hint: Map drv to the fill aesthetic and set position = "dodge" in geom_bar().

Solution

mpg |>
  ggplot() +
  aes(x = class, fill = drv) +
  geom_bar(position = "dodge") +
  labs(
    title = "Count of Cars by Class and Drive Type",
    x = "Class",
    y = "Count",
    fill = "Drive Type"
  )

Exercise 4: Stacked Bar Plot

  • Task: Create a stacked bar plot showing the count of cars by class and drv.

  • Hint: Map drv to the fill aesthetic. The default position is "stack".

Solution

mpg |>
  ggplot() +
  aes(x = class, fill = drv) +
  geom_bar() +
  labs(
    title = "Stacked Bar Plot of Cars by Class and Drive Type",
    x = "Class",
    y = "Count",
    fill = "Drive Type"
  )

Exercise 5: Horizontal Bar Plot

  • Task: Create a horizontal bar plot of car counts by class.

  • Hint: Use coord_flip() to flip the axes.

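Solution

mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar(fill = "lightgreen") +
  # labs() names aesthetics, not screen positions:
  # x is still class after coord_flip(), and y is still the count.
  labs(
    title = "Horizontal Bar Plot of Cars by Class",
    x = "Class",
    y = "Count"
  ) +
  coord_flip()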

Exercise 6: Bar Plot Sorted by Count

  • Task: Create a bar plot of car counts by class, sorting the bars in descending order by count.

  • Hint: Use reorder(class, -table(class)[class]) within aes(x = ...).

Solution

mpg |>
  ggplot() +
  aes(x = reorder(class, -table(class)[class])) +
  geom_bar() +
  labs(
    title = "Sorted Count of Cars by Class",
    x = "Class",
    y = "Count"
  )

Exercise 7: Bar Plot with Custom Labels

  • Task: Add custom labels to the x and y axes of the bar plot, and give the plot a descriptive title.

  • Hint: Use labs(title = ..., x = ..., y = ...) to customize labels.

Solution

mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar(fill = "orange") +
  labs(
    title = "Car Count by Class",
    x = "Car Class",
    y = "Total Count"
  )

Exercise 8: Bar Plot with Facets

  • Task: Create a faceted bar plot showing the count of cars by class, with separate panels for each drv value.

  • Hint: Use facet_grid(.~ drv) to create facets.

Solution

ggplot(mpg, aes(x = class)) +
  geom_bar(fill = "skyblue") +
  facet_grid(.~ drv) +
  labs(title = "Car Count by Class, Faceted by Drive Type", 
           x = "Class", 
           y = "Count")

Exercise 9: Bar Plot with Custom Fill Colors

  • Task: Create a bar plot showing the count of cars by class, using a custom color palette for the bars.

  • Hint: Use scale_fill_manual(values = c("red", "blue", "green", ...)) to apply custom colors.

Solution

mpg |>
  ggplot() +
  aes(x = class, fill = class) +
  geom_bar() +
  scale_fill_manual(values = c(
    "compact"    = "red",
    "midsize"    = "blue",
    "suv"        = "green",
    "minivan"    = "purple",
    "pickup"     = "orange",
    "subcompact" = "pink",
    "2seater"    = "brown"
  )) +
  labs(
    title = "Custom Color Bar Plot by Car Class",
    x = "Class",
    y = "Count",
    fill = "Class"
  )

The gglikert() Function in the ggstats Package

Overview

The gglikert() function from the ggstats package provides a convenient way to visualize Likert-scale survey data using the grammar of graphics.

It produces clean, publication-ready plots that show the distribution of responses across multiple items.

install.packages("ggstats")

What is a Likert Scale?

A Likert scale is commonly used to measure attitudes, perceptions, and agreement levels.

Examples:

  • Strongly disagree
  • Disagree
  • Neutral
  • Agree
  • Strongly agree

gglikert() is designed specifically for this type of ordered categorical data.

Key Features of gglikert()

  • Automatically detects ordered factors

  • Displays stacked bar charts for each item

  • Provides percentage labels

  • Supports customization through ggplot2 themes

  • Works well in tidyverse pipelines

Example

Quick plot
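A quick plot might look like the sketch below. The survey data is simulated purely for illustration (the column names q1 to q3, the sample size, and the probabilities are assumptions), and the ggstats package is assumed to be installed.

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"
)

# Simulated answers to three hypothetical questions, stored as factors
# whose levels follow the order of the scale.
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# With no further arguments, every column is treated as one question.
p <- gglikert(df)
p
```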

Customizing the plot
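Since gglikert() returns a regular ggplot2 object, further layers can be added with +. The sketch below uses simulated data (an assumption for illustration), and the title and the palette are arbitrary choices.

```r
library(ggplot2)
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"
)

# Simulated answers to three hypothetical questions.
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# Add a title and swap the fill palette; ggplot2 prints a message
# because the new fill scale replaces the one set by gglikert().
p <- gglikert(df) +
  ggtitle("Simulated survey results") +
  scale_fill_brewer(palette = "RdYlBu")

p
```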

Sorting the questions

We can sort the questions with the sort argument ("ascending" or "descending").

When sorting is enabled, the questions are ordered by default according to the proportion of answers above the center level, i.e. in this case the proportion of answers equal to “Agree” or “Strongly agree”. Alternatively, the answers can be converted to a numeric score and the questions sorted according to their mean (sort_method = "mean").
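Both sorting variants can be sketched as follows, using simulated data (an assumption for illustration). The object names p_prop and p_mean are hypothetical.

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"
)

# Simulated answers to three hypothetical questions.
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# Sort by the proportion of answers above the center level.
p_prop <- gglikert(df, sort = "ascending")

# Sort by the mean of the answers converted to a numeric score.
p_mean <- gglikert(df, sort = "ascending", sort_method = "mean")

p_prop
p_mean
```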

Sorting the answers

We can reverse the order of the answers with reverse_likert = TRUE.
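A short sketch with simulated data (an assumption for illustration):

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"
)

# Simulated answers to three hypothetical questions.
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# Flip the left-to-right order in which the answer categories are drawn.
p <- gglikert(df, reverse_likert = TRUE)
p
```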

Proportion labels

Proportion labels can be removed with add_labels = FALSE, or customized.
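Both options are sketched below with simulated data (an assumption for illustration). The labels_* arguments shown are those available in recent ggstats releases; the specific values are arbitrary.

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"
)

# Simulated answers to three hypothetical questions.
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# No proportion labels on the bars.
p_plain <- gglikert(df, add_labels = FALSE)

# Smaller labels with one decimal place, hidden on segments below 20%.
p_custom <- gglikert(
  df,
  labels_size = 3,
  labels_accuracy = 0.1,
  labels_hide_below = 0.2
)

p_plain
p_custom
```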

Custom center

By default, Likert plots will be centered, i.e. displaying the same number of categories on each side of the graph. When the number of categories is odd, half of the “central” category is displayed negatively and half positively.

It is possible to control where to center the graph using the cutoff argument, which represents the number of categories to be displayed negatively: 2 to display the first two categories negatively and the others positively; 2.25 to display the first two categories and a quarter of the third negatively.
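The two cutoff values mentioned above can be tried on simulated data (an assumption for illustration). This assumes a ggstats version that provides the cutoff argument.

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"
)

# Simulated answers to three hypothetical questions.
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# First two categories on the negative side, the remaining three positive.
p_two <- gglikert(df, cutoff = 2)

# First two categories plus a quarter of "Neutral" on the negative side.
p_quarter <- gglikert(df, cutoff = 2.25)

p_two
p_quarter
```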

Reference

All code and content in this section are from the ggstats package.