International College of Digital Innovation, CMU
November 6, 2025
A bar plot (or bar chart) is a type of graph that represents categorical data with rectangular bars.
Each bar’s length or height is proportional to the value it represents.
Bar plots are commonly used to compare the sizes of different categories, making it easy to visualize and interpret differences between groups.
Key Features of a Bar Plot:
Categories on One Axis: Typically, the categories are plotted along the x-axis (horizontal axis) in a vertical bar plot. In a horizontal bar plot, the categories are plotted along the y-axis.
Bar Length/Height: The length or height of each bar corresponds to the value of the category it represents. The higher or longer the bar, the greater the value.
Spacing Between Bars: Bars are usually separated by spaces to distinguish between different categories.
Customizable: Bar plots can be customized in various ways, including the color of the bars, the orientation (vertical or horizontal), whether the bars are stacked or grouped, and the use of additional elements like labels, legends, and grid lines.
Types of Bar Plots:
Vertical Bar Plot: Bars are vertical, with categories on the x-axis and values on the y-axis.
Horizontal Bar Plot: Bars are horizontal, with categories on the y-axis and values on the x-axis.
Stacked Bar Plot: Bars are stacked on top of each other to show sub-group values within the same category.
Grouped Bar Plot: Bars for different sub-groups are placed next to each other, making it easier to compare sub-group values within each category.
Uses of Bar Plots:
Comparing categories: Bar plots are ideal for comparing the size of different categories within a dataset.
Visualizing distributions: They can show the distribution of a categorical variable.
Highlighting trends: Bar plots can help highlight trends or patterns in categorical data over time or across different groups.
The geom_bar() function in ggplot2 can generate bar plots in two primary ways:
By counting the occurrences of each category in a dataset (the default behavior).
By plotting pre-summarized data where the height of the bars represents the values provided.
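A minimal sketch of the default counting behavior, assuming the mpg dataset that ships with ggplot2 (the object name p is only illustrative):

```r
library(ggplot2)

# geom_bar() counts the rows in each class and draws one bar per class
p <- mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar()

p
```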
Explanation:
aes(x = class): Maps the class variable to the x-axis.
geom_bar(): Automatically counts the number of occurrences for each class.
If you already have summarized data (e.g., counts), you can use geom_bar(stat = "identity").
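A short sketch of this case. The data frame df and its values are hypothetical and serve only to illustrate pre-summarized input:

```r
library(ggplot2)

# Hypothetical pre-summarized data: one row per group with its count
df <- data.frame(
  group = c("A", "B", "C"),
  count = c(10, 25, 15)
)

# stat = "identity" draws each bar at the height given in `count`
p <- df |>
  ggplot() +
  aes(x = group, y = count) +
  geom_bar(stat = "identity")

p
```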
Explanation
aes(x = group, y = count): Maps group to the x-axis and count to the y-axis.
geom_bar(stat = "identity"): Uses the provided count values directly.
A stacked bar plot shows the distribution of a second categorical variable within each bar.
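One possible sketch of a stacked bar plot, again assuming the mpg dataset from ggplot2:

```r
library(ggplot2)

# Mapping drv to fill splits each class bar into stacked segments,
# because the default position of geom_bar() is "stack"
p <- mpg |>
  ggplot() +
  aes(x = class, fill = drv) +
  geom_bar()

p
```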
Explanation
aes(fill = drv): Fills the bars based on the drv (drive type) variable.
To create grouped bars instead of stacked bars, use position = "dodge" or position = "dodge2".
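A minimal sketch of grouped bars, assuming the same mpg data. Swapping "dodge" for "dodge2" gives the second variant:

```r
library(ggplot2)

# position = "dodge" places the drv segments side by side within each class
p <- mpg |>
  ggplot() +
  aes(x = class, fill = drv) +
  geom_bar(position = "dodge")  # or position = "dodge2"

p
```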
Explanation:
position = "dodge": Places bars for each group side by side instead of stacking them.
position = "dodge2": Similar to “dodge”, but with slightly more spacing between the bars.
You can customize the colors of the bars using scale_fill_manual() or other color scales.
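A sketch of manual colors, assuming the mpg dataset. The particular color choices are arbitrary and only illustrative:

```r
library(ggplot2)

# Named vector: each class level (left) is paired with a color name (right)
class_colors <- c(
  "2seater"    = "brown",
  "compact"    = "red",
  "midsize"    = "blue",
  "minivan"    = "purple",
  "pickup"     = "orange",
  "subcompact" = "pink",
  "suv"        = "green"
)

p <- mpg |>
  ggplot() +
  aes(x = class, fill = class) +
  geom_bar() +
  scale_fill_manual(values = class_colors)

p
```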
Remark: In the named vector passed to values, the left-hand side of each pair is a level of the variable class, and the right-hand side is the color name assigned to it.
To flip the axes and create a horizontal bar plot, use coord_flip().
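A minimal sketch, assuming the mpg dataset:

```r
library(ggplot2)

# Build a vertical bar plot, then swap the axes to make it horizontal
p <- mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar() +
  coord_flip()

p
```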
Explanation:
coord_flip(): Swaps the x and y axes to create a horizontal bar plot.
You can use facets to create multiple bar plots for different subsets of data.
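A short sketch of a faceted bar plot, assuming the mpg dataset:

```r
library(ggplot2)

# One panel per drive type, arranged in columns
p <- mpg |>
  ggplot() +
  aes(x = class) +
  geom_bar() +
  facet_grid(. ~ drv)

p
```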
Explanation:
facet_grid(.~ drv): Creates a separate panel for each level of drv.
To control the order of the bars, you can reorder the factor levels of the variable mapped to the x-axis. This can be done using the reorder() function within aes().
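A sketch of sorting bars by frequency, assuming the mpg dataset. The labs(x = "class") call is an addition here and serves only to keep the axis label readable:

```r
library(ggplot2)

# table(class)[class] looks up the count of each row's class;
# negating it makes reorder() put the most frequent class first
p <- mpg |>
  ggplot() +
  aes(x = reorder(class, -table(class)[class])) +
  geom_bar() +
  labs(x = "class")

p
```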
Explanation:
reorder(class, -table(class)[class]): Reorders the levels of class based on the frequency of each level in descending order.
The - sign before table(class)[class] sorts the bars in descending order.
If the y-axis of a bar plot represents a value that is already present in the data, you can sort the bars by that variable.
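A minimal sketch of this case. The data frame df and its values are hypothetical:

```r
library(ggplot2)

# Hypothetical pre-summarized data
df <- data.frame(
  group = c("A", "B", "C"),
  count = c(10, 25, 15)
)

# reorder(group, -count) orders the groups from largest to smallest count
p <- df |>
  ggplot() +
  aes(x = reorder(group, -count), y = count) +
  geom_bar(stat = "identity") +
  labs(x = "group")

p
```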
Explanation:
reorder(group, -count): Reorders the group variable based on count in descending order.
stat = "identity": Tells ggplot2 to use the provided value directly.
Exercise 1
Task: Create a bar plot using the mpg dataset to show the count of cars in each class.
Hint: Use geom_bar() with aes(x = class).
Exercise 2
Task: Modify the bar plot from Exercise 1 to fill the bars with a custom color, such as steelblue.
Hint: Use fill = "steelblue" inside geom_bar().
Exercise 3
Task: Create a grouped bar plot showing the count of cars by class and drv (drive type).
Hint: Map drv to the fill aesthetic and set position = "dodge" in geom_bar().
Exercise 4
Task: Create a stacked bar plot showing the count of cars by class and drv.
Hint: Map drv to the fill aesthetic. The default position is "stack".
Exercise 5
Task: Create a horizontal bar plot of car counts by class.
Hint: Use coord_flip() to flip the axes.
Exercise 6
Task: Create a bar plot of car counts by class, sorting the bars in descending order by count.
Hint: Use reorder(class, -table(class)[class]) within aes(x = ...).
Exercise 7
Task: Add custom labels to the x and y axes of the bar plot, and give the plot a descriptive title.
Hint: Use labs(title = ..., x = ..., y = ...) to customize labels.
Exercise 8
Task: Create a faceted bar plot showing the count of cars by class, with separate panels for each drv value.
Hint: Use facet_grid(.~ drv) to create facets.
Exercise 9
Task: Create a bar plot showing the count of cars by class, using a custom color palette for the bars.
Hint: Use scale_fill_manual(values = c("red", "blue", "green", ...)) to apply custom colors.
Solution:

    library(ggplot2)

    # Bar plot of car counts by class, with one manually chosen color per class
    mpg |>
      ggplot() +
      aes(x = class, fill = class) +
      geom_bar() +
      scale_fill_manual(values = c("compact" = "red",
                                   "midsize" = "blue",
                                   "suv" = "green",
                                   "minivan" = "purple",
                                   "pickup" = "orange",
                                   "subcompact" = "pink",
                                   "2seater" = "brown")) +
      labs(title = "Custom Color Bar Plot by Car Class",
           x = "Class",
           y = "Count",
           fill = "Class")

The gglikert() function from the ggstats package provides a convenient way to visualize Likert-scale survey data using the grammar of graphics.
It produces clean, publication-ready plots that show the distribution of responses across multiple items.
A Likert scale is commonly used to measure attitudes, perceptions, and agreement levels.
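Example: a five-point agreement scale with the levels “Strongly disagree”, “Disagree”, “Neither agree nor disagree”, “Agree”, and “Strongly agree”.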
gglikert() is designed specifically for this type of ordered categorical data.
Automatically detects ordered factors
Displays stacked bar charts for each item
Provides percentage labels
Supports customization through ggplot2 themes
Works well in tidyverse pipelines
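A minimal sketch of a basic call, assuming the ggstats package is installed. The data frame df and its items q1 to q3 are simulated here purely for illustration:

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neither agree nor disagree",
  "Agree", "Strongly agree"
)

# Simulated answers: every item is a factor sharing the same ordered levels
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# One stacked, centered bar per item
p <- gglikert(df)
p
```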
We can sort the plot with the sort argument.
By default, the plot is sorted based on the proportion of answers above the center level, i.e. in this case the proportion of answers equal to “Agree” or “Strongly Agree”. Alternatively, the questions can be transformed into a score and sorted according to their mean (sort_method = "mean").
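A sketch of both sorting methods, assuming ggstats is installed. The data frame df is simulated for illustration only:

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neither agree nor disagree",
  "Agree", "Strongly agree"
)

# Simulated answers for three hypothetical items
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 5:1), levels = likert_levels),
  q3 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# Sort by the proportion of answers above the center level (default method)
p_prop <- gglikert(df, sort = "ascending")

# Sort by the mean score of each item instead
p_mean <- gglikert(df, sort = "ascending", sort_method = "mean")

p_prop
p_mean
```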
We can reverse the order of the answers with reverse_likert = TRUE.
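A short sketch, assuming ggstats is installed and using simulated data (df is not from the original):

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neither agree nor disagree",
  "Agree", "Strongly agree"
)

# Simulated answers for two hypothetical items
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# Reverse the order in which the answer categories are displayed
p <- gglikert(df, reverse_likert = TRUE)
p
```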
Proportion labels can be removed with add_labels = FALSE, or customized.
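A sketch of both options, assuming ggstats is installed. The data are simulated, and the specific label settings are arbitrary illustrative values:

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neither agree nor disagree",
  "Agree", "Strongly agree"
)

# Simulated answers for two hypothetical items
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# Remove the proportion labels entirely
p_plain <- gglikert(df, add_labels = FALSE)

# Or keep them and adjust size, precision, visibility threshold, and color
p_custom <- gglikert(
  df,
  labels_size = 3,
  labels_accuracy = 0.1,
  labels_hide_below = 0.2,
  labels_color = "white"
)

p_plain
p_custom
```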
By default, Likert plots are centered, i.e. they display the same number of categories on each side of the graph. When the number of categories is odd, half of the “central” category is displayed negatively and half positively.
The cutoff argument controls where the graph is centered. It gives the number of categories to be displayed negatively: 2 displays the first two categories negatively and the others positively; 2.25 displays the first two categories and a quarter of the third negatively.
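A sketch of the two cutoff values mentioned above, assuming ggstats is installed and using simulated data:

```r
library(ggstats)

likert_levels <- c(
  "Strongly disagree", "Disagree", "Neither agree nor disagree",
  "Agree", "Strongly agree"
)

# Simulated answers for two hypothetical items
set.seed(42)
df <- data.frame(
  q1 = factor(sample(likert_levels, 150, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 150, replace = TRUE, prob = 1:5), levels = likert_levels)
)

# First two categories on the negative side, the other three on the positive side
p_two <- gglikert(df, cutoff = 2)

# First two categories plus a quarter of the third on the negative side
p_quarter <- gglikert(df, cutoff = 2.25)

p_two
p_quarter
```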
All code and content in this section are from the ggstats package.