International College of Digital Innovation, CMU
October 30, 2024
A bar plot is a type of chart or graph that uses bars to represent data values.
It is essentially a visual display of information where individual bars or columns correspond to different categories or groups, and the length or height of each bar reflects the magnitude of the data it represents.
Bar plots are commonly used for visualizing categorical data and making comparisons between different groups.
The bar plot components:
Axes:
The horizontal axis typically represents categories or groups.
The vertical axis represents the values, frequencies, or percentages associated with each category.
Bars:
Each bar is a rectangular column that extends from the baseline (zero) to a height corresponding to the value it represents.
The width of the bars may vary, but they are usually uniform.
Spacing:
Bars are usually separated by gaps, which keeps the categories visually distinct.
Bar plots are useful for showing the distribution of data, identifying patterns, and making comparisons between different categories.
In R, you can create a bar plot using the barplot() function.
In this example, I’ll create a basic bar plot with random data:
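A minimal sketch of such a plot, assuming five hypothetical categories named A to E and values drawn with sample(); the seed, value range, color, and label text are illustrative choices:

```r
# Reproducible "random" data: the seed and value range are illustrative assumptions
set.seed(123)
categories <- c("A", "B", "C", "D", "E")             # hypothetical category names
values <- sample(10:100, size = length(categories))  # one random value per category

# Basic bar plot: one bar per category, bar height taken from values
barplot(height = values,
        names.arg = categories,
        col = "steelblue",
        main = "Basic Bar Plot",
        xlab = "Categories",
        ylab = "Values")
```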
Vectors:
values is a vector containing the corresponding values for each category.
categories is a vector containing the names of the categories.
Arguments:
height is the data you want to visualize (numeric variable).
names.arg is used to specify the names of the categories (character/factor variable).
col sets the color of the bars (you can choose any color).
main sets the title of the plot.
ylab and xlab set the labels for the y-axis and x-axis, respectively.
The table() function in R is used to create a contingency table, which shows the frequency distribution of a variable or the cross-tabulation of multiple variables.
It counts the number of occurrences of each unique value in a vector, factor, or set of factors.
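Both uses of table() can be sketched with the built-in mtcars dataset; the choice of the cyl, am, and gear columns here is an assumption made for illustration:

```r
# Frequency of a single variable: number of cars for each cylinder count
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# Cross-tabulation of two variables:
# transmission type (rows; 0 = automatic, 1 = manual) by number of gears (columns)
am_by_gear <- table(mtcars$am, mtcars$gear)
print(am_by_gear)
```

A one-way table like cyl_counts can be passed straight to barplot() as the bar heights, and a two-way table like am_by_gear produces one bar per column.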
Rotate the plot into a horizontal bar plot by setting horiz = TRUE.
Colors: "#ffff00" (yellow), "#0000ff" (blue)
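A minimal sketch of a horizontal plot using the two colors listed above. The data are an assumption: a table of transmission type (am) by number of gears (gear) from mtcars, chosen because it has two rows, one per color.

```r
# Two-way table: transmission type (rows) by number of gears (columns)
am_gear <- table(mtcars$am, mtcars$gear)

# horiz = TRUE draws the bars horizontally, so the frequency axis is now the x-axis
barplot(am_gear,
        horiz = TRUE,
        col = c("#ffff00", "#0000ff"),
        main = "Transmission Type by Number of Gears",
        xlab = "Frequency",
        ylab = "Number of Gears",
        legend.text = c("Automatic", "Manual"))  # am: 0 = automatic, 1 = manual
```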
Use t() to transpose the table, so that its rows and columns swap roles, and create another bar plot.
The argument beside = TRUE is used to create grouped bar plots, where bars corresponding to different categories are placed side by side, rather than stacked on top of each other.
Colors: "#ff0000" (red), "#00ff00" (green), "#0000ff" (blue)
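A minimal sketch that combines t() with beside = TRUE and the three colors listed above. The am-by-gear table is an assumption made for illustration; after transposing, it has three rows (one per gear count), so each group contains three bars.

```r
# Two-way table: transmission type (rows) by number of gears (columns)
am_gear <- table(mtcars$am, mtcars$gear)

# Transpose: gears become the rows (bars within a group),
# transmission types become the columns (the groups)
gear_am <- t(am_gear)

# beside = TRUE places the bars of each group side by side instead of stacking them
barplot(gear_am,
        beside = TRUE,
        col = c("#ff0000", "#00ff00", "#0000ff"),
        names.arg = c("Automatic", "Manual"),      # am: 0 = automatic, 1 = manual
        main = "Number of Gears by Transmission Type",
        xlab = "Transmission",
        ylab = "Frequency",
        legend.text = paste(rownames(gear_am), "gears"),
        args.legend = list(x = "topright"))
```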
Task:
Create a basic bar plot to visualize the frequency of different gear types in the mtcars dataset.
Instructions:
Load the mtcars dataset using data(mtcars).
Use the table() function to calculate the frequency of the gear variable.
Use the barplot() function to create a bar plot.
Add a title and labels to the axes using the main, xlab, and ylab arguments.
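One possible solution, sketched by following the instructions step by step; the title and axis label text are illustrative choices:

```r
# Load the built-in dataset
data(mtcars)

# Count how many cars have each number of gears
gear_freq <- table(mtcars$gear)

# Basic bar plot with a title and axis labels
barplot(gear_freq,
        main = "Frequency of Gear Types",
        xlab = "Number of Gears",
        ylab = "Frequency")
```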
Task:
Create a horizontal bar plot of the frequency of cylinder types in the mtcars dataset.
Instructions:
Use the table() function to calculate the frequency of the cyl variable.
Use the barplot() function with the horiz = TRUE argument to create a horizontal bar plot.
Customize the colors of the bars using the col argument.
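One possible solution, sketched by following the instructions; the colors and label text are illustrative choices:

```r
# Count how many cars have each number of cylinders
cyl_freq <- table(mtcars$cyl)

# Horizontal bar plot: with horiz = TRUE the frequencies run along the x-axis
barplot(cyl_freq,
        horiz = TRUE,
        col = c("red", "green", "blue"),
        main = "Frequency of Cylinder Types",
        xlab = "Frequency",
        ylab = "Number of Cylinders")
```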
Task:
Create a grouped bar plot to compare the frequency of cylinder types (cyl) by the number of gears (gear) in the mtcars dataset.
Instructions:
Use the table() function to create a contingency table of cyl and gear.
Use the barplot() function to create a grouped bar plot by setting beside = TRUE.
Add a legend to the plot using the legend.text and args.legend arguments.
Solution:
# Create a contingency table of cyl and gear
cyl_gear_table <- table(mtcars$cyl, mtcars$gear)
# Create a grouped bar plot
barplot(cyl_gear_table,
        beside = TRUE,
        main = "Cylinder Types by Gear Types",
        xlab = "Number of Gears",
        ylab = "Frequency",
        col = c("red", "green", "blue"),
        legend.text = rownames(cyl_gear_table),
        args.legend = list(title = "Cylinders", x = "topright"))
Task:
Create a stacked bar plot to show the distribution of cylinder types (cyl) within different gear types (gear) in the mtcars dataset.
Instructions:
Use the table() function to create a contingency table of cyl and gear.
Use the barplot() function to create a stacked bar plot (default behavior).
Customize the colors of the bars using a color palette.
Solution:
# Create a contingency table of cyl and gear
cyl_gear_table <- table(mtcars$cyl, mtcars$gear)
# Create a stacked bar plot
barplot(cyl_gear_table,
        main = "Stacked Bar Plot of Cylinder Types by Gear Types",
        xlab = "Number of Gears",
        ylab = "Frequency",
        col = c("red", "green", "blue"),
        legend.text = rownames(cyl_gear_table),
        args.legend = list(title = "Cylinders", x = "topright"))
Task:
Create a bar plot showing the counts of different species in the iris dataset, with custom axis labels, colors, and bar widths.
Instructions:
Use the table() function to calculate the frequency of the Species variable in the iris dataset.
Use the barplot() function to create the bar plot.
Customize the bar plot with the following:
Use custom colors for each bar.
Add custom names to the x-axis labels using the names.arg argument.
Adjust the width of the bars using the width argument.
Solution:
# Calculate the frequency of species
species_freq <- table(iris$Species)
# Create the bar plot with customizations
# Note: a single width value has no visible effect unless xlim is also specified
barplot(species_freq,
        main = "Frequency of Species in Iris Dataset",
        xlab = "Species",
        ylab = "Frequency",
        col = c("purple", "orange", "cyan"),
        names.arg = c("Setosa", "Versicolor", "Virginica"),
        width = 0.7,
        border = "black")
A pie chart is a circular statistical graphic that is divided into slices to illustrate numerical proportions. In R, you can create a pie chart using the pie() function.
See Listing 1 for details.
pie() Function:
The pie() function creates a pie chart.
The first argument x is the data you want to visualize, which in this case is the frequency of different cylinder types.
The main argument specifies the title of the chart.
The col argument is used to set custom colors for the slices of the pie chart.
The labels argument customizes the labels that appear on the slices of the pie chart.
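The arguments described above can be combined as in the following minimal sketch. It assumes the cylinder frequencies come from table(mtcars$cyl); the colors, title, and label text are illustrative choices.

```r
# Frequency of each cylinder type in mtcars
cyl_freq <- table(mtcars$cyl)

# Custom slice labels built from the table names, e.g. "4 cylinders"
slice_labels <- paste(names(cyl_freq), "cylinders")

# Pie chart: x is the data, labels and col customize the slices, main sets the title
pie(x = cyl_freq,
    labels = slice_labels,
    col = c("red", "green", "blue"),
    main = "Distribution of Cylinder Types")
```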