Visualizing Data in R with Default Package:
Bar Plot and Pie Chart

Somsak Chanaim

International College of Digital Innovation, CMU

October 30, 2024

Bar Plot

A bar plot is a type of chart or graph that uses bars to represent data values.

It is essentially a visual display of information where individual bars or columns correspond to different categories or groups, and the length or height of each bar reflects the magnitude of the data it represents.

Bar plots are commonly used for visualizing categorical data and making comparisons between different groups.

The bar plot components:

Axes:

  • The horizontal axis typically represents categories or groups.

  • The vertical axis represents the values, frequencies, or percentages associated with each category.

Bars:

  • Each bar is a rectangular column that extends from the baseline (zero) to a height corresponding to the value it represents.

  • The width of the bars may vary, but they are usually uniform.

Spacing:

  • There is often some space between adjacent bars to visually separate them.

Bar plots are useful for showing the distribution of data, identifying patterns, and making comparisons between different categories.

Bar Plot Using R Programming

In R, you can create a bar plot using the barplot() function.

In this example, I’ll create a basic bar plot with random data:
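A minimal sketch of such a plot is given here. The category labels, the seed, the value range, and the bar color are assumptions chosen for illustration; only the names values and categories and the barplot() arguments come from the explanation that follows.

```r
# Illustrative data (assumed): five category names and five random values
set.seed(123)                                    # make the random values reproducible
categories <- c("A", "B", "C", "D", "E")
values <- sample(1:100, size = length(categories))

# Draw the bar plot
barplot(height = values,          # bar heights
        names.arg = categories,   # label under each bar
        col = "skyblue",          # bar color
        main = "Basic Bar Plot",  # plot title
        xlab = "Categories",      # x-axis label
        ylab = "Values")          # y-axis label
```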

Let’s break down the code:

Vectors:

  • values is a vector containing the corresponding values for each category.

  • categories is a vector containing the names of the categories.

Arguments:

  • height is the data you want to visualize (a numeric vector or matrix).

  • names.arg specifies the names of the categories (a character vector, one label per bar).

  • col sets the color of the bars (you can choose any color).

  • main sets the title of the plot.

  • ylab and xlab set the labels for the y-axis and x-axis, respectively.

The table() Function

The table() function in R is used to create a contingency table, which shows the frequency distribution of a variable or the cross-tabulation of multiple variables.

It counts the number of occurrences of each unique value in a vector, factor, or set of factors.
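As a short illustration, the sketch below applies table() to the built-in mtcars dataset, first to one variable and then to two. The choice of the carb, cyl, and am columns is an assumption made for this example.

```r
# One variable: how many cars have each number of carburetors
carb_freq <- table(mtcars$carb)
print(carb_freq)

# Two variables: cylinders (rows) cross-tabulated with transmission type (columns)
# In mtcars, am = 0 means automatic and am = 1 means manual
cyl_am <- table(cyl = mtcars$cyl, am = mtcars$am)
print(cyl_am)
```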

Create the Bar Plot from table() Function
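A frequency table can be passed directly to barplot(): the counts become the bar heights and the table names become the bar labels. The following is a minimal sketch, assuming the carb column of mtcars as example data.

```r
# Count the cars for each number of carburetors
carb_freq <- table(mtcars$carb)

# The table supplies both the heights and the category names
barplot(carb_freq,
        main = "Cars by Number of Carburetors",
        xlab = "Number of Carburetors",
        ylab = "Frequency",
        col = "lightblue")
```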

Example with Pipe Operator
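The same plot can be written as a pipeline. This sketch assumes the native pipe |>, which needs R 4.1.0 or later; the original slide may have used a different pipe operator, such as %>% from the magrittr package.

```r
# Pass the column to table(), then pass the resulting counts to barplot()
mtcars$carb |>
  table() |>
  barplot(main = "Cars by Number of Carburetors",
          xlab = "Number of Carburetors",
          ylab = "Frequency",
          col = "lightblue")
```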

Add More Color to each Bar
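To give each bar its own color, pass a vector of colors to col, one entry per bar. The particular colors below are an arbitrary choice for illustration.

```r
carb_freq <- table(mtcars$carb)

# One color per bar, in the same order as the bars
bar_colors <- c("red", "orange", "yellow", "green", "blue", "purple")

barplot(carb_freq,
        col = bar_colors,
        main = "Cars by Number of Carburetors",
        xlab = "Number of Carburetors",
        ylab = "Frequency")
```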

Rotate Graph

Rotate the graph by setting horiz = TRUE.
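A minimal sketch of a horizontal bar plot, again assuming the carb column of mtcars. Note that the axis labels swap roles: the frequencies now run along the x-axis. The las = 1 setting, which keeps the axis labels horizontal, is an optional extra.

```r
carb_freq <- table(mtcars$carb)

barplot(carb_freq,
        horiz = TRUE,                    # draw the bars horizontally
        las = 1,                         # keep axis labels horizontal
        main = "Cars by Number of Carburetors",
        xlab = "Frequency",              # values are now on the x-axis
        ylab = "Number of Carburetors",  # categories are now on the y-axis
        col = "steelblue")
```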

Ordering the variable ‘money’ from the minimum to the maximum value.

Colors used: "#ffff00" and "#0000ff"
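The data behind 'money' is not shown here, so the sketch below uses a hypothetical named vector with made-up values. It sorts the values with sort(), which orders from minimum to maximum by default, and blends the two colors above into a gradient with colorRampPalette(); how the original plot applied these colors is an assumption.

```r
# Hypothetical data: these names and amounts are invented for illustration
money <- c(Ann = 350, Bob = 120, Chai = 540, Dao = 80, Eve = 260)

# sort() orders from the minimum to the maximum value and keeps the names
money_sorted <- sort(money)

# Build one color per bar, running from yellow to blue
gradient <- colorRampPalette(c("#ffff00", "#0000ff"))(length(money_sorted))

barplot(money_sorted,
        col = gradient,
        main = "Money, Ordered from Minimum to Maximum",
        ylab = "Money")
```

Use sort(money, decreasing = TRUE) to order from the maximum to the minimum instead.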

Transpose table

Listing 1

Stacked bar plot
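When barplot() receives a two-way table, it stacks the rows within each column by default. The sketch below assumes a cylinders-by-transmission table from mtcars; the colors and labels are illustrative choices.

```r
# Rows: cylinders (4, 6, 8); columns: transmission (0 = automatic, 1 = manual)
cyl_am <- table(mtcars$cyl, mtcars$am)
colnames(cyl_am) <- c("Automatic", "Manual")

# Each column becomes one bar; each row becomes one colored segment
barplot(cyl_am,
        main = "Cylinders within Each Transmission Type",
        xlab = "Transmission",
        ylab = "Frequency",
        col = c("red", "green", "blue"),
        legend.text = rownames(cyl_am),
        args.legend = list(title = "Cylinders", x = "topright"))
```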

Use t() to create another bar plot
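Transposing the table with t() swaps rows and columns, so the variable that defined the segments now defines the bars. A minimal sketch, assuming the same cylinders-by-transmission table from mtcars:

```r
# Rows: cylinders; columns: transmission (0 = automatic, 1 = manual)
cyl_am <- table(mtcars$cyl, mtcars$am)

# After transposing, rows are transmission types and columns are cylinders
am_cyl <- t(cyl_am)

# One bar per cylinder count, split into automatic and manual segments
barplot(am_cyl,
        main = "Transmission Types within Each Cylinder Count",
        xlab = "Number of Cylinders",
        ylab = "Frequency",
        col = c("#ffff00", "#0000ff"),
        legend.text = c("Automatic", "Manual"),
        args.legend = list(title = "Transmission", x = "topleft"))
```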

Grouped Bar Plot

The argument beside = TRUE is used to create grouped bar plots, where bars corresponding to different categories are placed side by side, rather than stacked on top of each other.

Colors used: "#ff0000", "#00ff00", and "#0000ff"
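A minimal sketch of a grouped bar plot using the three colors above. The cylinders-by-transmission table from mtcars is an assumed example dataset, and the labels are illustrative.

```r
# Rows: cylinders (4, 6, 8); columns: transmission (0 = automatic, 1 = manual)
cyl_am <- table(mtcars$cyl, mtcars$am)

# beside = TRUE places the three cylinder bars side by side within each group
barplot(cyl_am,
        beside = TRUE,
        names.arg = c("Automatic", "Manual"),
        main = "Cylinders by Transmission Type",
        xlab = "Transmission",
        ylab = "Frequency",
        col = c("#ff0000", "#00ff00", "#0000ff"),
        legend.text = rownames(cyl_am),
        args.legend = list(title = "Cylinders", x = "topright"))
```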

Exercise

Exercise 1: Basic Bar Plot

Task:

Create a basic bar plot to visualize the frequency of different gear types in the mtcars dataset.

Instructions:

  • Load the mtcars dataset using data(mtcars).

  • Use the table() function to calculate the frequency of the gear variable.

  • Use the barplot() function to create a bar plot.

  • Add a title and labels to the axes using the main, xlab, and ylab arguments.

Solution:

# Load the mtcars dataset
data(mtcars)

# Calculate the frequency of gear types
gear_freq <- table(mtcars$gear)

# Create a basic bar plot
barplot(gear_freq,
        main = "Frequency of Gear Types in mtcars",
        xlab = "Number of Gears",
        ylab = "Frequency",
        col = "lightblue",
        border = "black")

Exercise 2: Horizontal Bar Plot

Task:

Create a horizontal bar plot of the frequency of cylinder types in the mtcars dataset.

Instructions:

  • Use the table() function to calculate the frequency of the cyl variable.

  • Use the barplot() function with the horiz = TRUE argument to create a horizontal bar plot.

  • Customize the colors of the bars using the col argument.

Solution:

# Calculate the frequency of cylinder types
cyl_freq <- table(mtcars$cyl)

# Create a horizontal bar plot
barplot(cyl_freq,
        horiz = TRUE,
        main = "Frequency of Cylinder Types in mtcars",
        xlab = "Frequency",
        ylab = "Number of Cylinders",
        col = "lightgreen",
        border = "black")

Exercise 3: Grouped Bar Plot

Task:

Create a grouped bar plot to compare the frequency of cylinder types (cyl) by the number of gears (gear) in the mtcars dataset.

Instructions:

  • Use the table() function to create a contingency table of cyl and gear.

  • Use the barplot() function to create a grouped bar plot by setting beside = TRUE.

  • Add a legend to the plot using the legend.text and args.legend arguments.

Solution:

# Create a contingency table of cyl and gear
cyl_gear_table <- table(mtcars$cyl, mtcars$gear)

# Create a grouped bar plot
barplot(cyl_gear_table,
        beside = TRUE,
        main = "Cylinder Types by Gear Types",
        xlab = "Number of Gears",
        ylab = "Frequency",
        col = c("red", "green", "blue"),
        legend.text = rownames(cyl_gear_table),
        args.legend = list(title = "Cylinders", x = "topright"))

Exercise 4: Stacked Bar Plot

Task:

Create a stacked bar plot to show the distribution of cylinder types (cyl) within different gear types (gear) in the mtcars dataset.

Instructions:

  • Use the table() function to create a contingency table of cyl and gear.

  • Use the barplot() function to create a stacked bar plot (default behavior).

  • Customize the colors of the bars using a color palette.

Solution:

# Create a contingency table of cyl and gear
cyl_gear_table <- table(mtcars$cyl, mtcars$gear)

# Create a stacked bar plot
barplot(cyl_gear_table,
        main = "Stacked Bar Plot of Cylinder Types by Gear Types",
        xlab = "Number of Gears",
        ylab = "Frequency",
        col = c("red", "green", "blue"),
        legend.text = rownames(cyl_gear_table),
        args.legend = list(title = "Cylinders", x = "topright"))

Exercise 5: Bar Plot with Custom Axis Labels and Colors

Task:

Create a bar plot showing the counts of different species in the iris dataset, with custom axis labels, colors, and bar widths.

Instructions:

  • Use the table() function to calculate the frequency of the Species variable in the iris dataset.

  • Use the barplot() function to create the bar plot.

  • Customize the bar plot with the following:

    • Use custom colors for each bar.

    • Add custom names to the x-axis labels using the names.arg argument.

    • Adjust the width of the bars using the width argument.

Solution:

# Calculate the frequency of species
species_freq <- table(iris$Species)

# Create the bar plot with customizations
# Note: a single width value has no visible effect unless xlim is also set
barplot(species_freq,
        main = "Frequency of Species in Iris Dataset",
        xlab = "Species",
        ylab = "Frequency",
        col = c("purple", "orange", "cyan"),
        names.arg = c("Setosa", "Versicolor", "Virginica"),
        width = 0.7,
        xlim = c(0, 3),
        border = "black")

Pie Chart

A pie chart is a circular statistical graphic that is divided into slices to illustrate numerical proportions. In R, you can create a pie chart using the pie() function.

See Listing 1 for details.

pie() Function:

  • The pie() function creates a pie chart.

  • The first argument x is the data you want to visualize, which in this case is the frequency of different cylinder types.

  • The main argument specifies the title of the chart.

  • The col argument is used to set custom colors for the slices of the pie chart.

  • The labels argument customizes the labels that appear on the slices of the pie chart.
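The arguments above can be combined as in the following sketch. It assumes the cylinder frequencies come from the built-in mtcars dataset; the colors and the percentage labels are illustrative choices.

```r
# Frequency of each cylinder type
cyl_freq <- table(mtcars$cyl)

# Build one label per slice, showing the category and its rounded share
percent <- round(100 * prop.table(cyl_freq))
slice_labels <- paste0(names(cyl_freq), " cylinders (", percent, "%)")

# Draw the pie chart
pie(cyl_freq,
    labels = slice_labels,                 # text shown next to each slice
    col = c("red", "green", "blue"),       # one color per slice
    main = "Cylinder Types in mtcars")     # chart title
```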