Histogram
Dataset
First, create this sample data frame, sales_data, which contains random data about sales representatives in a company:
This data frame includes:
SalesRepID: A unique identifier for each sales representative.
Age: The age of each sales representative.
Sales: Total sales in dollars (normally distributed).
YearsExperience: Number of years of experience.
Region: The region where the representative is based.
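The code that builds sales_data is not reproduced here. As a stand-in, the following is a minimal sketch that matches the column descriptions above. The seed, the 100 rows, the age and experience ranges, and the mean and standard deviation of Sales are all assumptions chosen for illustration, not the original values.

```r
# Hypothetical stand-in for sales_data; every size and parameter is an assumption
set.seed(123)  # assumed seed, only so the sketch is reproducible
n <- 100       # assumed number of sales representatives

sales_data <- data.frame(
  SalesRepID      = 1:n,
  Age             = sample(22:60, n, replace = TRUE),
  Sales           = rnorm(n, mean = 50000, sd = 10000),  # normally distributed
  YearsExperience = sample(0:30, n, replace = TRUE),
  Region          = sample(c("North", "South", "East", "West"), n, replace = TRUE)
)

str(sales_data)  # inspect column names and types
```

Any data frame with these five columns should work with the exercises that follow, although fixed axis limits in the solutions may need adjusting.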
Plot two histograms of Sales, one for representatives in the “East” region and another for those in the “West” region. Display these histograms side-by-side using par(mfrow = c(1, 2)).
Solution:
library(dplyr)

par(mfrow = c(1, 2)) # Set layout for side-by-side plots

sales_data2 <- sales_data |>
  filter(Region == "East")

# plot 1
hist(sales_data2$Sales,
     main = "Sales in East Region",
     xlab = "Sales",
     col = "blue")

sales_data3 <- sales_data |>
  filter(Region == "West")

# plot 2
hist(sales_data3$Sales,
     main = "Sales in West Region",
     xlab = "Sales",
     col = "orange")

par(mfrow = c(1, 1)) # Reset layout
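One optional refinement: par() invisibly returns the previous values of the settings it changes, so the layout can be restored from that saved value instead of hard-coding c(1, 1). The sketch below uses a small made-up sales_data (illustrative values only) and also gives both panels the same xlim so they are easier to compare.

```r
# Made-up stand-in data, for illustration only
set.seed(1)
sales_data <- data.frame(
  Sales  = rnorm(80, mean = 50000, sd = 10000),
  Region = rep(c("East", "West"), each = 40)
)

old_par <- par(mfrow = c(1, 2))      # par() returns the previous settings

x_range <- range(sales_data$Sales)   # shared x-axis for both panels

hist(sales_data$Sales[sales_data$Region == "East"],
     main = "Sales in East Region", xlab = "Sales",
     col = "blue", xlim = x_range)
hist(sales_data$Sales[sales_data$Region == "West"],
     main = "Sales in West Region", xlab = "Sales",
     col = "orange", xlim = x_range)

par(old_par)                         # restore the layout that was active before
```

Restoring from old_par is handy when the layout before the call was not a single panel.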
Plot a histogram of Sales for representatives in the “North” region, then overlay a histogram of Sales for representatives in the “South” region on the same plot. Use different colors and adjust transparency to compare the sales distribution between these two regions.
Solution:
# Subset data by region
sales_north <- sales_data |>
  filter(Region == "North")
sales_south <- sales_data |>
  filter(Region == "South")

# Plot the first histogram
hist(sales_north$Sales,
     col = rgb(1, 0, 0, 0.5),
     main = "Sales Distribution (North vs South)",
     xlab = "Sales",
     xlim = range(sales_data$Sales),
     ylim = c(0, 15))

# Add the second histogram
hist(sales_south$Sales,
     col = rgb(0, 0, 1, 0.5), add = TRUE)

# Add legend
legend("topright",
       legend = c("North", "South"),
       fill = c(rgb(1, 0, 0, 0.5),
                rgb(0, 0, 1, 0.5)))
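When two histograms are overlaid, hist() chooses the breaks for each call separately, so the bars may not line up, and a fixed ylim such as c(0, 15) can clip taller bars. A possible variation, sketched here with made-up data, computes one set of breaks from the combined values and derives ylim from the counts. The target of 10 bins passed to pretty() is an arbitrary assumption.

```r
# Made-up stand-in data, for illustration only
set.seed(2)
sales_data <- data.frame(
  Sales  = rnorm(80, mean = 50000, sd = 10000),
  Region = rep(c("North", "South"), each = 40)
)

north <- sales_data$Sales[sales_data$Region == "North"]
south <- sales_data$Sales[sales_data$Region == "South"]

# One set of breaks covering both groups
common_breaks <- pretty(range(sales_data$Sales), n = 10)

# Compute the counts without plotting to find the tallest bar
h_north <- hist(north, breaks = common_breaks, plot = FALSE)
h_south <- hist(south, breaks = common_breaks, plot = FALSE)
y_max <- max(h_north$counts, h_south$counts)

# Draw the precomputed histograms on shared axes
plot(h_north, col = rgb(1, 0, 0, 0.5), ylim = c(0, y_max),
     main = "Sales Distribution (North vs South)", xlab = "Sales")
plot(h_south, col = rgb(0, 0, 1, 0.5), add = TRUE)

legend("topright", legend = c("North", "South"),
       fill = c(rgb(1, 0, 0, 0.5), rgb(0, 0, 1, 0.5)))
```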
Note: Normally, I use the yarrr package to apply transparency to colors.
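If yarrr is not installed, base R offers a comparable route: adjustcolor() from grDevices takes a named color and an alpha.f multiplier. This is a sketch of that base R alternative, not the yarrr approach the note refers to; the names north_col and south_col are made up for the example.

```r
# Base R alternative: semi-transparent versions of named colors
north_col <- adjustcolor("red",  alpha.f = 0.5)
south_col <- adjustcolor("blue", alpha.f = 0.5)

# Compare with the rgb() calls used in the solution
identical(north_col, rgb(1, 0, 0, 0.5))
identical(south_col, rgb(0, 0, 1, 0.5))
```

The resulting strings can be passed to the col argument of hist() and the fill argument of legend() in place of the rgb() calls.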
Create two histograms on the same plot to show the Age distribution of representatives with YearsExperience less than 10 and those with YearsExperience of 10 or more. Use different colors and transparency to clearly visualize the overlap.
Solution:
# Subset data by experience level
age_less_10 <- sales_data |>
  filter(YearsExperience < 10)
age_10_or_more <- sales_data |>
  filter(YearsExperience >= 10)

# Plot the first histogram
hist(age_less_10$Age,
     col = rgb(0, 1, 0, 0.5),
     main = "Age Distribution (Experience Level)",
     xlab = "Age",
     xlim = range(sales_data$Age),
     ylim = c(0, 20))

# Add the second histogram
hist(age_10_or_more$Age,
     col = rgb(1, 0.5, 0, 0.5),
     add = TRUE)

# Add legend
legend("topright",
       legend = c("Experience < 10", "Experience >= 10"),
       fill = c(rgb(0, 1, 0, 0.5),
                rgb(1, 0.5, 0, 0.5)))
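Instead of two filter() calls, the two experience groups can also be formed in one step. This is a minimal sketch using base R's split() on a logical condition; the data are made up for illustration, and here age_less_10 and age_10_or_more are plain numeric vectors rather than data frames.

```r
# Made-up stand-in data, for illustration only
set.seed(3)
sales_data <- data.frame(
  Age             = sample(22:60, 100, replace = TRUE),
  YearsExperience = sample(0:30, 100, replace = TRUE)
)

# split() on a logical returns a list with elements named "FALSE" and "TRUE"
age_by_exp     <- split(sales_data$Age, sales_data$YearsExperience >= 10)
age_less_10    <- age_by_exp[["FALSE"]]
age_10_or_more <- age_by_exp[["TRUE"]]

hist(age_less_10,
     col = rgb(0, 1, 0, 0.5),
     main = "Age Distribution (Experience Level)",
     xlab = "Age",
     xlim = range(sales_data$Age))
hist(age_10_or_more,
     col = rgb(1, 0.5, 0, 0.5),
     add = TRUE)
```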
Note: Normally, I use the yarrr package to apply transparency to colors.
Plot histograms of Sales for each region (“North”, “South”, “East”, and “West”) on the same graph. Use a different color for each region and add a legend to indicate which color corresponds to each region.
Solution:
# Subset data by region
sales_north <- sales_data |>
  filter(Region == "North")
sales_south <- sales_data |>
  filter(Region == "South")
sales_east <- sales_data |>
  filter(Region == "East")
sales_west <- sales_data |>
  filter(Region == "West")

# Plot each histogram with different colors
hist(sales_north$Sales,
     col = rgb(1, 0, 0, 0.4),
     main = "Sales Distribution by Region",
     xlab = "Sales",
     xlim = range(sales_data$Sales),
     ylim = c(0, 12))
hist(sales_south$Sales,
     col = rgb(0, 1, 0, 0.4), add = TRUE)
hist(sales_east$Sales,
     col = rgb(0, 0, 1, 0.4), add = TRUE)
hist(sales_west$Sales,
     col = rgb(1, 1, 0, 0.4), add = TRUE)

# Add legend
legend("topright",
       legend = c("North", "South", "East", "West"),
       fill = c(rgb(1, 0, 0, 0.4),
                rgb(0, 1, 0, 0.4),
                rgb(0, 0, 1, 0.4),
                rgb(1, 1, 0, 0.4)))
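The four near-identical hist() calls can also be written as a loop. The sketch below uses made-up data and a hypothetical named vector, region_cols, that maps each region to its color. Shared breaks keep the bars aligned across regions, and ylim is derived from the counts rather than fixed; the target of 10 bins is an arbitrary assumption.

```r
# Made-up stand-in data, for illustration only
set.seed(4)
sales_data <- data.frame(
  Sales  = rnorm(120, mean = 50000, sd = 10000),
  Region = rep(c("North", "South", "East", "West"), each = 30)
)

# Hypothetical lookup: one semi-transparent color per region
region_cols <- c(North = rgb(1, 0, 0, 0.4), South = rgb(0, 1, 0, 0.4),
                 East  = rgb(0, 0, 1, 0.4), West  = rgb(1, 1, 0, 0.4))

breaks <- pretty(range(sales_data$Sales), n = 10)
sales_by_region <- split(sales_data$Sales, sales_data$Region)

# Tallest bar across all regions, used for ylim
y_max <- max(sapply(sales_by_region, function(x) {
  max(hist(x, breaks = breaks, plot = FALSE)$counts)
}))

# The first region creates the plot; the rest are added on top
for (i in seq_along(region_cols)) {
  region <- names(region_cols)[i]
  hist(sales_by_region[[region]], breaks = breaks,
       col = region_cols[[region]], add = i > 1,
       main = "Sales Distribution by Region", xlab = "Sales",
       xlim = range(breaks), ylim = c(0, y_max))
}

legend("topright", legend = names(region_cols), fill = region_cols)
```

With four overlapping semi-transparent layers the plot gets hard to read, so side-by-side panels may be a clearer choice when the distributions overlap heavily.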
Note: Normally, I use the yarrr package to apply transparency to colors.