Scatter Plot
November 5, 2024
September 24, 2025
Dataset
Create a scatter plot of age against income, and use spend_score as the size of the bubbles (points). Use cex to control the size of points based on spend_score/20, and use blue color.
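One possible solution is sketched below. The exercise dataset is not shown here, so the simulated data frame (and the helper name bubble_cex) is a hypothetical stand-in that only assumes the columns age, income, and spend_score exist.

```r
# Hypothetical stand-in for the exercise dataset (an assumption, not the real data)
set.seed(1)
data <- data.frame(
  age = sample(20:60, 50, replace = TRUE),
  income = round(runif(50, 20000, 100000)),
  spend_score = sample(1:100, 50, replace = TRUE)
)

# Point size scales with spend_score; dividing by 20 keeps cex at 5 or below
bubble_cex <- data$spend_score / 20

plot(data$age, data$income,
     main = "Income vs Age (Bubble Size = Spending Score)",
     xlab = "Age", ylab = "Income",
     col = "blue",      # single color for all points
     cex = bubble_cex)  # one size value per point
```

Because cex is given as a vector, each point gets its own size, while the single col value is recycled across all points.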
In a scatter plot of age vs. income, color the points based on education_years using col. Use cex for bubble size.
Solution:
plot(data$age, data$income,
     main = "Income vs Age Colored by Education Years",
     xlab = "Age", ylab = "Income",
     col = data$education_years,   # numeric values index the current palette()
     cex = data$spend_score / 20)  # Adjust bubble size
edu_levels <- sort(unique(data$education_years))  # sorted so the legend reads in order
legend("topright",
       legend = edu_levels,
       col = edu_levels,
       pch = 1,
       title = "Education Years")
Plot income against age and add a regression line to show the trend. Use the abline() function after fitting a linear model with lm().
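A minimal sketch of the regression-line exercise follows. The data frame is simulated and purely hypothetical (income is made to rise with age so that a trend is visible), and the name fit is an illustrative choice.

```r
# Hypothetical stand-in for the exercise dataset (an assumption, not the real data)
set.seed(1)
age <- sample(20:60, 50, replace = TRUE)
data <- data.frame(
  age = age,
  income = 20000 + 1000 * age + rnorm(50, sd = 8000)
)

# Fit a simple linear model of income on age
fit <- lm(income ~ age, data = data)

plot(data$age, data$income,
     main = "Income vs Age with Regression Line",
     xlab = "Age", ylab = "Income")

# abline() reads the intercept and slope from the fitted model
abline(fit, col = "red", lwd = 2)
```

Passing the lm object directly to abline() avoids extracting the coefficients by hand. The plot must already exist, since abline() only adds to the current plot.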
Plot spend_score against income and add a smoothing line using the lowess() function.
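The smoothing-line exercise could be approached as in the sketch below. The simulated data frame is a hypothetical stand-in for the exercise dataset, and smooth is an illustrative variable name.

```r
# Hypothetical stand-in for the exercise dataset (an assumption, not the real data)
set.seed(1)
data <- data.frame(
  income = round(runif(50, 20000, 100000)),
  spend_score = sample(1:100, 50, replace = TRUE)
)

plot(data$income, data$spend_score,
     main = "Spending Score vs Income with Lowess Smoother",
     xlab = "Income", ylab = "Spending Score")

# lowess() returns a list with x (sorted) and the smoothed y values
smooth <- lowess(data$income, data$spend_score)

# lines() accepts that list directly and draws the curve on the existing plot
lines(smooth, col = "blue", lwd = 2)
```

Note that lowess() does not accept missing values, so rows with NA in either column would need to be dropped first.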
Create a bubble plot of age against income, with bubble size based on spend_score. Add a legend to indicate what the bubble sizes represent.
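One way to add a size legend is sketched below. The simulated data frame is a hypothetical stand-in for the exercise dataset, and the reference scores in size_levels (20, 60, 100) are arbitrary illustrative choices.

```r
# Hypothetical stand-in for the exercise dataset (an assumption, not the real data)
set.seed(1)
data <- data.frame(
  age = sample(20:60, 50, replace = TRUE),
  income = round(runif(50, 20000, 100000)),
  spend_score = sample(1:100, 50, replace = TRUE)
)

plot(data$age, data$income,
     main = "Income vs Age (Bubble Size = Spending Score)",
     xlab = "Age", ylab = "Income",
     cex = data$spend_score / 20)

# Reference scores shown in the legend, scaled exactly like the plotted points
size_levels <- c(20, 60, 100)
legend("topright",
       legend = size_levels,
       pt.cex = size_levels / 20,  # legend symbol sizes match the plot scaling
       pch = 1,
       x.intersp = 1.5,            # extra horizontal room for the large symbols
       y.intersp = 1.8,            # extra vertical room between entries
       title = "Spending Score")
```

The key point is that pt.cex uses the same divisor as cex in plot(), so a legend symbol and a plotted point with the same spend_score have the same size.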
Create a 2x2 panel layout with four scatter plots showing different relationships (income vs. age, spend_score vs. age, etc.). Use par(mfrow = c(2, 2)) to set up the layout.
Solution:
par(mfrow = c(2, 2)) # Set up 2x2 plotting layout
plot(data$age, data$income,
     main = "Income vs Age",
     xlab = "Age", ylab = "Income")
plot(data$age, data$spend_score,
     main = "Spending Score vs Age",
     xlab = "Age", ylab = "Spending Score")
plot(data$education_years, data$income,
     main = "Income vs Education Years",
     xlab = "Education Years", ylab = "Income")
plot(data$education_years, data$spend_score,
     main = "Spending Score vs Education Years",
     xlab = "Education Years", ylab = "Spending Score")
par(mfrow = c(1, 1)) # Reset layout
Divide age into groups (e.g., 20–30, 30–40, etc.) and use different colors to represent each age group in a scatter plot of income vs. spend_score.
Solution:
# Divide age into groups
age_group <- cut(data$age,
                 breaks = c(20, 30, 40, 50, 60),
                 labels = c("20-30", "30-40", "40-50", "50-60"),
                 include.lowest = TRUE) # Keep age 20, which would otherwise become NA
# Plot with color based on age groups
plot(x = data$income, y = data$spend_score,
     col = as.numeric(age_group),
     main = "Income vs Spend Score by Age Group",
     xlab = "Income", ylab = "Spending Score",
     pch = 16)
legend("topright",
       legend = levels(age_group),
       col = seq_along(levels(age_group)), pch = 16, title = "Age Group")
In a plot of income vs. age, add labels for the points with high spend_score (above 90) using the text() function to identify individuals with high spending scores.
Solution:
plot(data$age, data$income,
     main = "Income vs Age with Labels for High Spending Scores",
     xlab = "Age", ylab = "Income")
# Label points with spend_score > 90
high_spenders <- data$spend_score > 90
text(x = data$age[high_spenders], y = data$income[high_spenders],
     labels = data$spend_score[high_spenders],
     pos = 4, col = "red")