Scatter Plot

Dataset
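
The solutions below assume a data frame named data with the columns age, income, spend_score, and education_years. If the original dataset is not at hand, a stand-in can be simulated. This is only a sketch: every value, range, and relationship here is made up for illustration and is not taken from the real data.

```r
# Hypothetical stand-in for the exercise dataset: all values are simulated.
set.seed(42)
n <- 200

data <- data.frame(
  age             = sample(20:60, n, replace = TRUE),
  education_years = sample(10:20, n, replace = TRUE)
)

# Assumption for illustration: income rises loosely with age and education
data$income <- round(15000 + 600 * data$age +
                       1500 * data$education_years +
                       rnorm(n, sd = 8000))

# Assumption for illustration: spending score on a 1-100 scale
data$spend_score <- sample(1:100, n, replace = TRUE)

str(data)  # Inspect column names and types
```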

Plot income against age to understand the relationship between these two variables. Use the base R plot() function.

Solution:

plot(data$age, data$income, 
     main = "Income vs Age", 
     xlab = "Age", ylab = "Income")

Plot spend_score against age. Add a title, x-axis label, and y-axis label to describe the plot.

Solution:

plot(data$age, data$spend_score, 
     main = "Spending Score vs Age", 
     xlab = "Age", ylab = "Spending Score")

Create a scatter plot of income against age, and use spend_score to set the size of the bubbles (points). Use cex to scale the point size as spend_score / 20, and color the points blue.

Solution:

plot(data$age, data$income, 
     main = "Bubble Plot of Income vs Age", 
     xlab = "Age", 
     ylab = "Income", 
     cex = data$spend_score / 20,  # Adjust bubble size
     col = "blue") 
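
Because cex scales the width and height of a symbol, dividing spend_score by 20 makes the bubble area grow with the square of the score. A minimal sketch, using made-up values, of a square-root scaling that keeps the area roughly proportional to the score instead:

```r
# Made-up values for illustration only
age         <- c(25, 40, 55)
income      <- c(30000, 50000, 70000)
spend_score <- c(10, 40, 90)

cex_linear <- spend_score / 20        # symbol width proportional to score
cex_area   <- sqrt(spend_score / 20)  # symbol area proportional to score

plot(age, income,
     main = "Bubble Area Proportional to Spend Score",
     xlab = "Age", ylab = "Income",
     cex = cex_area, col = "blue")
```

Which scaling reads better is a judgment call; the linear version exaggerates differences between high and low scores.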

In a scatter plot of income vs. age, color the points based on education_years using col. Use cex to set the bubble size from spend_score.

Solution:

plot(data$age, data$income, 
     main = "Income vs Age Colored by Education Years", 
     xlab = "Age", ylab = "Income", 
     col = data$education_years, 
     cex = data$spend_score / 20) # Adjust bubble size
     
# Sort the distinct values so the legend entries appear in order
edu_levels <- sort(unique(data$education_years))
legend("topright", 
       legend = edu_levels, 
       col = edu_levels, 
       pch = 1, 
       title = "Education Years")
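
Passing raw numbers to col selects entries from R's current palette(), which holds only eight colors by default, so distinct education_years values may end up sharing a color. One possible alternative, sketched here with made-up values, gives each distinct value its own color using hcl.colors() (available in R 3.6.0 or later). The choice of the "Viridis" palette is an assumption, not part of the exercise.

```r
# Made-up education values for illustration only
edu <- c(10, 12, 16, 12, 18, 10)

edu_levels <- sort(unique(edu))                    # distinct values, in order
pal        <- hcl.colors(length(edu_levels), "Viridis")  # one color per level
point_col  <- pal[match(edu, edu_levels)]          # color for each point

plot(seq_along(edu), edu, col = point_col, pch = 16,
     xlab = "Observation", ylab = "Education Years")
legend("topleft", legend = edu_levels, col = pal, pch = 16,
       title = "Education Years")
```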

Plot income against age and add a regression line to show the trend. Use the abline() function after fitting a linear model with lm().

Solution:

plot(data$age, data$income, 
     main = "Income vs Age with Regression Line", 
     xlab = "Age", 
     ylab = "Income")
model <- lm(income ~ age, data = data) 
abline(model, col = "red", lwd = 2) # Add regression line
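
abline() reads the intercept and slope from the fitted model, so it can help to look at those numbers directly. A minimal sketch on simulated data (the true slope of 500 is an assumption chosen for illustration):

```r
# Simulated data for illustration only
set.seed(1)
age    <- 20:60
income <- 20000 + 500 * age + rnorm(length(age), sd = 2000)

model <- lm(income ~ age)
coef(model)                     # intercept and slope used by abline()
slope <- coef(model)[["age"]]   # estimated change in income per year of age

plot(age, income, xlab = "Age", ylab = "Income")
abline(model, col = "red", lwd = 2)
```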

Plot spend_score against income and add a smoothing line using the lowess() function.

Solution:

plot(data$income, data$spend_score, 
     main = "Spending Score vs Income with Smoothing Line", 
     xlab = "Income", 
     ylab = "Spending Score")
     
lines(lowess(data$income, data$spend_score), 
      col = "blue", lwd = 2) # Add smoothing line
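
lowess() takes a smoother span f (default 2/3), the proportion of points that influence the fit at each value; smaller spans follow the data more closely. A sketch on simulated data comparing two spans (the value f = 0.2 is an arbitrary choice for illustration):

```r
# Simulated data for illustration only
set.seed(2)
income      <- seq(20000, 100000, length.out = 80)
spend_score <- 50 + 30 * sin(income / 15000) + rnorm(80, sd = 8)

smooth_wide   <- lowess(income, spend_score)           # default f = 2/3
smooth_narrow <- lowess(income, spend_score, f = 0.2)  # follows local wiggles

plot(income, spend_score, xlab = "Income", ylab = "Spending Score")
lines(smooth_wide,   col = "blue", lwd = 2)
lines(smooth_narrow, col = "red",  lwd = 2, lty = 2)
legend("topright", legend = c("f = 2/3", "f = 0.2"),
       col = c("blue", "red"), lty = c(1, 2), lwd = 2)
```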

Create a bubble plot of income against age, with bubble size based on spend_score. Add a legend to indicate what the bubble sizes represent.

Solution:

plot(data$age, data$income, 
     main = "Bubble Plot of Income vs Age", 
     xlab = "Age", 
     ylab = "Income", 
     cex = data$spend_score / 20, # Adjust bubble size
     col = "purple")              # Use any color.
     
legend("topright", 
       legend = "Bubble Size ~ Spend Score", 
       pch = 1, 
       col = "purple", pt.cex = 2)
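
A single legend entry says that size encodes spend_score but not how much. One possible refinement, sketched with made-up values, shows a few reference scores drawn with the same scaling as the plot (the reference scores 20, 60, and 100 and the spacing values are arbitrary choices):

```r
# Made-up values for illustration only
age         <- c(25, 35, 45, 55)
income      <- c(30000, 45000, 60000, 52000)
spend_score <- c(20, 60, 100, 40)

plot(age, income, cex = spend_score / 20, col = "purple",
     xlab = "Age", ylab = "Income")

# Reference bubbles use the same spend_score / 20 scaling as the plot
ref_scores <- c(20, 60, 100)
legend("topleft", legend = ref_scores,
       pt.cex = ref_scores / 20, pch = 1, col = "purple",
       title = "Spend Score",
       x.intersp = 1.8, y.intersp = 1.8)  # extra room for large symbols
```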

Create a 2x2 panel layout with four scatter plots showing different relationships (income vs. age, spend_score vs. age, etc.). Use par(mfrow = c(2, 2)) to set up the layout.

Solution:

par(mfrow = c(2, 2)) # Set up 2x2 plotting layout

plot(data$age, data$income, 
     main = "Income vs Age", 
     xlab = "Age", ylab = "Income")
     
plot(data$age, data$spend_score, 
     main = "Spending Score vs Age", 
     xlab = "Age", ylab = "Spending Score")
     
plot(data$education_years, data$income, 
     main = "Income vs Education Years", 
     xlab = "Education Years", ylab = "Income")
     
plot(data$education_years, data$spend_score, 
     main = "Spending Score vs Education Years", 
     xlab = "Education Years", ylab = "Spending Score")
     
par(mfrow = c(1, 1)) # Reset layout
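
par() returns the previous values of any settings it changes, so the layout can also be restored from that return value rather than by hard-coding c(1, 1). A minimal sketch with placeholder plots:

```r
# par() returns the old settings as a list when it changes them
old_par <- par(mfrow = c(1, 2))

plot(1:10, (1:10)^2,  main = "Left panel")
plot(1:10, sqrt(1:10), main = "Right panel")

par(old_par)  # Restore whatever layout was active before
```

This pattern is handy inside functions, where the caller's layout is not known in advance.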

Divide age into groups (e.g., 20–30, 30–40, etc.) and use different colors to represent each age group in a scatter plot of income vs. spend_score.

Solution:

# Divide age into groups; include.lowest = TRUE keeps age 20 in the first group
age_group <- cut(data$age, 
                 breaks = c(20, 30, 40, 50, 60), 
                 labels = c("20-30", "30-40", "40-50", "50-60"), 
                 include.lowest = TRUE)

# Plot with color based on age groups
plot(x = data$income, y = data$spend_score, 
     col = as.numeric(age_group), 
     main = "Income vs Spend Score by Age Group", 
     xlab = "Income", ylab = "Spending Score", 
     pch = 16)
     
legend("topright", 
       legend = levels(age_group), 
       col = 1:4, pch = 16, title = "Age Group")
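
One detail of cut() is worth checking before plotting: by default its intervals are closed on the right, so a value equal to the lowest break falls outside every interval unless include.lowest = TRUE is set, and values outside the breaks always become NA (and are then not drawn). A small sketch with made-up ages:

```r
# Made-up ages for illustration only
ages   <- c(20, 25, 30, 31, 60, 65)
breaks <- c(20, 30, 40, 50, 60)

default_groups <- cut(ages, breaks = breaks)
closed_groups  <- cut(ages, breaks = breaks, include.lowest = TRUE)

# Compare the two groupings side by side; NA marks an unassigned age
data.frame(ages, default_groups, closed_groups)

# Count how many ages were left without a group in each case
sum(is.na(default_groups))
sum(is.na(closed_groups))
```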

In a plot of income vs. age, use the text() function to label the points whose spend_score is above 90, so that individuals with high spending scores can be identified.

Solution:

plot(data$age, data$income, 
     main = "Income vs Age with Labels for High Spending Scores", 
     xlab = "Age", ylab = "Income")

# Label points with spend_score > 90
high_spenders <- data$spend_score > 90

text(x = data$age[high_spenders], y = data$income[high_spenders], 
     labels = data$spend_score[high_spenders], 
     pos = 4, col = "red")
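
Labeling with the score shows how high each value is, but not which row it belongs to. If the goal is to trace points back to individuals, the row names can be used as labels instead. A minimal sketch on a made-up data frame named d (hypothetical, standing in for data):

```r
# Made-up rows for illustration only
d <- data.frame(
  age         = c(25, 34, 41, 52, 29),
  income      = c(30000, 52000, 61000, 58000, 45000),
  spend_score = c(95, 40, 92, 15, 88)
)

high <- which(d$spend_score > 90)  # row positions of high spenders

plot(d$age, d$income, xlab = "Age", ylab = "Income")
text(x = d$age[high], y = d$income[high],
     labels = rownames(d)[high],   # label each point with its row name
     pos = 4, col = "red")
```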