Multiple Choice: Data Wrangling with dplyr

Questions

  1. Given df <- data.frame(A = c(5, 3, 2, 8), B = c("X", "Y", "Z", "X")), which command will filter rows where column A is greater than 4?

  1. If df <- data.frame(A = 1:5, B = letters[1:5]), what will df |> select(B) return?

  1. Given df <- data.frame(A = 1:3, B = 4:6), which dplyr function would you use to create a new column C that doubles the values in column A?

  1. Given df <- data.frame(A = c(3, 1, 2), B = c("apple", "banana", "apple")), which command will sort df by column A in ascending order?

  1. If df <- data.frame(A = c("apple", "banana", "apple"), B = c(5, 10, 15)), which command will calculate the mean of B grouped by A?

  1. Given df <- data.frame(A = 1:3, B = c("apple", "banana", "apple")), what will df |> filter(B == "apple") return?

  1. To select columns A and C from a data frame df, which command is correct?

  1. Given df <- data.frame(A = c(1, 2, 3), B = c(10, 20, 30)), which command will add a new column C with values A + B?

  1. If df <- data.frame(A = c("X", "Y", "X", "Y"), B = c(2, 4, 6, 8)), which command will sum the values of B for each group in A?

  1. Given df <- data.frame(A = c("apple", "banana", "cherry"), B = c(5, 3, 8)), which command will arrange df in descending order by column B?

  1. Given df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300)), which command will filter rows where B is greater than 150?

  1. If df <- data.frame(A = c(3, 1, 4), B = c(2, 6, 8)), which command will sort df by column B in descending order?

  1. Given df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6)), which command will select only column A from df?

  1. If df <- data.frame(A = c("apple", "banana", "cherry"), B = c(10, 20, 30)), which command will create a new column C that is the square of B?

  1. Given df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6)), what does the command df |> summarise(avg_B = mean(B)) return?

  1. If df <- data.frame(A = c("X", "Y", "X", "Y"), B = c(5, 10, 15, 20)), which command will calculate the sum of B for each group in A?

  1. Which dplyr function allows you to select columns by their position in the data frame?

  1. Given df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300)), which command will filter rows where A is less than or equal to 20?

  1. If df <- data.frame(A = c("X", "Y", "Z"), B = c(5, 6, 7)), which command will create a new column C that combines A and B into a single string?

  1. Which combination of dplyr functions would you use to group the data frame df by column A and then drop the groups where the sum of B is less than 10?

  1. Given df <- data.frame(A = c(5, 10, 15), B = c("apple", "banana", "cherry")), which command will return rows where column B is either “apple” or “banana”?

  1. If df <- data.frame(A = c(5, 6, 7), B = c(3, 6, 9)), which command will create a new column C that is the ratio of B to A?

  1. Given df <- data.frame(A = c(1, 2, 3, NA)), what does the command df |> summarise(avg_A = mean(A, na.rm = TRUE)) return?

  1. Which of the following commands will sort the data frame df by column A in descending order and then by column B in ascending order?

  1. Given df <- data.frame(A = c("X", "Y", "X", "Z"), B = c(10, 20, 30, 40)), which command will keep only the rows where A is either “X” or “Z”?

  1. If df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300)), which command will create a new column C that is 10% of column B?

  1. Given df <- data.frame(A = c(1, 2, 3, 4, 5), B = c(10, 20, 30, 40, 50)), which command will return the rows where column B is greater than 25 and less than 45?

  1. Which dplyr function is used to rename an existing column in a data frame?

  1. If df <- data.frame(A = c(5, 6, 7), B = c(8, 9, 10)), which command will return a summary containing the sums of columns A and B?

  1. Which command will group the data frame df by column A, calculate the mean of column B for each group, and then keep only the groups where the mean of B is greater than 15?
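
The dplyr verbs these questions draw on can be sketched on a small example. This is a minimal refresher, not an answer key: the `orders` data frame and its columns `fruit` and `qty` are hypothetical and do not match any question above. It assumes the dplyr package is installed and R 4.1 or later for the native pipe `|>`.

```r
# Assumes dplyr is installed; the native pipe |> needs R >= 4.1.
library(dplyr)

# Hypothetical data, chosen only for illustration.
orders <- data.frame(
  fruit = c("apple", "banana", "apple", "cherry"),
  qty   = c(5, 10, 15, 20)
)

# filter(): keep the rows that satisfy a condition.
big_orders <- orders |> filter(qty > 8)

# select(): keep columns by name or by position.
qty_only  <- orders |> select(qty)
first_col <- orders |> select(1)

# mutate(): add a column computed from existing columns.
with_double <- orders |> mutate(double_qty = qty * 2)

# arrange(): sort rows; desc() gives descending order.
by_qty_desc <- orders |> arrange(desc(qty))

# group_by() + summarise(): collapse each group to one row.
totals <- orders |>
  group_by(fruit) |>
  summarise(total_qty = sum(qty))

# group_by() + filter(): keep whole groups using a group-level condition.
large_groups <- orders |>
  group_by(fruit) |>
  filter(sum(qty) >= 15) |>
  ungroup()

# rename(): the syntax is new_name = old_name.
renamed <- orders |> rename(item = fruit)
```

Note that `filter()` behaves differently once the data is grouped: an expression such as `sum(qty)` is evaluated per group, so the condition keeps or drops every row of a group together.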