The Final Exam 2/2 will be held on March 14, 2025, from 8:00 to 11:00 AM at the ICDI building.

Total Score: 15% of 100%
- 5%: Multiple-choice questions.
- 10%: Answering questions, editing, or showing the code based on the given results.

You are allowed to use your computer/iPad/Tablet to run the R program. However, you are not allowed to use the internet to search for solutions via search engines (e.g., Google) or generative AI (e.g., ChatGPT). You are not allowed to discuss anything in the testing room. You are allowed to bring and write notes on one A4 page (both sides).
Multiple Choice: Data Wrangling with dplyr
Questions
Given df <- data.frame(A = c(5, 3, 2, 8), B = c("X", "Y", "Z", "X")), which command will filter rows where column A is greater than 4?
If df <- data.frame(A = 1:5, B = letters[1:5]), what will df |> select(B) return?
Which dplyr function would you use to create a new column C that is double the values in column A in df <- data.frame(A = 1:3, B = 4:6)?
Given df <- data.frame(A = c(3, 1, 2), B = c("apple", "banana", "apple")), which command will sort df by column A in ascending order?
If df <- data.frame(A = c("apple", "banana", "apple"), B = c(5, 10, 15)), which command will calculate the mean of B grouped by A?
What will df |> filter(B == "apple") do for df <- data.frame(A = 1:3, B = c("apple", "banana", "apple"))?
To select columns A and C from a data frame df, which command is correct?
Given df <- data.frame(A = c(1, 2, 3), B = c(10, 20, 30)), which command will add a new column C with values A + B?
If df <- data.frame(A = c("X", "Y", "X", "Y"), B = c(2, 4, 6, 8)), which command will sum the values of B for each group in A?
Which command will arrange df in descending order by column B for df <- data.frame(A = c("apple", "banana", "cherry"), B = c(5, 3, 8))?
Given df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300)), which command will filter rows where B is greater than 150?
If df <- data.frame(A = c(3, 1, 4), B = c(2, 6, 8)), which command will sort df by column B in descending order?
Given df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6)), which command will select only column A from df?
If df <- data.frame(A = c("apple", "banana", "cherry"), B = c(10, 20, 30)), which command will create a new column C that is the square of B?
What does the command df |> summarise(avg_B = mean(B)) do for df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))?
If df <- data.frame(A = c("X", "Y", "X", "Y"), B = c(5, 10, 15, 20)), which command will calculate the total sum of B for each group in A?
Which dplyr function allows you to select columns by their position in the data frame?
Given df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300)), which command will filter rows where A is less than or equal to 20?
If df <- data.frame(A = c("X", "Y", "Z"), B = c(5, 6, 7)), which command will create a new column C that joins A and B into a single string?
Which dplyr function would you use to group the data frame df by column A and then filter out groups where the sum of B is less than 10?
Given df <- data.frame(A = c(5, 10, 15), B = c("apple", "banana", "cherry")), which command will return rows where column B is either "apple" or "banana"?
If df <- data.frame(A = c(5, 6, 7), B = c(3, 6, 9)), which command will create a new column C that is the ratio of B to A?
What does the command df |> summarise(avg_A = mean(A, na.rm = TRUE)) do for df <- data.frame(A = c(1, 2, 3, NA))?
Which of the following commands will sort the data frame df by column A in descending order and then by column B in ascending order?
Given df <- data.frame(A = c("X", "Y", "X", "Z"), B = c(10, 20, 30, 40)), which command will filter df to only include rows where A is either "X" or "Z"?
If df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300)), which command will create a new column C that is 10% of column B?
Given df <- data.frame(A = c(1, 2, 3, 4, 5), B = c(10, 20, 30, 40, 50)), which command will return the rows where column B is greater than 25 and less than 45?
Which dplyr function is used to rename an existing column in a data frame?
If df <- data.frame(A = c(5, 6, 7), B = c(8, 9, 10)), which command will return a summary of the total sum of A and B?
Which command will group the data frame df by column A and calculate the mean of column B for each group, then filter to only include groups where the mean of B is greater than 15?
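
For self-checking while you practice, the verbs these questions rely on can be sketched in one short script. This is a minimal illustration, not an official answer key. It assumes the dplyr package is installed and R 4.1 or later (for the native pipe |>). The data frame reuses the values from one of the questions above. The result names (big_b, x_or_z, with_c, and so on), the new column names, and the threshold of 25 in the last step are made up for illustration.

```r
# Practice sketch: one call per dplyr verb used in the questions above.
# Assumption: dplyr is installed and R >= 4.1 (native pipe |>).
library(dplyr)

df <- data.frame(A = c("X", "Y", "X", "Z"), B = c(10, 20, 30, 40))

# filter(): keep rows that satisfy a condition
big_b  <- df |> filter(B > 25, B < 45)       # comma means AND
x_or_z <- df |> filter(A %in% c("X", "Z"))   # membership in a set of values

# select(): keep columns by name or by position
only_b    <- df |> select(B)
first_col <- df |> select(1)

# mutate(): add columns computed from existing ones
# (C and label are illustrative names)
with_c <- df |> mutate(C = B * 0.10, label = paste(A, B))

# arrange(): sort rows; desc() gives descending order
sorted <- df |> arrange(desc(A), B)

# rename(): new_name = old_name
renamed <- df |> rename(group = A)

# summarise(): collapse the whole data frame to one row
overall <- df |> summarise(avg_B = mean(B, na.rm = TRUE), total_B = sum(B))

# group_by() + summarise(): one row per group, then filter on the summary
# (the threshold 25 is arbitrary; only Z, with mean 40, passes)
by_group <- df |>
  group_by(A) |>
  summarise(mean_B = mean(B), sum_B = sum(B)) |>
  filter(mean_B > 25)
```

Printing each result (for example, print(sorted)) and comparing it with what you predicted is a quick way to check your understanding before the exam.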