Multiple Choices: Data Wrangling with dplyr
Questions
- Given `df <- data.frame(A = c(5, 3, 2, 8), B = c("X", "Y", "Z", "X"))`, which command will filter rows where column `A` is greater than 4?
- If `df <- data.frame(A = 1:5, B = letters[1:5])`, what will `df |> select(B)` return?
- Which `dplyr` function would you use to create a new column `C` that is double the values in column `A` in `df <- data.frame(A = 1:3, B = 4:6)`?
- Given `df <- data.frame(A = c(3, 1, 2), B = c("apple", "banana", "apple"))`, which command will sort `df` by column `A` in ascending order?
- If `df <- data.frame(A = c("apple", "banana", "apple"), B = c(5, 10, 15))`, which command will calculate the mean of `B` grouped by `A`?
- What will `df |> filter(B == "apple")` do for `df <- data.frame(A = 1:3, B = c("apple", "banana", "apple"))`?
- To select columns `A` and `C` from a data frame `df`, which command is correct?
- Given `df <- data.frame(A = c(1, 2, 3), B = c(10, 20, 30))`, which command will add a new column `C` with values `A + B`?
- If `df <- data.frame(A = c("X", "Y", "X", "Y"), B = c(2, 4, 6, 8))`, which command will sum the values of `B` for each group in `A`?
- Which command will arrange `df` in descending order by column `B` for `df <- data.frame(A = c("apple", "banana", "cherry"), B = c(5, 3, 8))`?
- Given `df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300))`, which command will filter rows where `B` is greater than 150?
- If `df <- data.frame(A = c(3, 1, 4), B = c(2, 6, 8))`, which command will sort `df` by column `B` in descending order?
- Given `df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))`, which command will select only column `A` from `df`?
- If `df <- data.frame(A = c("apple", "banana", "cherry"), B = c(10, 20, 30))`, which command will create a new column `C` that is the square of `B`?
- What does the command `df |> summarise(avg_B = mean(B))` do for `df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))`?
- If `df <- data.frame(A = c("X", "Y", "X", "Y"), B = c(5, 10, 15, 20))`, which command will calculate the total of `B` for each group in `A`?
- Which `dplyr` function allows you to select columns by their position in the data frame?
- Given `df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300))`, which command will filter rows where `A` is less than or equal to 20?
- If `df <- data.frame(A = c("X", "Y", "Z"), B = c(5, 6, 7))`, which command will create a new column `C` that combines `A` and `B` into a single string?
- Which `dplyr` function would you use to group the data frame `df` by column `A` and then filter out groups where the sum of `B` is less than 10?
- Given `df <- data.frame(A = c(5, 10, 15), B = c("apple", "banana", "cherry"))`, which command will return rows where column `B` is either `"apple"` or `"banana"`?
- If `df <- data.frame(A = c(5, 6, 7), B = c(3, 6, 9))`, which command will create a new column `C` that is the ratio of `B` to `A`?
- What does the command `df |> summarise(avg_A = mean(A, na.rm = TRUE))` do for `df <- data.frame(A = c(1, 2, 3, NA))`?
- Which of the following commands will sort the data frame `df` by column `A` in descending order and then by column `B` in ascending order?
- Given `df <- data.frame(A = c("X", "Y", "X", "Z"), B = c(10, 20, 30, 40))`, which command will filter `df` to only include rows where `A` is either `"X"` or `"Z"`?
- If `df <- data.frame(A = c(10, 20, 30), B = c(100, 200, 300))`, which command will create a new column `C` that is 10% of column `B`?
- Given `df <- data.frame(A = c(1, 2, 3, 4, 5), B = c(10, 20, 30, 40, 50))`, which command will return the rows where column `B` is greater than 25 and less than 45?
- Which `dplyr` function is used to rename an existing column in a data frame?
- If `df <- data.frame(A = c(5, 6, 7), B = c(8, 9, 10))`, which command will return a summary of the total sum of `A` and `B`?
- Which command will group the data frame `df` by column `A` and calculate the mean of column `B` for each group, then filter to only include groups where the mean of `B` is greater than 15?