Subsetting extracts parts of vectors, matrices, and data frames with [ ], logical masks, and which(). Master this before dplyr shortcuts.
Logical subsetting
df <- data.frame(name = c("Ada", "Lin"), score = c(92, 88))
high <- df[df$score >= 90, ]
print(high)
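To see what the row index is doing, it can help to look at the mask on its own. The sketch below uses a hypothetical three-row data frame called scores (not the df above) and shows the mask, the filtered rows, and a compound condition.

```r
# Hypothetical three-row data frame, for illustration only
scores <- data.frame(name = c("Ada", "Lin", "Sam"), score = c(92, 88, 95))

# The comparison yields one TRUE/FALSE per row: TRUE, FALSE, TRUE
mask <- scores$score >= 90
print(mask)

# Rows where the mask is TRUE are kept; the empty column slot keeps all columns
high <- scores[mask, ]
print(high$name)   # the names Ada and Sam

# Conditions combine element-wise with & (and) and | (or)
top_not_ada <- scores[scores$score >= 90 & scores$name != "Ada", ]
print(top_not_ada$name)   # the name Sam
```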
which() and names
idx <- which(df$score >= 90)
print(df[idx, , drop = FALSE])
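drop = FALSE guarantees the result stays a data frame. A row subset of a multi-column data frame such as df is already a data frame, so the argument is only a safeguard here; it matters when the result has a single column, as in df[, "score"], which otherwise collapses to a vector.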
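which() differs from a raw logical mask in two ways that are easy to miss: it drops NA, and it carries names along. A minimal sketch, assuming a hypothetical named vector called scores with one missing value:

```r
# Hypothetical named score vector with one missing value
scores <- c(Ada = 92, Lin = NA, Sam = 95)

mask <- scores >= 90          # TRUE, NA, TRUE (names are kept)
idx  <- which(mask)           # integer positions of TRUE only; NA is dropped

print(idx)                    # positions 1 and 3, named Ada and Sam
print(names(idx))             # "Ada" "Sam"

# Indexing with the raw mask keeps an NA slot; indexing with idx does not
print(length(scores[mask]))   # 3
print(length(scores[idx]))    # 2
```

With a data frame the same applies to rows: an NA in the mask produces a row of NAs, while which() skips it.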
Important interview questions and answers
- Q: [ ] vs [[ ]]
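A: [ ] generally preserves the container (a data frame stays a data frame, a list stays a list); [[ ]] extracts a single element or column without its container.
- Q: Logical length match?
A: A logical mask shorter than the number of rows is recycled silently, so make sure its length equals nrow(df) when subsetting data frame rows.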
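Both answers can be checked in a few lines. The sketch below assumes a hypothetical four-row data frame called df4; the stopifnot() guard at the end is one possible way to catch accidental recycling, not the only one.

```r
# Hypothetical four-row data frame, for illustration only
df4 <- data.frame(id = 1:4, score = c(92, 88, 95, 70))

col_df  <- df4["score"]      # [ ] keeps the container: a one-column data frame
col_vec <- df4[["score"]]    # [[ ]] extracts the column itself: a numeric vector
print(class(col_df))         # "data.frame"
print(class(col_vec))        # "numeric"

# A length-2 mask on 4 rows is recycled to TRUE, FALSE, TRUE, FALSE
recycled <- df4[c(TRUE, FALSE), ]
print(recycled$id)           # 1 3

# Guard against accidental recycling by checking the length first
mask <- df4$score >= 90
stopifnot(length(mask) == nrow(df4))
```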
Self-check
- What does df[df$score >= 90, ] do?
- Why use drop = FALSE?
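Pitfall: Selecting a single column with df[, "score"] drops to a vector and loses the column name. Use drop = FALSE on single-column selections, and on single-row subsets of a matrix or a one-column data frame.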
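A short sketch of when dropping actually happens, reusing the two-row df from the first example. The matrix m is a hypothetical addition for comparison.

```r
df <- data.frame(name = c("Ada", "Lin"), score = c(92, 88))

one_row  <- df[df$score >= 90, ]          # still a data frame (two columns)
col_drop <- df[, "score"]                 # collapses to a bare numeric vector
col_keep <- df[, "score", drop = FALSE]   # stays a one-column data frame

print(is.data.frame(one_row))    # TRUE
print(is.data.frame(col_drop))   # FALSE
print(names(col_keep))           # "score"

# Hypothetical matrix: a single-row subset drops its dimensions too
m <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("a", "b")))
print(dim(m[1, ]))                 # NULL: now a plain vector
print(dim(m[1, , drop = FALSE]))   # 1 2
```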
Interview prep
- drop = FALSE?
Keeps the result a data frame (or matrix) when a subset would otherwise collapse to a vector, as with a single-column selection, so the column name survives.