R represents missing values with NA, which propagates through calculations unless handled. Use is.na(), na.omit(), and explicit imputation strategies to deal with it.
NA behavior
x <- c(10, NA, 30)
print(mean(x))                # NA: the missing value propagates
print(mean(x, na.rm = TRUE))  # 20: NA is dropped before averaging
print(is.na(x))               # FALSE TRUE FALSE
Handling strategies
- Remove: na.omit(df) or complete.cases()
- Impute: mean/median/model-based (document choices in reports)
- Flag: keep NA and let models handle with na.action
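The three strategies above can be sketched on a small data frame. The data frame `df` and its columns `age` and `income` are hypothetical names made up for illustration, and median imputation and `na.exclude` are just one possible choice for each strategy, not a recommendation.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(
  age    = c(25, NA, 40, 31, 52),
  income = c(50000, 62000, NA, 48000, 75000)
)

# Remove: drop every row that contains at least one NA
df_removed <- na.omit(df)
# Same rows selected with a logical index
df_complete <- df[complete.cases(df), ]

# Impute: replace missing ages with the median of the observed ages
df_imputed <- df
df_imputed$age[is.na(df_imputed$age)] <- median(df$age, na.rm = TRUE)

# Flag: keep the NA values and tell the model how to treat them
fit <- lm(income ~ age, data = df, na.action = na.exclude)

print(nrow(df_removed))        # 3 rows have no missing values
print(df_imputed$age)          # 25.0 35.5 40.0 31.0 52.0
print(length(residuals(fit)))  # 5: na.exclude pads dropped rows with NA
```

With `na.exclude`, residuals and fitted values keep the original row count, which makes it easier to attach them back to `df`; the default `na.omit` would return only the rows that were actually used.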
Important interview questions and answers
- Q: NA vs NULL?
A: NA is a missing value placeholder in atomic vectors; NULL means absence of an object.
- Q: Why na.rm = TRUE?
A: Many summary functions return NA if any input is NA unless you opt to remove them.
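A short sketch may help make the NA vs NULL distinction concrete. The variable names `v` and `w` are arbitrary and chosen only for this example.

```r
# NA is a value: it occupies a slot in the vector
v <- c(1, NA, 3)
print(length(v))             # 3

# NULL is the absence of an object: it vanishes when combined
w <- c(1, NULL, 3)
print(length(w))             # 2

print(length(NA))            # 1
print(length(NULL))          # 0
print(is.na(NA))             # TRUE
print(is.null(NULL))         # TRUE

# NA is typed: logical by default, with variants for other atomic types
print(class(NA))             # "logical"
print(class(NA_integer_))    # "integer"
print(class(NA_character_))  # "character"
```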
Self-check
- What does mean(c(1, NA, 3)) return without na.rm?
- What function tests for NA?
Tip: NA == NA is NA, so always use is.na() for tests.
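The tip can be checked directly. This is a minimal sketch reusing the vector from the NA behavior example; `value` is a name introduced here for illustration.

```r
x <- c(10, NA, 30)

print(NA == NA)         # NA, not TRUE
print(x == NA)          # NA NA NA: comparison cannot locate the missing entry
print(is.na(x))         # FALSE TRUE FALSE
print(anyNA(x))         # TRUE
print(which(is.na(x)))  # 2
print(sum(is.na(x)))    # 1 missing value

# An NA condition makes if() fail with an error, so test with is.na() first
value <- x[2]
if (is.na(value)) {
  message("value is missing")
}
```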
Interview prep
- NA propagation?
Most operations return NA if any input is NA, unless you pass na.rm = TRUE (for summary functions such as mean() and sum()) or handle the missing values explicitly.
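The following sketch shows propagation, the `na.rm` opt-out, and a few cases where the result does not depend on the missing value. Replacing NA with 0 at the end is only an illustrative choice of explicit handling, and `x_filled` is a name made up for this example.

```r
x <- c(2, NA, 5)

print(x + 1)                 # 3 NA 6: arithmetic propagates NA element-wise
print(sum(x))                # NA
print(sum(x, na.rm = TRUE))  # 7
print(max(x, na.rm = TRUE))  # 5

# Exceptions: the answer is known whatever the missing value is
print(NA & FALSE)            # FALSE
print(NA | TRUE)             # TRUE
print(NA^0)                  # 1

# Explicit handling: substitute a default before computing
x_filled <- ifelse(is.na(x), 0, x)
print(mean(x_filled))        # 7/3, about 2.33
```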