Classic Unix tools—cut, sort, uniq, awk, sed—transform columnar and line-based text without loading a full language runtime.
cut, sort, uniq
printf "b\na\nb\n" | sort | uniq -c
printf "name:Ada\nage:30\n" | cut -d: -f2
uniq only collapses adjacent duplicates, so sort first.
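To make the adjacency rule concrete, here is a minimal sketch that runs the same three-line input through uniq with and without sort. The variable names (input, unsorted, sorted) are only illustrative.

```shell
# Sample input with a non-adjacent duplicate ("b" is on lines 1 and 3).
input=$(printf 'b\na\nb\n')

# Without sorting, no two neighbouring lines match, so nothing collapses.
unsorted=$(printf '%s\n' "$input" | uniq | wc -l | tr -d ' ')

# After sorting, the two "b" lines sit next to each other and collapse into one.
sorted=$(printf '%s\n' "$input" | sort | uniq | wc -l | tr -d ' ')

echo "uniq alone: $unsorted lines; sort | uniq: $sorted lines"
# prints: uniq alone: 3 lines; sort | uniq: 2 lines
```

The tr -d ' ' step is there only because some wc implementations pad the count with spaces.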
awk one-liner
awk -F, '{sum+=$2} END {print sum}' numbers.csv
-F, sets the field separator to a comma; END runs after all lines have been read, which makes it the usual place to print totals.
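The same pattern can be tried without a file by piping sample rows into awk. This is a sketch with made-up data (a label in column 1, a number in column 2); it also shows that the built-in NR holds the number of lines read, which gives an average almost for free.

```shell
# Hypothetical sample rows standing in for numbers.csv.
rows=$(printf 'apples,3\npears,4\nplums,5\n')

# Sum column 2; the END block runs once, after the last line.
total=$(printf '%s\n' "$rows" | awk -F, '{ sum += $2 } END { print sum }')

# NR is the count of lines read, so sum / NR is the mean (guarded against empty input).
mean=$(printf '%s\n' "$rows" | awk -F, '{ sum += $2 } END { if (NR > 0) print sum / NR }')

echo "total=$total mean=$mean"
# prints: total=12 mean=4
```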
sed substitution
sed 's/error/ERROR/g' log.txt | head
s/old/new/g replaces every match on each line; test on a sample before editing files in place (-i).
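One way to test on a sample is to feed sed a single line and compare the result with and without the g flag. The line below is invented for illustration.

```shell
# Hypothetical log line containing the pattern twice.
line='error: disk error on /dev/sda'

# Without g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/error/ERROR/')

# With g, every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/error/ERROR/g')

printf '%s\n%s\n' "$first_only" "$all"
# prints:
# ERROR: disk error on /dev/sda
# ERROR: disk ERROR on /dev/sda
```

Nothing here touches a file. Once the expression behaves as expected, it can be applied in place; note that GNU sed accepts a bare -i while BSD/macOS sed expects a backup suffix argument after it.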
Important interview questions and answers
- Q: Why sort before uniq?
A: uniq only removes consecutive duplicate lines.
- Q: awk vs Python?
A: awk is great for quick column sums; Python wins for complex data structures.
Self-check
- What does uniq -c add?
- What sed expression replaces all occurrences on a line?
Tip: sort | uniq -c is interview bread-and-butter for log summaries.
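A rough sketch of such a log summary, assuming a made-up log where the first space-separated field is the level. The trailing sort -rn, which puts the most frequent value first, is a common addition rather than part of the tip itself.

```shell
# Hypothetical log lines in the form "<level> <message>".
log=$(printf 'ERROR disk full\nINFO started\nERROR disk full\nWARN slow\nERROR timeout\nINFO started\n')

# Keep the level, group identical levels together, count them, highest count first.
summary=$(printf '%s\n' "$log" | cut -d' ' -f1 | sort | uniq -c | sort -rn)

# uniq -c prefixes each line with its count; swap the columns for the top entry.
top=$(printf '%s\n' "$summary" | head -n 1 | awk '{ print $2, $1 }')

echo "$top"
# prints: ERROR 3
```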
Interview prep
- Why sort before uniq?
uniq only removes adjacent duplicate lines.
- What is awk's role?
Column-oriented text processing.