Sharding partitions a collection's data across multiple shards according to a shard key, giving horizontal scale when a single replica set is not enough.
Components
- mongos — query router
- config servers — cluster metadata
- shards — replica sets holding data chunks
Shard key choice
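Prefer a key with high cardinality and even distribution. Avoid a monotonically increasing key such as _id on its own: every new insert lands on the same hot shard unless the key is hashed.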
// Hashed shard key example concept:
sh.shardCollection('practice.events', { userId: 'hashed' })
Practice: Concepts apply to Atlas and self-hosted; try read-only commands in mongosh where safe.
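The hot-shard risk can be illustrated with a small simulation in plain JavaScript (no cluster needed). This is a sketch under stated assumptions: the shard count, document counts, and both placement functions are hypothetical, and FNV-1a with a modulo step only stands in for MongoDB's own hashed index and chunk placement.

```javascript
// Toy model only: MongoDB's hashed indexes use their own hash function and
// chunk-based placement; FNV-1a and the modulo step below are stand-ins.
const SHARD_COUNT = 4;       // hypothetical cluster size
const EXISTING_DOCS = 10000; // keys 0..9999 are already stored
const NEW_DOCS = 1000;       // fresh inserts with ever-increasing keys

// Ranged placement: each shard owns one contiguous slice of the existing keys,
// and the last shard also owns everything above the current maximum.
function rangedShardFor(key) {
  const sliceSize = EXISTING_DOCS / SHARD_COUNT;
  return Math.min(Math.floor(key / sliceSize), SHARD_COUNT - 1);
}

// 32-bit FNV-1a over the key's decimal string.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hashed placement: the shard is picked from the hash, not from the key order.
function hashedShardFor(key) {
  return fnv1a(String(key)) % SHARD_COUNT;
}

// Count how many of the new, monotonically increasing keys each shard receives.
function countInserts(shardFor) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (let key = EXISTING_DOCS; key < EXISTING_DOCS + NEW_DOCS; key++) {
    counts[shardFor(key)]++;
  }
  return counts;
}

// Every new key is above the old maximum, so the ranged model sends all of
// them to the last shard; the hashed model spreads them over all shards.
const rangedCounts = countInserts(rangedShardFor);
const hashedCounts = countInserts(hashedShardFor);
console.log('ranged:', rangedCounts);
console.log('hashed:', hashedCounts);
```

In this model the ranged counts pile up on one shard while the hashed counts reach every shard, which is the trade-off the hashed shard key example is meant to address.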
Important interview questions and answers
- Q: Shard key immutable?
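A: Generally fixed at shardCollection time, so a bad key is expensive to fix. Newer versions can reshard (reshardCollection, MongoDB 5.0+), but it is a heavy operation.
- Q: Targeted vs scatter-gather?
A: Queries that match on the shard key can be routed to only the shard or shards owning those values; queries without it fan out to every shard.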
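The targeted versus scatter-gather distinction can be sketched as a toy router in plain JavaScript. Everything here is an assumption for illustration: the chunk table, the shard names, and the targetShards helper are hypothetical, ranged chunks are used for readability, and only equality on the shard key is handled.

```javascript
// Toy model of mongos routing; the chunk table, shard names, and query shape
// are hypothetical and far simpler than real cluster metadata.
const SHARD_KEY = 'userId';

// Chunk ranges on the shard key, as config servers might describe them:
// each chunk covers [min, max) and lives on exactly one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: 'shardA' },
  { min: 1000, max: 2000, shard: 'shardB' },
  { min: 2000, max: Infinity, shard: 'shardC' },
];

// Return the list of shards a query filter has to be sent to.
function targetShards(filter) {
  const allShards = [...new Set(chunks.map((c) => c.shard))];
  const value = filter[SHARD_KEY];
  if (!Number.isFinite(value)) {
    return allShards; // no usable shard key value: scatter-gather
  }
  const owner = chunks.find((c) => value >= c.min && value < c.max);
  return [owner.shard]; // shard key equality: targeted at one shard
}

console.log(targetShards({ userId: 1500 }));     // filter includes the shard key
console.log(targetShards({ status: 'active' })); // filter lacks the shard key
```

The first call resolves to the single shard whose chunk contains 1500, while the second has no shard key to look up and so returns every shard.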
Self-check
- What does mongos do?
- Monotonic _id shard risk?
Tip: Choose shard key before data size makes resharding painful.
Interview prep
- Q: mongos?
A: Routes queries to correct shards.
- Q: Shard key?
A: Determines data distribution across shards.