A neural network stacks layers of simple units (neurons) that apply weighted sums and nonlinear activations. Stacking layers lets the network learn hierarchical features—edges in images, then shapes, then objects.
Building blocks
- Input layer — raw features or embeddings
- Hidden layers — learned representations
- Output layer — class logits or regression value
- Activation — ReLU, sigmoid, softmax introduce nonlinearity
- Loss — measures prediction error during training
Forward pass intuition
# Single neuron: weighted sum + activation
def relu(x: float) -> float:
    # ReLU passes positive values through and maps negatives to 0.0
    return max(0.0, x)

weights = [0.3, -0.1, 0.7]
inputs = [1.0, 2.0, 0.5]
# Weighted sum of the inputs (no bias term in this minimal example)
z = sum(w * x for w, x in zip(weights, inputs))
print("activation:", relu(z))

Practice: This snippet is optional; run it locally in Jupyter if helpful. No model training is required for this literacy track.
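To connect the single neuron to the building blocks listed earlier, here is a minimal sketch of one forward pass through an input layer, one hidden layer, and an output layer, followed by a loss. The helper names (`dense`, `softmax`, `cross_entropy`) and all weight and bias values are hypothetical and hand-picked for illustration; a real network would learn its parameters during training.

```python
import math

def relu(x: float) -> float:
    return max(0.0, x)

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus a bias per neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Turn logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Loss: negative log-probability assigned to the correct class."""
    return -math.log(probs[target_index])

# Hypothetical, hand-picked parameters (assumptions, not learned values).
inputs = [1.0, 2.0, 0.5]                          # input layer: 3 raw features
w_hidden = [[0.3, -0.1, 0.7], [-0.5, 0.2, 0.1]]   # 2 hidden neurons
b_hidden = [0.0, 0.1]
w_out = [[1.0, -1.0], [-1.0, 1.0]]                # 2 output classes
b_out = [0.0, 0.0]

hidden = [relu(z) for z in dense(inputs, w_hidden, b_hidden)]  # hidden layer
logits = dense(hidden, w_out, b_out)                           # output layer
probs = softmax(logits)                                        # class probabilities
loss = cross_entropy(probs, target_index=0)                    # assume class 0 is correct

print("hidden:", hidden)
print("probabilities:", probs)
print("loss:", loss)
```

The first hidden neuron reuses the weights from the single-neuron example, so its activation matches that snippet's output up to floating-point rounding.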
Why depth helps (sometimes)
Deeper nets can represent complex functions but need more data, compute, and regularization. Start shallow or use pretrained models before training huge nets from scratch.
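One rough way to see the cost of depth is to count parameters. The sketch below assumes a plain fully connected network; `count_parameters` and the layer sizes are hypothetical choices for illustration, and parameter count is only a loose proxy for how much data and compute a model needs.

```python
def count_parameters(layer_sizes):
    """Count weights plus biases in a fully connected network.

    layer_sizes lists the units per layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # one weight per connection
        total += fan_out           # one bias per neuron
    return total

# Hypothetical layer sizes (assumptions chosen for illustration).
shallow = [784, 128, 10]
deep = [784, 128, 128, 128, 10]

print("shallow:", count_parameters(shallow))  # 101770
print("deep:", count_parameters(deep))        # 134794
```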
Important interview questions and answers
- Q: Why nonlinear activations?
A: Without them, stacked linear layers collapse to one linear transform.
- Q: Parameters in a network?
A: Weights and biases learned by gradient-based optimization.
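The answer about nonlinear activations can be checked numerically. This is a minimal sketch with hypothetical 2x2 weight matrices and no biases: two stacked linear layers give the same output as the single matrix product of their weights, while inserting ReLU between them breaks that equivalence.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def relu_vec(vector):
    return [max(0.0, v) for v in vector]

# Hypothetical weights and input (assumptions chosen for illustration).
w1 = [[1.0, -2.0], [0.5, 1.0]]
w2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 1.0]

two_linear = matvec(w2, matvec(w1, x))           # layer 1, then layer 2
one_linear = matvec(matmul(w2, w1), x)           # single combined layer
with_relu = matvec(w2, relu_vec(matvec(w1, x)))  # ReLU between the layers

print("two linear layers:", two_linear)  # [-0.5, 5.5]
print("one combined layer:", one_linear)  # [-0.5, 5.5]
print("with ReLU between:", with_relu)   # [1.5, 4.5]
```

Adding biases does not change the conclusion: without an activation, the stacked layers still reduce to a single weighted sum plus one combined bias.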
Self-check
- Name the three layer types in a classifier network.
- What does ReLU do to negative values?
Tip: Think layers = learned features; depth helps only with enough data and regularization.
Interview prep
- Q: Why nonlinear activations?
A: Without them, stacked layers collapse to a single linear transform.
- Q: What do hidden layers do?
A: They learn hierarchical representations from raw inputs.