Character references insert symbols without typing raw bytes: named (`&copy;`), decimal (`&#169;`), or hexadecimal (`&#xA9;`).
Reserved characters
`&` begins references; escape it as `&amp;` when a literal ampersand appears in text. `<` starts tags; escape it as `&lt;` when discussing markup inside text nodes.
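As a rough illustration of the two escapes above, the sketch below uses Python's standard `html` module. The sample string and variable names are made up for the example, and Python merely stands in for whatever tool produces your markup.

```python
import html

# Hypothetical text that contains both reserved characters.
raw = "Tom & Jerry <b>bold</b>"

# quote=False escapes only &, < and >, which is what text nodes need.
escaped = html.escape(raw, quote=False)
print(escaped)  # Tom &amp; Jerry &lt;b&gt;bold&lt;/b&gt;

# Decoding the references gives back the original text.
print(html.unescape(escaped) == raw)  # True
```

The ampersand is escaped first, so the `&` that belongs to `&lt;` is not escaped a second time.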
When UTF-8 suffices
Modern editors handle Unicode directly, so entities mainly help in ambiguous contexts (for example, characters that are invisible or easy to confuse in source) or over ASCII-only transports.
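For the ASCII-only case, one possible approach is to let the encoder substitute numeric references for anything outside ASCII. This is a minimal sketch using Python's built-in `xmlcharrefreplace` error handler; the sample text is invented for illustration.

```python
import html

# Hypothetical text with two non-ASCII symbols.
text = "Price: 5 € → ok"

# Characters that do not fit in ASCII become decimal character references.
ascii_safe = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_safe)  # Price: 5 &#8364; &#8594; ok

# A browser (or html.unescape) turns the references back into the symbols.
print(html.unescape(ascii_safe) == text)  # True
```

This handler does not escape `&` or `<`, so reserved characters still need their own escaping step.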
Attributes
If attribute values use double quotes, single quotes inside values need no escaping, while a literal double quote must be written as `&quot;`. Alternating the quoting style to suit the value reduces entity clutter.
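The quoting rule can be sketched as follows, again with Python's standard `html` module standing in for a real template layer. The `title` text and the `abbr` element are hypothetical choices for the example.

```python
import html

# Hypothetical attribute text containing both kinds of quote and an ampersand.
title = 'Say "hi" & don\'t panic'

# For a double-quoted attribute: escape & first, then only the double quote.
value = html.escape(title, quote=False).replace('"', "&quot;")
attr = f'<abbr title="{value}">hi</abbr>'
print(attr)  # <abbr title="Say &quot;hi&quot; &amp; don't panic">hi</abbr>

# html.escape's default (quote=True) also escapes the apostrophe,
# which is safe in either quoting style but adds clutter.
print(html.escape(title))  # Say &quot;hi&quot; &amp; don&#x27;t panic
```

Escaping every quote character is the more defensive choice when the code cannot know which quoting style the surrounding markup uses.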
Numeric references
Decimal and hex forms reference Unicode code points—useful for symbols lacking named entities.
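To make the link between a character and its numeric references concrete, here is a small sketch. The helper name `numeric_refs` is hypothetical, not part of any library.

```python
import html

def numeric_refs(ch: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal references for one character."""
    cp = ord(ch)  # Unicode code point as an integer
    return f"&#{cp};", f"&#x{cp:X};"

dec, hexa = numeric_refs("€")
print(dec, hexa)  # &#8364; &#x20AC;

# Both forms decode to the same character.
print(html.unescape(dec) == html.unescape(hexa) == "€")  # True
```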
Example — named, decimal, and raw Unicode
<p>Copyright &copy; 2026 (&#169;) &mdash; arrow &rarr;</p>
<p>Euro sign in UTF-8: € matches &euro;</p>
Rendered output
Copyright © 2026 (©) — arrow →
Euro sign in UTF-8: € matches €
Escaping vs templating
Most frameworks escape text by default. Double-escaping shows raw entities (such as a visible `&amp;`) to users, while disabling escaping “just this once” opens the door to XSS. Know your stack’s rules.
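The double-escaping symptom is easy to reproduce. In this sketch Python's `html.escape` stands in for a framework's automatic escaping; the input string and variable names are made up for illustration.

```python
import html

# Hypothetical user-supplied text.
user_input = "<b>Tom & Jerry</b>"

once = html.escape(user_input, quote=False)   # what the template should emit
twice = html.escape(once, quote=False)        # escaped again by mistake

print(once)   # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
print(twice)  # &amp;lt;b&amp;gt;Tom &amp;amp; Jerry&amp;lt;/b&amp;gt;

# A browser decodes one level only, so the user sees raw entities.
print(html.unescape(twice) == once)  # True
```

Escaping exactly once, at the point where the value enters the markup, avoids both the visible entities and the temptation to switch escaping off.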
Important interview questions and answers
- Q: What is the safest default character encoding for modern HTML?
A: UTF-8, declared early with `<meta charset="utf-8">` and matched by server `Content-Type` headers.
- Q: When are HTML entities still useful in UTF-8 pages?
A: For reserved characters (`&`, `<`) and contexts where explicit escaping avoids parser ambiguity.
- Q: What is the key difference between HTML5 parsing and XHTML parsing?
A: HTML5 recovers from many errors using defined rules; XHTML, parsed as XML, treats any well-formedness error as fatal.
Pitfall: Prefer UTF-8 and literal characters over entity soup when possible.