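Use utf8mb4 with a sensible collation (e.g. utf8mb4_unicode_ci) so that emoji and every other Unicode character store correctly. MySQL's legacy utf8 (an alias for utf8mb3) holds at most 3 bytes per character, so it cannot store characters outside the Basic Multilingual Plane, such as most emoji.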
Database and table defaults
CREATE DATABASE practice
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;
-- Select the new database so the table is created inside it.
USE practice;
CREATE TABLE users (
  email VARCHAR(255) NOT NULL,
  display_name VARCHAR(255) NOT NULL
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Practice: Run these statements in the mysql client so the users table ends up in the practice database.
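To confirm that the defaults took effect, one option is to query information_schema. This is a minimal sketch that assumes the practice database and users table from the statements above already exist; it only reads metadata and changes nothing.

```sql
-- Check the default charset and collation recorded for the practice database.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'practice';

-- Check the charset and collation of each text column in users.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'practice'
  AND TABLE_NAME = 'users';
```

Both queries should report utf8mb4 and utf8mb4_unicode_ci if the definitions were applied as written.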
Collation affects sorting
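A collation defines how strings are ordered and compared. _ci collations are case-insensitive for equality, so 'A' and 'a' compare as equal and collide under a UNIQUE constraint. Choose the collation explicitly for indexed columns and UNIQUE constraints.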
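The effect on equality can be seen by comparing the same two strings under a _ci and a _bin collation. This is a small sketch: the _utf8mb4 introducer pins each literal's charset so the result does not depend on the connection settings, and the api_tokens table is a hypothetical example of choosing a case-sensitive collation on purpose (it assumes a database is already selected).

```sql
-- Same strings, two collations: case-insensitive vs. binary comparison.
SELECT
  _utf8mb4'Alice' = _utf8mb4'alice' COLLATE utf8mb4_unicode_ci AS ci_equal,   -- 1
  _utf8mb4'Alice' = _utf8mb4'alice' COLLATE utf8mb4_bin        AS bin_equal;  -- 0

-- Hypothetical table: tokens must be compared byte for byte,
-- so the column overrides the table default with utf8mb4_bin.
CREATE TABLE api_tokens (
  token VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  UNIQUE KEY uq_api_tokens_token (token)
);
```

With utf8mb4_bin on the column, 'AbC' and 'abc' are distinct values for the UNIQUE key; under utf8mb4_unicode_ci the second insert would be rejected as a duplicate.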
Connection charset
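Set the client charset to utf8mb4 in PDO (charset=utf8mb4 in the DSN) or Laravel (the charset key of the connection in config/database.php) so the server interprets the bytes on the wire in the same encoding the tables use. A mismatch produces mojibake, a common production bug.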
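From the mysql client, the connection charset can be set and inspected directly. The sketch below assumes an interactive session; application frameworks normally issue the equivalent of SET NAMES on connect, based on their own configuration.

```sql
-- Declare that this session sends and expects utf8mb4.
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;

-- character_set_client, character_set_connection and character_set_results
-- should now all report utf8mb4.
SHOW VARIABLES LIKE 'character\_set\_%';
SHOW VARIABLES LIKE 'collation\_connection';
```

If any of the three session variables still shows a different charset, the server converts data on the way in or out, which is where mangled characters usually originate.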
Important interview questions and answers
- Q: utf8 vs utf8mb4?
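A: In MySQL, utf8 is an alias for utf8mb3, which stores at most 3 bytes per character; utf8mb4 covers all of Unicode.
- Q: Collation on email UNIQUE?
A: Whether the constraint is case-sensitive depends on the column's collation, so test login lookups with mixed-case addresses.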
Self-check
- Why utf8mb4 for new projects?
- What does COLLATE control?
Tip: As an emoji test, insert 🎉 and verify that it round-trips through the app unchanged.
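The same round-trip can be checked at the SQL level before involving the app. This sketch assumes the users table defined earlier and a session whose connection charset is utf8mb4; the email address is a made-up test value.

```sql
USE practice;

-- Store a single emoji as the display name.
INSERT INTO users (email, display_name)
VALUES ('emoji-test@example.com', '🎉');

-- 🎉 is U+1F389, which encodes to the four bytes F0 9F 8E 89 in UTF-8.
SELECT display_name,
       HEX(display_name)         AS bytes_hex,   -- F09F8E89 when stored intact
       CHAR_LENGTH(display_name) AS char_count,  -- 1
       LENGTH(display_name)      AS byte_count   -- 4
FROM users
WHERE email = 'emoji-test@example.com';

-- Remove the test row again.
DELETE FROM users WHERE email = 'emoji-test@example.com';
```

A bytes_hex value of 3F means the character was replaced by a question mark somewhere between the client and the column, which points at the connection or column charset.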
Interview prep
- utf8 trap?
MySQL utf8 stores at most 3 bytes per character; utf8mb4 covers all of Unicode.
- Collation?
Defines sort/compare rules including case sensitivity.