Tokens and Tokenization

Last reviewed May 28, 2026 Content v20260528
Track mode: none
Means: Read / quiz
Reading: ~1 min
Level: beginner

This lesson

This lesson covers tokens and tokenization as part of a track on generative AI patterns: LLMs, prompting, retrieval, safety, and integration habits for real assistants and copilots.

Token economics drive product margins—measure prompts before launch.

You will apply tokens and tokenization in contexts such as chat products, code assistants, search augmentation, and internal knowledge tools.

Study explanations, case studies, and MCQs—this topic is read/quiz focused without a code runner.

Start this lesson when you can explain the previous lesson's ideas in your own words.

LLMs consume tokens—subword pieces from a vocabulary—not whole words. Billing, context limits, and prompt sizing are all token-based.

Examples

# Illustrative — real counts come from the model tokenizer
text = "unbelievable"
# might split into ["un", "believ", "able"] depending on tokenizer
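
To make the split above concrete, here is a minimal sketch of a greedy longest-match splitter. The vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical names invented for this illustration. Production tokenizers (for example, byte-pair encoding) learn their vocabulary from data and use different rules, so treat this only as a picture of the idea.

```python
# Toy greedy longest-match subword splitter (illustration only).
# TOY_VOCAB is a hypothetical vocabulary, not taken from any real model.
TOY_VOCAB = {"un", "believ", "able", "token", "ize", "s"}


def toy_tokenize(word: str, vocab: set[str]) -> list[str]:
    """Split `word` into the longest vocabulary pieces, left to right.

    Positions not covered by any vocabulary entry fall back to
    single-character pieces, so no input is ever rejected.
    """
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, shrinking until one matches.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No vocabulary entry starts here: emit one character.
            pieces.append(word[start])
            start += 1
    return pieces


print(toy_tokenize("unbelievable", TOY_VOCAB))  # ['un', 'believ', 'able']
print(toy_tokenize("tokenizes", TOY_VOCAB))     # ['token', 'ize', 's']
print(toy_tokenize("xyz", TOY_VOCAB))           # ['x', 'y', 'z']
```

The character fallback in the last call is how a small vocabulary can still cover strings it has never seen, at the cost of more tokens.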

Why subwords

  • Handles rare words without million-entry dictionaries
  • Shares morphemes across languages
  • Code and JSON, full of punctuation and unusual identifiers, can fall back to short or character-level pieces

Practical rules

Use the provider's tokenizer library (for example, OpenAI's tiktoken) or API token counter before production. English averages roughly 4 characters per token; code and non-Latin scripts often need more tokens for the same amount of text.
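
When no tokenizer is at hand, the 4-characters-per-token rule of thumb can give a first estimate. The sketch below assumes that ratio; `estimate_tokens` is a hypothetical helper, and its result is only a planning figure that should be replaced by a real tokenizer count before launch.

```python
import math

# Rule-of-thumb ratio for English text; real ratios vary by tokenizer,
# language, and content type, so this is an assumption, not a measurement.
CHARS_PER_TOKEN_ENGLISH = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN_ENGLISH) -> int:
    """Rough token estimate; rounds up so budgets err on the safe side."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


prompt = "Summarize the following support ticket in two sentences."
print(estimate_tokens(prompt))
```

For code or non-Latin text, passing a smaller `chars_per_token` gives a more cautious estimate.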

Truncation strategy: drop oldest chat turns, summarize history, or retrieve only top-k chunks—not silent mid-word cuts.
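
The first strategy, dropping the oldest chat turns, can be sketched as follows. `trim_history` and `rough_count` are hypothetical names; `rough_count` is a stand-in that assumes about 4 characters per token and should be swapped for the model's real tokenizer.

```python
import math
from typing import Callable


def rough_count(text: str) -> int:
    """Stand-in counter (about 4 characters per token); not a real tokenizer."""
    return math.ceil(len(text) / 4)


def trim_history(
    turns: list[str],
    budget: int,
    count: Callable[[str], int] = rough_count,
) -> list[str]:
    """Keep the most recent whole turns whose combined count fits `budget`."""
    kept = []
    used = 0
    # Walk from newest to oldest so recent context survives.
    for turn in reversed(turns):
        cost = count(turn)
        if used + cost > budget:
            break  # stop at the first turn that does not fit; never cut mid-turn
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "user: What is a token?",
    "assistant: A subword piece from the model's vocabulary.",
    "user: And why does that matter for billing?",
]
print(trim_history(history, budget=30))
```

Whole turns are kept or dropped as units, which avoids the silent mid-word cuts the text warns about.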

Important interview questions and answers

  1. Q: Why isn't one word always one token?
    A: Subword tokenization splits rare or compound strings into several smaller pieces, while common words often remain a single token.

Self-check

  1. What unit do providers bill on?
  2. Why measure prompts before launch?

Pitfall: Pricing surprises. Count tokens on the longest realistic prompt before budgeting.
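
A budget check along these lines might look like the sketch below. The prices are hypothetical placeholders, not any provider's real rates, and `request_cost_usd` is an illustrative helper; it assumes prompt and completion tokens are billed at separate per-million rates.

```python
# All prices below are hypothetical placeholders, not real provider rates.
PROMPT_PRICE_PER_MILLION = 0.50      # USD per 1,000,000 prompt tokens (assumed)
COMPLETION_PRICE_PER_MILLION = 1.50  # USD per 1,000,000 completion tokens (assumed)


def request_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price: float = PROMPT_PRICE_PER_MILLION,
    completion_price: float = COMPLETION_PRICE_PER_MILLION,
) -> float:
    """Cost of one request when prompt and completion tokens are both billed."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = prompt_tokens * prompt_price + completion_tokens * completion_price
    return total / 1_000_000


# Budget from the longest realistic prompt, not the average one.
worst_case = request_cost_usd(prompt_tokens=6000, completion_tokens=1000)
print(f"worst-case request: ${worst_case:.4f}")
print(f"100,000 such requests: ${worst_case * 100_000:.2f}")
```

Multiplying the worst-case figure by expected request volume gives an upper bound to compare against the product's margin.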

Interview prep

Why subwords?

Compact vocabulary handling rare words, morphology, and code fragments.

Billing unit?

Providers bill tokens for prompt + completion—measure before launch.

Interview tip

Can you explain this lesson in 30 seconds without reading notes?