Observable Pipeline
How LLMs Find Answers
Step inside the transformer architecture — from raw text to generated tokens, every layer made visible.
Click a step below to illuminate the corresponding layer
Tokenization
Text → Token IDs
LLMs never read raw characters. Before anything else, your text is split into sub-word tokens using Byte-Pair Encoding (BPE) — the same algorithm family the GPT models use. A word like "backpropagation" may become ["back", "prop", "agation"], depending on how common those substrings are in the training data. Token IDs are just integers indexing a fixed vocabulary: roughly 50,000 entries for GPT-2, about 100,000 for GPT-4.
GPT-4 vocabulary: ~100,000 tokens
"Hello world" = 2 tokens
"Backpropagation" ≈ 3–4 tokens
Token count drives API cost
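The greedy merging behind BPE can be sketched in a few lines. This is a toy illustration, not GPT-4's real tokenizer: the `merges` table here is invented for the example, whereas production tokenizers load merge tables with tens of thousands of entries learned from training data.

```python
# Toy BPE sketch: start from characters, repeatedly merge the
# highest-priority adjacent pair found in a learned merge table.
def bpe_tokenize(word, merges):
    tokens = list(word)  # begin with individual characters
    while True:
        # find the highest-priority (lowest-rank) adjacent pair
        best = None
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            if pair in merges and (best is None or merges[pair] < merges[best]):
                best = pair
        if best is None:
            return tokens  # no more learnable merges apply
        # merge every occurrence of the best pair
        merged, i = [], 0
        while i < len(tokens):
            if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == best:
                merged.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged

# Hypothetical merge table: lower rank = learned earlier = applied first.
merges = {("b", "a"): 0, ("ba", "c"): 1, ("k", "p"): 2}
print(bpe_tokenize("backprop", merges))  # → ['bac', 'kp', 'r', 'o', 'p']
```

A real tokenizer also maps each resulting string to its integer ID via a vocabulary lookup; common substrings get merged into single tokens, which is why frequent words cost fewer tokens than rare ones.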
Live Tokenizer
NInsideN shows the observable application-level pipeline — tokenization, retrieval, context assembly, and streaming. The model's private internal activations and weights are not accessible.