Anatomy of a Prompt — System, User, and Assistant Explained
Understand how system, user, and assistant messages shape LLM behavior.
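A minimal sketch of what this looks like in practice: most chat APIs accept a list of role-tagged messages, where a `system` message sets standing instructions, `user` messages carry the human's turns, and `assistant` messages carry the model's prior replies. The role names follow the OpenAI-style convention; the helper function `build_prompt` here is illustrative, not a library API.

```python
def build_prompt(system: str, history: list[tuple[str, str]], user: str) -> list[dict]:
    """Assemble a messages list: one system message, prior turns, then the new user turn."""
    messages = [{"role": "system", "content": system}]
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": user})
    return messages

prompt = build_prompt(
    system="You are a terse assistant. Answer in one sentence.",
    history=[("What is a token?", "A chunk of text the model reads as one unit.")],
    user="And what is a context window?",
)
print([m["role"] for m in prompt])
# The system message comes first, then alternating user/assistant turns,
# ending with the new user message the model is being asked to answer.
```

Note that the whole conversation, including the system message, is resent on every request: the model has no memory between calls, so "behavior shaping" is just what this list contains.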

Series: Breaking down how LLMs actually work — tokens, embeddings, context windows, and the fundamentals you need before building anything serious.

- Learn how to choose embedding models and dimensions for production RAG systems. Compare OpenAI, Voyage AI, and Google's free options for embeddings.
- Learn what embeddings are, how vector similarity works, and why understanding magnitude vs direction matters for semantic search and RAG systems.
- Learn how Byte Pair Encoding (BPE) actually works — the algorithm that powers GPT, Claude, and LLaMA tokenizers. Step-by-step with examples.
- Learn what tokens really are, why they're not words, and how understanding tokenization saves you money on LLM API costs.