LLM & Generative AI Glossary
109 Key Terms Explained (A–Z)
A practical reference covering AI, large language models, prompt engineering, retrieval, automation, and AI search — written for marketers, founders, and product teams. Each term has a plain-language definition, why it matters, and a real-world example you can use today.
Built for marketers, SEO teams, founders, and AI practitioners
109 terms
A · 11 TERMS
AI Agents
Agentic AI
Agentic AI describes systems where a language model plans steps, calls tools, and acts toward a goal with limited human input. The model decides what to do next, executes actions, observes results, and adjusts its plan until the task is finished.
WHY IT MATTERS
Agentic AI turns chatbots into workers that can complete multi-step business tasks end to end.
SIMPLE EXAMPLE
A marketing agent researches competitors, drafts a brief, and schedules a blog post without manual handoffs.
AI Agents
AI Agent
An AI agent is software that uses a large language model as its reasoning core and connects to tools, memory, and APIs to complete tasks. It interprets a goal, chooses actions, runs them, and returns results to a user or another system.
WHY IT MATTERS
AI agents replace fragmented automations with one system that can plan and act across tools.
SIMPLE EXAMPLE
An SEO agent audits a site, finds broken pages, and opens fix tickets in Jira automatically.
AI Safety
AI Alignment
AI alignment is the practice of training and constraining AI systems so their behavior matches human goals, values, and safety expectations. It combines data choices, fine-tuning, reinforcement learning from human feedback, and policy rules that shape model outputs.
WHY IT MATTERS
Alignment determines whether an AI helps users safely or produces harmful, biased, or off-brand answers.
SIMPLE EXAMPLE
A support copilot is aligned to refuse legal advice and always escalate refund disputes to a human.
LLM Basics
AI Copilot
An AI copilot is an assistant embedded inside a product or workflow that helps a user complete tasks faster. It uses a language model plus context from the host app to suggest text, code, queries, or actions the user can accept, edit, or reject.
WHY IT MATTERS
Copilots increase output per employee without forcing teams to switch tools or learn new interfaces.
SIMPLE EXAMPLE
A sales copilot drafts follow-up emails inside the CRM using the latest call notes.
AI Safety
AI Governance
AI governance is the set of policies, roles, and controls that decide how an organization builds, deploys, and monitors AI systems. It covers data use, model selection, risk reviews, human oversight, auditing, and compliance with internal and external regulations.
WHY IT MATTERS
Governance lets teams scale AI usage without exposing the business to legal, brand, or security risk.
SIMPLE EXAMPLE
A bank requires every customer-facing prompt and dataset to be reviewed before production use.
AI Safety
AI Hallucination
An AI hallucination is a confident output from a language model that is factually wrong, fabricated, or unsupported by its sources. It happens because models predict likely text patterns, not verified facts, and may invent names, numbers, citations, or events.
WHY IT MATTERS
Hallucinations damage trust and can cause real legal, financial, or brand harm in customer-facing tools.
SIMPLE EXAMPLE
A chatbot invents a refund policy that does not exist in the company knowledge base.
LLM Basics
AI Model
An AI model is a trained mathematical system that maps inputs to outputs after learning patterns from data. In generative AI, the model is usually a neural network that has learned to produce text, images, audio, code, or structured data from a prompt.
WHY IT MATTERS
The model you choose sets the ceiling for quality, cost, latency, and safety in any AI feature.
SIMPLE EXAMPLE
A team picks a smaller model for autocomplete and a larger model for long-form report generation.
Automation
AI Orchestration
AI orchestration is the layer that coordinates models, prompts, tools, data sources, and steps inside a single workflow. It manages routing, retries, memory, and handoffs so a complex task runs reliably across multiple components instead of one isolated prompt.
WHY IT MATTERS
Orchestration is what turns demos into production AI systems that scale and stay debuggable.
SIMPLE EXAMPLE
An onboarding workflow pulls user data, calls a model, sends an email, and logs the result.
AI Safety
AI Safety
AI safety is the field focused on preventing AI systems from causing harm to people, organizations, or society. It includes technical work on alignment, robustness, and evaluation as well as operational practices like red teaming, monitoring, and incident response.
WHY IT MATTERS
Safety practices decide whether an AI feature is shipped, gated, or pulled after launch.
SIMPLE EXAMPLE
A team red-teams a new chatbot to find prompts that leak personal data before release.
LLM Basics
Artificial General Intelligence
Artificial general intelligence, or AGI, refers to a hypothetical AI system that can perform any intellectual task a human can, across domains, without retraining. Today's models are narrow and capable in specific tasks but do not yet meet a widely accepted definition of AGI.
WHY IT MATTERS
AGI shapes long-term strategy, regulation, and investment even though current systems are far from it.
SIMPLE EXAMPLE
Executives plan AI roadmaps assuming current narrow models, not speculative AGI capabilities.
Model Training
Attention Mechanism
The attention mechanism is the part of a transformer that lets the model weigh which input tokens matter most when predicting the next token. It assigns scores between tokens so context, not just position, drives the output.
WHY IT MATTERS
Attention is the core innovation that made modern LLMs possible and powerful at long-context tasks.
SIMPLE EXAMPLE
When summarizing a contract, attention helps the model focus on clauses about payment terms.
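The score-and-weigh step can be sketched with toy two-number vectors; real models use learned, high-dimensional query and key vectors, and the token labels here are only illustrative:

```python
import math

# Toy attention scoring: dot the query with each token's key vector,
# then softmax the scores into weights that sum to 1.
def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

query = [1.0, 0.0]                                 # what the model is "looking for"
keys = {"payment": [0.9, 0.1], "weather": [0.0, 1.0]}

scores = [sum(q * k for q, k in zip(query, kv)) for kv in keys.values()]
weights = softmax(scores)
print(dict(zip(keys, [round(w, 2) for w in weights])))  # payment gets most weight
```

The "payment" token scores higher against this query, so it receives most of the attention weight.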
B · 5 TERMS
Model Training
Backpropagation
Backpropagation is the algorithm used to train neural networks by sending the prediction error backward through the network and updating the weights. It calculates how much each weight contributed to the error and adjusts it to reduce future errors.
WHY IT MATTERS
Backpropagation is what makes large models learnable in practice on huge datasets.
SIMPLE EXAMPLE
During training, a model improves its next-word predictions after each batch using backpropagation.
LLM Basics
Batch Inference
Batch inference runs many model predictions at once instead of one request at a time. It groups inputs, sends them through the model in parallel, and returns results together, which lowers cost per request and increases throughput for non-interactive workloads.
WHY IT MATTERS
Batch inference is how teams keep AI features affordable at scale.
SIMPLE EXAMPLE
An ecommerce team scores ten thousand product descriptions for SEO quality overnight in one batch.
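The grouping itself is simple; this sketch shows only the batching step, with the model call omitted (`make_batches` is an illustrative helper, not a library function):

```python
# Batch the inputs so a model endpoint scores many items per call
# instead of issuing one request per item.
def make_batches(items, batch_size):
    """Split a list of inputs into fixed-size batches."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

descriptions = [f"Product {n} description" for n in range(10)]
batches = make_batches(descriptions, batch_size=4)
print([len(b) for b in batches])  # [4, 4, 2] -> 3 calls instead of 10
```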
Model Training
Benchmark Evaluation
Benchmark evaluation tests a model on standardized tasks and datasets to compare its performance against other models. Common benchmarks measure reasoning, coding, math, knowledge, safety, and instruction following with public scores teams use to choose models.
WHY IT MATTERS
Benchmarks help teams pick models on evidence rather than marketing claims.
SIMPLE EXAMPLE
A team compares two LLMs on a coding benchmark before choosing one for a developer copilot.
AI Safety
Bias in AI
Bias in AI is a systematic error in model outputs that unfairly favors or disadvantages certain groups, topics, or viewpoints. It usually comes from skewed training data, labeling choices, or alignment decisions and can appear in language, ranking, or recommendations.
WHY IT MATTERS
Bias creates legal, ethical, and brand risk and degrades the quality of AI-driven decisions.
SIMPLE EXAMPLE
A hiring screener trained on past resumes downgrades candidates from underrepresented schools.
LLM Basics
Big Language Model
A big language model is a colloquial term for a large language model with billions of parameters trained on huge text corpora. The terms are used interchangeably to describe general-purpose models that can read, write, summarize, and reason over text.
WHY IT MATTERS
Naming aside, model size strongly influences quality, cost, and latency tradeoffs.
SIMPLE EXAMPLE
A founder asks whether to use a big language model or a small fine-tuned model for support.
C · 8 TERMS
Prompt Engineering
Chain-of-Thought Prompting
Chain-of-thought prompting asks a model to show its reasoning step by step before giving a final answer. It improves accuracy on math, logic, and multi-step tasks because the model commits to intermediate conclusions instead of guessing the result in one jump.
WHY IT MATTERS
Chain-of-thought prompts often raise quality without changing the model or training data.
SIMPLE EXAMPLE
A pricing prompt instructs the model to list assumptions first, then compute the final quote.
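A minimal sketch of such a prompt; the wording is illustrative, not a required template, and the model call itself is omitted:

```python
# Chain-of-thought prompt builder: the instruction asks for intermediate
# reasoning before the final answer.
def cot_prompt(question: str) -> str:
    return (
        "Answer the question below.\n"
        "First, list your assumptions and reasoning step by step.\n"
        "Then give the final answer on a line starting with 'Answer:'.\n\n"
        f"Question: {question}"
    )

print(cot_prompt("A plan costs $40/month with a 15% annual discount. "
                 "What is the yearly price?"))
```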
RAG & Retrieval
Chunking
Chunking is the process of splitting long documents into smaller pieces so they can be embedded and retrieved by a language model. Good chunking preserves meaning by respecting headings, sentences, or semantic boundaries instead of cutting text by raw character count.
WHY IT MATTERS
Chunk quality directly controls how well retrieval-augmented generation answers a question.
SIMPLE EXAMPLE
A help center splits articles by H2 sections so the model retrieves the exact relevant paragraph.
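A minimal sketch of heading-aware chunking, assuming Markdown articles with H2 sections (production pipelines usually add overlap and size limits on top of this):

```python
import re

# Split a Markdown article at H2 headings so each chunk keeps one
# self-contained section, rather than cutting at a raw character count.
def chunk_by_h2(markdown: str) -> list[str]:
    parts = re.split(r"(?m)^(?=## )", markdown)   # zero-width split before each "## "
    return [p.strip() for p in parts if p.strip()]

article = """## Refunds
Refunds are issued within 14 days.

## Shipping
Orders ship in 2-3 business days.
"""
chunks = chunk_by_h2(article)
for chunk in chunks:
    print(chunk.splitlines()[0])  # "## Refunds", then "## Shipping"
```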
LLM Basics
Completion
A completion is the text a language model generates in response to a prompt. The model continues the input by predicting the most likely next tokens until it reaches a stop condition, a length limit, or an end-of-message signal.
WHY IT MATTERS
Completion is the fundamental output unit billed and measured in most LLM APIs.
SIMPLE EXAMPLE
Sending a product brief returns a completion containing a draft landing-page headline.
Prompt Engineering
Context Engineering
Context engineering is the discipline of designing what information goes into a model's context window for each request. It includes choosing system prompts, examples, retrieved documents, tool outputs, and user history so the model has exactly what it needs and nothing else.
WHY IT MATTERS
Strong context engineering often beats prompt tweaks for accuracy, cost, and latency.
SIMPLE EXAMPLE
A support agent injects only the customer's plan and last three tickets, not the full account history.
LLM Basics
Context Window
The context window is the maximum amount of text, measured in tokens, that a model can consider at once. It includes the system prompt, user input, retrieved documents, prior turns, and the response. Anything outside the window is invisible to the model.
WHY IT MATTERS
The context window limits how much knowledge you can pass to a model in one call.
SIMPLE EXAMPLE
A long sales call transcript must be summarized in chunks because it exceeds the model's context window.
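A sketch of budget-aware truncation. Real systems count tokens with the model's own tokenizer; whitespace-separated words are only a crude stand-in here, and the budget number is arbitrary:

```python
# Drop the oldest conversation turns until the whole request fits
# inside a fixed "token" budget.
def fit_to_window(system: str, history: list[str], user: str, budget: int) -> list[str]:
    def words(text):
        return len(text.split())   # crude stand-in for a real tokenizer
    fixed = words(system) + words(user)   # system prompt and user input always stay
    kept = list(history)
    while kept and fixed + sum(words(t) for t in kept) > budget:
        kept.pop(0)                       # oldest turn is the first to go
    return kept

history = ["turn one is here", "turn two is here", "turn three is here"]
print(fit_to_window("be brief", history, "final question", budget=12))
```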
LLM Basics
Conversational AI
Conversational AI is software that holds multi-turn dialogues with users in natural language. It typically combines a language model, memory of recent turns, intent handling, and access to tools or knowledge so the conversation stays useful across many messages.
WHY IT MATTERS
Conversational AI is the main interface layer for support, sales, and internal copilots.
SIMPLE EXAMPLE
A bank chatbot remembers your earlier balance question when you next ask about transfer fees.
Model Training
Corpus
A corpus is a structured collection of text used to train, fine-tune, or evaluate a language model. Corpora may include books, web pages, code, conversations, or domain documents and are usually cleaned, deduplicated, and filtered before training begins.
WHY IT MATTERS
The corpus shapes what a model knows, how it writes, and where it has blind spots.
SIMPLE EXAMPLE
A legal LLM is fine-tuned on a corpus of contracts and case law instead of general web text.
LLM Basics
Custom GPT
A custom GPT is a configured version of a general LLM with a specific system prompt, instructions, knowledge files, and optional tools. It behaves like a specialized assistant for a single use case without retraining the underlying model.
WHY IT MATTERS
Custom GPTs let non-engineers ship branded AI assistants in hours, not weeks.
SIMPLE EXAMPLE
A marketing team builds a custom GPT that writes only in their tone-of-voice guidelines.
D · 6 TERMS
Model Training
Data Annotation
Data annotation is the process of labeling raw data so models can learn from it. Humans or tools tag text, images, or audio with categories, spans, ratings, or relationships, producing the ground truth used in training and evaluation.
WHY IT MATTERS
High-quality annotation often matters more than model choice for task performance.
SIMPLE EXAMPLE
Reviewers label support tickets as billing, technical, or sales to train an intent classifier.
Model Training
Data Augmentation
Data augmentation expands a training dataset by creating modified versions of existing examples. Techniques include paraphrasing text, translating and back-translating, or generating synthetic variations so the model sees more diversity without collecting new raw data.
WHY IT MATTERS
Augmentation reduces overfitting and improves performance when labeled data is scarce.
SIMPLE EXAMPLE
A team paraphrases customer questions to give an intent classifier more training variety.
Model Training
Dataset
A dataset is a curated collection of examples used to train, fine-tune, or evaluate a model. It typically includes inputs, labels, and metadata, organized into training, validation, and test splits so model performance can be measured fairly.
WHY IT MATTERS
Dataset quality, size, and split design control whether evaluation results are trustworthy.
SIMPLE EXAMPLE
A team holds out 10 percent of labeled support tickets as a test set for the new classifier.
Model Training
Deep Learning
Deep learning is a branch of machine learning that uses multi-layer neural networks to learn patterns directly from raw data. The depth of the network lets the system learn hierarchical features, which is why it powers modern vision, speech, and language models.
WHY IT MATTERS
Deep learning is the foundation under every current generative AI system.
SIMPLE EXAMPLE
A computer vision model uses deep learning to detect product defects on a factory line.
LLM Basics
Diffusion Model
A diffusion model is a generative model that learns to reverse a noise process. It starts from random noise and gradually denoises it into a coherent output, which is how most modern image, video, and audio generators work.
WHY IT MATTERS
Diffusion models drive the visual side of generative AI, alongside LLMs for text.
SIMPLE EXAMPLE
A marketing team generates campaign hero images from a brief using a diffusion model.
Model Training
Distillation
Distillation is a training technique where a smaller student model learns to imitate a larger teacher model. The student keeps most of the teacher's quality while running faster and cheaper, making it suitable for production and on-device use.
WHY IT MATTERS
Distillation is one of the main ways to ship LLM features at low latency and cost.
SIMPLE EXAMPLE
A company distills a large model into a small one that runs autocomplete inside its mobile app.
E · 4 TERMS
RAG & Retrieval
Embedding
An embedding is a vector of numbers that represents the meaning of a piece of text, image, or other data. Items with similar meaning sit close together in vector space, which lets systems compare, search, and cluster content by semantic similarity.
WHY IT MATTERS
Embeddings power semantic search, RAG, recommendations, and most modern AI search features.
SIMPLE EXAMPLE
A help center embeds every article so user questions retrieve the closest matching answers.
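The "close together in vector space" comparison is usually cosine similarity. A sketch with toy three-number vectors (real embeddings have hundreds or thousands of dimensions, but the math is the same):

```python
import math

# Cosine similarity: 1.0 means same direction (same meaning),
# values near 0 mean unrelated.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

refund_q   = [0.9, 0.1, 0.0]  # toy embedding of "how do I get my money back"
refund_doc = [0.8, 0.2, 0.1]  # toy embedding of the refund policy article
ship_doc   = [0.1, 0.1, 0.9]  # toy embedding of the shipping times article

print(cosine(refund_q, refund_doc) > cosine(refund_q, ship_doc))  # True
```

The refund question lands closest to the refund article, so that is the chunk retrieval returns.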
RAG & Retrieval
Embedding Model
An embedding model is a neural network trained to turn text or other inputs into embeddings. Different embedding models trade off dimensionality, language coverage, domain quality, and cost, and the choice strongly affects retrieval relevance.
WHY IT MATTERS
Picking the right embedding model often improves RAG quality more than swapping the LLM.
SIMPLE EXAMPLE
An ecommerce team tests three embedding models to find the one that ranks product matches best.
Model Training
Evaluation Dataset
An evaluation dataset is a held-out set of labeled examples used to measure how well a model performs on a task. It is kept separate from training data so reported metrics reflect real generalization, not memorization.
WHY IT MATTERS
Without a clean evaluation set, teams cannot tell if a new prompt or model is actually better.
SIMPLE EXAMPLE
A team measures support classification accuracy on 1,000 unseen tickets after every prompt change.
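The measurement itself is straightforward once the held-out labels exist; this sketch uses made-up labels and predictions:

```python
# Accuracy on a held-out evaluation set: compare gold labels the model
# never trained on against its predictions.
gold = ["billing", "technical", "billing", "sales"]
pred = ["billing", "technical", "sales",   "sales"]

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(accuracy)  # 0.75 -> 3 of 4 correct
```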
AI Safety
Explainable AI
Explainable AI, or XAI, is the set of methods that make AI decisions understandable to humans. Techniques include feature attributions, surfaced sources, decision rules, and natural language rationales that show why a model produced a given output.
WHY IT MATTERS
Explainability is required for trust, debugging, and many regulated AI use cases.
SIMPLE EXAMPLE
A credit model shows which factors raised or lowered an applicant's risk score.
F · 4 TERMS
Prompt Engineering
Few-Shot Prompting
Few-shot prompting includes a small number of input-output examples in the prompt to show the model the desired pattern. The model uses those examples to infer format, tone, and reasoning style for a new input without any fine-tuning.
WHY IT MATTERS
Few-shot prompting is the cheapest way to lift quality on structured or stylistic tasks.
SIMPLE EXAMPLE
A prompt shows three examples of well-written meta descriptions before asking for a new one.
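A sketch of the prompt layout; the example pairs are invented, and real few-shot prompts often use three to five of them:

```python
# Few-shot prompt: example input-output pairs teach the model the format
# before the real input. No fine-tuning, just examples in the prompt.
EXAMPLES = [
    ("Running shoes for trail racing",
     "Lightweight trail racers with aggressive grip. Shop the collection."),
    ("Waterproof hiking jacket",
     "Stay dry on any summit. Explore waterproof jackets built for hikers."),
]

def few_shot_prompt(product: str) -> str:
    shots = "\n\n".join(f"Product: {p}\nMeta description: {m}" for p, m in EXAMPLES)
    return f"{shots}\n\nProduct: {product}\nMeta description:"

print(few_shot_prompt("Insulated winter boots"))
```

The prompt ends mid-pattern, so the model's most likely continuation is a meta description in the demonstrated style.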
Model Training
Fine-Tuning
Fine-tuning continues training a pretrained model on a smaller, task-specific dataset. The model adjusts its weights to specialize in a domain, format, or style while keeping the broad language ability it learned during pretraining.
WHY IT MATTERS
Fine-tuning is the right choice when prompting and retrieval cannot reach the quality bar.
SIMPLE EXAMPLE
A legal team fine-tunes a model on its contract templates to draft consistent NDAs.
LLM Basics
Foundation Model
A foundation model is a large model trained on broad data that can be adapted to many downstream tasks. It serves as a general base on top of which teams build features through prompting, retrieval, fine-tuning, or tool use.
WHY IT MATTERS
Choosing a foundation model is a strategic decision that shapes every AI feature on top of it.
SIMPLE EXAMPLE
A startup standardizes on one foundation model for chat, search, and content generation.
AI Agents
Function Calling
Function calling is a feature that lets a language model output structured arguments for a developer-defined function instead of plain text. The application then executes the function and feeds the result back, letting the model use real tools and data.
WHY IT MATTERS
Function calling is the bridge between language models and the rest of your software stack.
SIMPLE EXAMPLE
A travel assistant calls a search-flights function with parsed dates and cities from the user's message.
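Exact schema shapes differ by vendor, but the loop is the same everywhere: the model emits a function name plus JSON arguments, and the application executes it. A generic sketch with a hypothetical `search_flights` tool and a hard-coded model response:

```python
import json

# The app-side half of function calling: look up the named tool,
# execute it with the model's arguments, and keep the result to feed back.
TOOLS = {
    "search_flights": lambda origin, dest, date:
        f"3 flights {origin}->{dest} on {date}",
}

# Pretend the model returned this structured call for
# "Find me flights from NYC to Austin on March 3".
model_output = ('{"name": "search_flights", '
                '"arguments": {"origin": "NYC", "dest": "AUS", "date": "2025-03-03"}}')

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # this string goes back to the model as the tool result
```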
G · 4 TERMS
LLM Basics
Generative AI
Generative AI is a category of models that produce new content such as text, images, audio, video, code, or structured data. It contrasts with predictive AI, which classifies or scores existing inputs rather than creating new outputs.
WHY IT MATTERS
Generative AI is the technology layer behind most current AI products and copilots.
SIMPLE EXAMPLE
A marketing team uses generative AI to draft ad variants, then a human selects the strongest ones.
RAG & Retrieval
Grounding
Grounding is the practice of forcing a language model to base its answer on a specific source rather than its parametric memory. Sources can be retrieved documents, database rows, tool outputs, or live API data passed into the prompt.
WHY IT MATTERS
Grounding is the most reliable way to reduce hallucinations in production AI systems.
SIMPLE EXAMPLE
A support bot answers refund questions only from the official policy document, not training data.
AI Safety
Guardrails
Guardrails are rules and filters that constrain what a language model can accept as input or return as output. They cover topics, formats, PII, brand tone, and safety policies, and can be enforced by prompts, classifiers, or rule engines.
WHY IT MATTERS
Guardrails keep AI features on-policy and on-brand at scale without manual review of every output.
SIMPLE EXAMPLE
A chatbot blocks competitor mentions and rewrites any output that contains personal data.
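One small guardrail layer can be sketched as an output filter; production systems stack many such checks (classifiers, topic rules, prompt-level policies) rather than relying on a single regex:

```python
import re

# Output guardrail sketch: redact email addresses before a model
# response is shown to the user.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(text: str) -> str:
    return EMAIL.sub("[redacted email]", text)

print(redact("Contact dana@example.com for refunds."))
```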
Model Training
Gradient Descent
Gradient descent is an optimization algorithm used to train neural networks. It iteratively adjusts model weights in the direction that most reduces a loss function, gradually moving the model toward better predictions over many training steps.
WHY IT MATTERS
Gradient descent is the math engine that turns raw data and a model architecture into a useful model.
SIMPLE EXAMPLE
During training, weights are updated by gradient descent after each batch of examples.
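The mechanics fit in a few lines for a one-parameter "model": minimize the loss (w − 3)², whose gradient is 2(w − 3), by repeatedly stepping against the gradient:

```python
# Gradient descent on a single weight: each step moves w a little
# in the direction that reduces the loss (w - 3)^2.
w = 0.0
learning_rate = 0.1
for step in range(100):
    gradient = 2 * (w - 3)       # derivative of the loss at the current w
    w -= learning_rate * gradient

print(round(w, 4))  # converges toward the minimum at w = 3
```

Real training does exactly this, but over billions of weights at once, with the gradients supplied by backpropagation.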
H · 3 TERMS
AI Safety
Hallucination
A hallucination is an LLM output that sounds confident but is factually wrong, made up, or unsupported. Hallucinations include fake citations, invented product features, fabricated quotes, and incorrect numbers presented in fluent prose.
WHY IT MATTERS
Hallucinations are the top reason AI features fail user trust reviews and audits.
SIMPLE EXAMPLE
A research assistant invents a study and fake authors when asked for evidence.
AI Safety
Human-in-the-Loop AI
Human-in-the-loop AI is a workflow where a person reviews, edits, or approves model outputs before they reach a user or downstream system. The human acts as a quality gate and as a source of feedback the model can later learn from.
WHY IT MATTERS
Human-in-the-loop is the safest path to deploying AI in regulated or high-stakes workflows.
SIMPLE EXAMPLE
An insurer uses AI to draft claims decisions, but an adjuster signs off before payment.
RAG & Retrieval
Hybrid Search
Hybrid search combines keyword search and vector-based semantic search in one query. Keyword matches catch exact terms, codes, and names, while vector matches catch meaning and paraphrases, and the results are merged and re-ranked.
WHY IT MATTERS
Hybrid search consistently outperforms pure keyword or pure vector search in real RAG systems.
SIMPLE EXAMPLE
A docs site uses hybrid search so a query for error code E42 returns both exact matches and related guides.
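One common way to merge the two ranked lists is reciprocal rank fusion (RRF); a sketch with invented document names:

```python
# Reciprocal rank fusion: each ranking contributes 1/(k + rank) per
# document, so items ranked well by both lists rise to the top.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["error-e42-reference", "changelog"]              # exact-term matches
vector_hits = ["troubleshooting-guide", "error-e42-reference"]   # semantic matches

print(rrf([keyword_hits, vector_hits]))  # error-e42-reference wins: it appears in both
```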
I · 4 TERMS
LLM Basics
Inference
Inference is the process of using a trained model to generate predictions or completions from new inputs. It is the runtime side of AI, separate from training, and is what users pay for per request in most LLM APIs.
WHY IT MATTERS
Inference cost and latency define the unit economics of every AI feature.
SIMPLE EXAMPLE
A search box calls a model at inference time to rewrite the user's query before retrieval.
Model Training
Instruction Tuning
Instruction tuning is a fine-tuning step that trains a model on examples written as instructions paired with desired responses. It teaches the model to follow natural-language commands instead of only continuing text.
WHY IT MATTERS
Instruction tuning is what made base LLMs usable as helpful assistants.
SIMPLE EXAMPLE
A model is instruction-tuned on prompts like 'Summarize this email in two sentences' with ideal answers.
Automation
Intent Classification
Intent classification is a model task that maps a user message to a predefined intent label such as billing, support, or sales. It is widely used in chatbots and routing systems to decide which flow, agent, or tool should handle the request.
WHY IT MATTERS
Accurate intent classification is the difference between a useful router and a frustrating chatbot.
SIMPLE EXAMPLE
A help desk routes tickets to the right team based on the predicted intent of the first message.
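The routing half can be sketched with a trivial keyword matcher standing in for the model; real systems use an LLM or a trained classifier for the mapping, and the intents here are invented:

```python
# Intent routing sketch: map a message to one of a fixed set of intents,
# then downstream code picks the flow, team, or tool for that intent.
INTENT_KEYWORDS = {
    "billing": ["invoice", "charge", "refund", "payment"],
    "technical": ["error", "crash", "bug", "login"],
    "sales": ["pricing", "demo", "upgrade"],
}

def classify(message: str) -> str:
    text = message.lower()
    for intent, words in INTENT_KEYWORDS.items():
        if any(w in text for w in words):
            return intent
    return "general"   # fallback intent when nothing matches

print(classify("I was charged twice this month"))  # billing
```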
AI Safety
Interpretability
Interpretability is the field that studies how to look inside a model and understand why it produced a specific output. Methods range from feature attributions to mechanistic analysis of internal activations and circuits.
WHY IT MATTERS
Interpretability supports debugging, safety reviews, and compliance with explainability requirements.
SIMPLE EXAMPLE
Researchers identify the neurons most responsible for a chatbot's tendency to flatter users.
J · 2 TERMS
Prompt Engineering
JSON Mode
JSON mode is a model setting that forces outputs to be valid JSON, often matching a schema you provide. It removes string parsing errors and makes model responses safe to feed directly into downstream code, databases, or APIs.
WHY IT MATTERS
JSON mode is essential for reliable automations and tool use in production.
SIMPLE EXAMPLE
A lead enrichment workflow asks the model to return a JSON object with name, role, and company.
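Consuming a JSON-mode response then looks like this sketch; the response text is a stand-in for a real model output, and the key check guards against a field the schema did not force:

```python
import json

# Because JSON mode constrains the model to valid JSON, the response
# parses directly; we still verify the fields downstream code relies on.
REQUIRED = {"name", "role", "company"}

response_text = '{"name": "Dana Kim", "role": "VP Marketing", "company": "Acme"}'

lead = json.loads(response_text)        # no string scraping needed
missing = REQUIRED - lead.keys()
if missing:
    raise ValueError(f"model omitted fields: {missing}")
print(lead["company"])  # Acme
```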
AI Safety
Jailbreak Prompt
A jailbreak prompt is an input crafted to make a model bypass its safety rules, system instructions, or content policies. Techniques include role play, hidden instructions, encoded text, or multi-step social engineering.
WHY IT MATTERS
Jailbreaks are a real attack surface for any public AI feature and must be tested for.
SIMPLE EXAMPLE
A red team submits a prompt that pretends to be a developer test and tries to extract the system prompt.
K · 3 TERMS
RAG & Retrieval
Knowledge Base
A knowledge base is a structured collection of documents, articles, or data that an AI system can search and cite. It is the source of truth that grounds model answers and is usually indexed for keyword and vector search.
WHY IT MATTERS
A clean, current knowledge base is the single biggest lever for accurate AI answers.
SIMPLE EXAMPLE
A SaaS company connects its help center as the knowledge base for its customer-facing chatbot.
RAG & Retrieval
Knowledge Graph
A knowledge graph is a structured representation of entities and the relationships between them. It stores facts as nodes and edges, which lets AI systems answer questions that need precise, connected, or multi-hop information.
WHY IT MATTERS
Knowledge graphs improve precision for AI search where keyword and vector retrieval fall short.
SIMPLE EXAMPLE
A B2B platform uses a knowledge graph to answer questions like 'Which customers in EMEA use feature X?'
RAG & Retrieval
Knowledge Retrieval
Knowledge retrieval is the step that fetches relevant documents, snippets, or facts from a data source so a language model can use them in its answer. It usually combines indexing, query rewriting, ranking, and filtering.
WHY IT MATTERS
Retrieval quality sets the ceiling for any RAG or AI search experience.
SIMPLE EXAMPLE
Before answering, a sales copilot retrieves the latest pricing sheet and competitor battle cards.
L · 4 TERMS
LLM Basics
Language Model
A language model is a system that assigns probabilities to sequences of text and can generate the next likely tokens given a prompt. Modern language models are neural networks trained on large text corpora to read, write, and reason in natural language.
WHY IT MATTERS
Language models are the engines behind chat, search, summarization, and most AI copilots.
SIMPLE EXAMPLE
An autocomplete feature uses a language model to suggest the next sentence in an email.
LLM Basics
Large Language Model
A large language model, or LLM, is a language model with billions of parameters trained on massive text corpora. LLMs can follow instructions, summarize, translate, code, and reason across many domains without task-specific training.
WHY IT MATTERS
LLMs are the foundation models behind most generative AI products today.
SIMPLE EXAMPLE
A marketing team uses an LLM to draft, translate, and localize blog posts in one workflow.
Model Training
Latent Space
Latent space is the internal mathematical space where a model represents inputs as vectors of features. Similar inputs cluster together, and moving through latent space changes the meaning of the represented item in a structured way.
WHY IT MATTERS
Latent space is what makes embeddings, similarity search, and generative interpolation possible.
SIMPLE EXAMPLE
Two product descriptions about running shoes sit close together in the model's latent space.
Model Training
LoRA
LoRA, short for Low-Rank Adaptation, is a fine-tuning method that trains small adapter matrices on top of a frozen base model. It captures task-specific behavior with far fewer trainable parameters than full fine-tuning, making customization cheap and fast.
WHY IT MATTERS
LoRA lets teams ship customized models without paying full fine-tuning cost.
SIMPLE EXAMPLE
A studio trains a LoRA so a base image model produces characters in its signature art style.
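The arithmetic behind LoRA fits on tiny matrices: the frozen weight matrix W is adapted by adding a low-rank product B·A, so only B and A are trained. At this 2×2 scale the savings are invisible, but for a d×d layer the adapters hold 2·d·r values instead of d²:

```python
# LoRA math sketch: W stays frozen; the rank-1 adapters B and A are the
# only trained values, and their product is the weight update.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

W = [[1.0, 0.0],
     [0.0, 1.0]]          # frozen base weights
B = [[0.5], [0.0]]        # trained adapter (2x1)
A = [[0.0, 0.2]]          # trained adapter (1x2)

delta = matmul(B, A)      # full-size update built from the small adapters
W_adapted = [[W[i][j] + delta[i][j] for j in range(2)] for i in range(2)]
print(W_adapted)  # [[1.0, 0.1], [0.0, 1.0]]
```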
M · 6 TERMS
Model Training
Machine Learning
Machine learning is the broader field of building systems that learn patterns from data instead of following hand-written rules. It includes classical methods like regression and trees as well as modern deep learning and the LLMs built on top of it.
WHY IT MATTERS
Machine learning is the parent discipline that contains generative AI as one branch.
SIMPLE EXAMPLE
A churn model uses machine learning to predict which subscribers will cancel next month.
Model Training
Model Evaluation
Model evaluation is the structured process of measuring how well a model performs on defined tasks using metrics, test sets, and human review. It compares versions, catches regressions, and informs decisions about deployment.
WHY IT MATTERS
Without evaluation, AI teams ship changes on vibes and discover problems in production.
SIMPLE EXAMPLE
A team runs an evaluation suite of 200 prompts before promoting any new system prompt to production.
LLM Basics
Model Parameters
Model parameters are the numerical values inside a neural network that are learned during training. They determine how the model transforms inputs into outputs, and their count is often used as a rough proxy for model capacity.
WHY IT MATTERS
Parameter count influences quality, cost, and the hardware needed to run a model.
SIMPLE EXAMPLE
A 70-billion-parameter model usually answers complex questions better than a 7-billion-parameter one.
LLM Basics
Model Weights
Model weights are the trained parameters of a neural network, stored as large files that can be loaded for inference. Open weights can be downloaded and run locally, while closed weights stay behind a vendor API.
WHY IT MATTERS
Weight access decides whether a team can self-host, fine-tune freely, or only call an API.
SIMPLE EXAMPLE
A regulated firm chooses an open-weights model so it can run inference inside its own network.
LLM Basics
Multimodal AI
Multimodal AI describes models that handle more than one type of input or output, such as text, images, audio, and video, in a single system. The model can read a screenshot, listen to a clip, or describe an image alongside text.
WHY IT MATTERS
Multimodal models unlock workflows that mix documents, screenshots, and voice in one conversation.
SIMPLE EXAMPLE
A user uploads a chart image and asks a multimodal model to write the takeaway as a tweet.
Model Training
Mixture of Experts
Mixture of experts, or MoE, is an architecture where many specialist sub-networks exist inside one model and a router activates only a few per input. This delivers the quality of a very large model at roughly the per-request cost of a much smaller one.
WHY IT MATTERS
MoE is one of the main ways frontier labs are scaling capability without scaling cost linearly.
SIMPLE EXAMPLE
An MoE LLM routes coding questions to coding experts and legal questions to legal ones internally.
N · 3 TERMS
LLM Basics
Natural Language Processing
Natural language processing, or NLP, is the field that builds systems to understand, generate, and analyze human language. It includes tasks like classification, extraction, summarization, translation, and dialogue, and underpins most modern LLM applications.
WHY IT MATTERS
NLP is the discipline that turns raw text into structured insight and useful product features.
SIMPLE EXAMPLE
An NLP pipeline extracts company names and amounts from contract PDFs for a finance team.
Model Training
Neural Network
A neural network is a model made of layers of connected nodes that transform inputs into outputs through weighted operations. Training adjusts those weights so the network learns to map inputs to the right answers across many examples.
WHY IT MATTERS
Neural networks are the building block of every modern AI model.
SIMPLE EXAMPLE
An image classifier uses a neural network to distinguish defective and healthy parts on a line.
RAG & Retrieval
Neural Retrieval
Neural retrieval uses neural networks to encode queries and documents into embeddings and rank them by semantic similarity. It complements or replaces classical keyword retrieval and powers most modern AI search experiences.
WHY IT MATTERS
Neural retrieval is what makes search feel like it understands meaning, not just words.
SIMPLE EXAMPLE
A docs search returns relevant guides even when the user uses different wording than the title.
O3 TERMS
LLM Basics
Open-Source LLM
An open-source LLM is a large language model whose weights, and often code and training details, are released publicly. Teams can download, run, fine-tune, and deploy it under the model's license without depending on a single vendor.
WHY IT MATTERS
Open-source LLMs give teams control over cost, privacy, and customization.
SIMPLE EXAMPLE
A healthcare startup self-hosts an open-source LLM to keep patient data inside its own infrastructure.
AI Agents
Output Parser
An output parser is a component that converts a model's text response into a structured format such as JSON, a typed object, or a database row. It handles validation, retries, and schema enforcement so downstream code can rely on the result.
WHY IT MATTERS
Output parsers turn unreliable text outputs into safe inputs for the rest of your software.
SIMPLE EXAMPLE
A workflow parses an LLM's answer into a typed Lead object before saving it to the CRM.
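A minimal parser for the Lead example above might look like this. The `Lead` schema and field names are illustrative assumptions; real systems often use a validation library such as Pydantic, but the validate-or-raise pattern is the same, and raising lets the caller retry with a corrective prompt.

```python
import json
from dataclasses import dataclass

@dataclass
class Lead:
    name: str
    email: str
    score: int

def parse_lead(raw: str) -> Lead:
    """Validate the model's text output against the Lead schema.
    Raises ValueError so the caller can retry with a corrective prompt."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"not valid JSON: {e}")
    missing = {"name", "email", "score"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not isinstance(data["score"], int):
        raise ValueError("score must be an integer")
    return Lead(name=data["name"], email=data["email"], score=data["score"])

# A well-formed model response parses into a typed object:
lead = parse_lead('{"name": "Ada", "email": "ada@example.com", "score": 87}')
```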
Model Training
Overfitting
Overfitting is when a model learns the training data too closely, including its noise, and performs worse on new data. It usually appears as high training accuracy and low test accuracy and is fought with more data, regularization, or simpler models.
WHY IT MATTERS
Overfit models look great in development and disappoint in production.
SIMPLE EXAMPLE
A churn model that scores 99 percent on training data drops to 70 percent on next month's customers.
P8 TERMS
LLM Basics
Parameter
A parameter is a single learned numerical value inside a neural network. The collection of all parameters defines the model, and parameter count is often quoted as a rough indicator of model size and capacity.
WHY IT MATTERS
Parameters determine model size, cost, and the hardware required to run inference.
SIMPLE EXAMPLE
A 7B model has seven billion parameters and fits on consumer GPUs, unlike a 70B model.
Model Training
Pretraining
Pretraining is the first, most expensive stage of building a foundation model, where the model learns general language patterns from a huge corpus of text. It usually trains by predicting the next token across trillions of tokens of data.
WHY IT MATTERS
Pretraining decides what a model fundamentally knows before any fine-tuning or prompting.
SIMPLE EXAMPLE
A foundation model is pretrained on web text and code before being instruction-tuned for chat.
Prompt Engineering
Prompt
A prompt is the input you send to a language model to get a response. It can include instructions, context, examples, retrieved data, and a user question, and its structure strongly affects the quality, format, and safety of the output.
WHY IT MATTERS
Prompts are the primary user interface and product surface for most AI features today.
SIMPLE EXAMPLE
A meta-description prompt includes the page title, target keyword, and a length limit.
Prompt Engineering
Prompt Chain
A prompt chain is a sequence of prompts where each step uses the output of the previous one. Chains break complex tasks into smaller, easier subtasks and let teams insert tools, validation, or branching between steps.
WHY IT MATTERS
Prompt chains turn one fragile prompt into a reliable multi-step workflow.
SIMPLE EXAMPLE
A blog workflow chains four steps: outline, draft, fact-check, and meta description.
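That blog workflow can be sketched as ordinary function composition. `call_model` is a stub standing in for a real LLM API call; the point is the shape of the chain, where each step's output becomes the next step's input.

```python
def call_model(prompt: str) -> str:
    """Stand-in for a real LLM API call; here it just echoes a tag
    so the chain's data flow is visible."""
    return f"[model output for: {prompt[:30]}...]"

def blog_chain(topic: str) -> str:
    outline = call_model(f"Write a blog outline about {topic}.")
    draft = call_model(f"Expand this outline into a draft:\n{outline}")
    checked = call_model(f"Fact-check and correct this draft:\n{draft}")
    meta = call_model(f"Write a 155-char meta description for:\n{checked}")
    return meta

result = blog_chain("vector databases")
```

Between any two steps you can insert validation, a tool call, or a branch, which is what makes chains more reliable than one giant prompt.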
Prompt Engineering
Prompt Engineering
Prompt engineering is the practice of designing, testing, and refining the inputs sent to a language model to get reliable, high-quality outputs. It covers structure, examples, constraints, format, and the use of system prompts and retrieved context.
WHY IT MATTERS
Prompt engineering is often the cheapest and fastest way to lift AI feature quality.
SIMPLE EXAMPLE
A team improves answer accuracy by 20 percent just by rewriting the system prompt and adding examples.
AI Safety
Prompt Injection
Prompt injection is an attack where malicious instructions are placed in user input or external content so the model executes them instead of the developer's instructions. It can leak data, override safety rules, or trigger unintended tool calls.
WHY IT MATTERS
Prompt injection is one of the most serious security risks in any agentic or RAG system.
SIMPLE EXAMPLE
A web page contains hidden text telling a browsing agent to send the user's emails to an attacker.
Prompt Engineering
Prompt Library
A prompt library is a managed collection of reusable, version-controlled prompts shared across a team or product. It treats prompts like code, with naming, documentation, evaluation, and rollout controls.
WHY IT MATTERS
Prompt libraries prevent drift, duplicate work, and silent quality regressions.
SIMPLE EXAMPLE
Marketing, support, and sales all pull approved prompts from a single internal prompt library.
Prompt Engineering
Prompt Template
A prompt template is a reusable prompt with placeholders for variables such as user input, retrieved documents, or product fields. The template is filled in at runtime, ensuring every call follows the same proven structure.
WHY IT MATTERS
Templates make prompts maintainable, testable, and consistent across thousands of calls.
SIMPLE EXAMPLE
A meta-description template inserts page title and keyword into a fixed instruction every time.
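The meta-description template above could be implemented with Python's standard-library `string.Template`; the variable names (`title`, `keyword`, `max_chars`) are illustrative, and production systems often use a dedicated templating layer instead.

```python
from string import Template

META_TEMPLATE = Template(
    "Write a meta description for the page titled '$title'.\n"
    "Target keyword: $keyword\n"
    "Hard limit: $max_chars characters. Output only the description."
)

# Filled in at runtime so every call follows the same proven structure:
prompt = META_TEMPLATE.substitute(
    title="LLM Glossary", keyword="llm glossary", max_chars=155
)
```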
Q3 TERMS
AI Search & SEO
Query
A query is the question or input a user sends to a search system or AI assistant. In AI search, queries are often rewritten, expanded, or decomposed before retrieval and answer generation to improve the quality of results.
WHY IT MATTERS
Understanding real queries is the starting point for any AI search or content strategy.
SIMPLE EXAMPLE
A user query 'best CRM for small SaaS' is rewritten by the system into a clearer search query before retrieval.
RAG & Retrieval
Query Embedding
A query embedding is the vector representation of a user query, produced by an embedding model. It is compared against document embeddings to find the most semantically similar content for retrieval.
WHY IT MATTERS
Query embeddings let search match meaning, not just exact words.
SIMPLE EXAMPLE
A search for 'cancel my plan' retrieves the article 'How to end your subscription' via embeddings.
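The comparison step is usually cosine similarity between the query vector and each document vector. This toy sketch uses hand-made 3-dimensional vectors (real embedding models output hundreds or thousands of dimensions), but the ranking logic is the same.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings: the query "cancel my plan" lands near the
# subscription doc even though they share no keywords.
query_vec = [0.9, 0.1, 0.0]
docs = {
    "How to end your subscription": [0.8, 0.2, 0.1],
    "Setting up two-factor auth":   [0.0, 0.1, 0.9],
}
best = max(docs, key=lambda d: cosine(query_vec, docs[d]))
```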
Model Training
Quantization
Quantization reduces the precision of a model's weights, for example from 16-bit to 4-bit, so the model uses less memory and runs faster. Modern quantization techniques keep most of the original quality while shrinking the model significantly.
WHY IT MATTERS
Quantization lets larger models run on smaller, cheaper hardware in production.
SIMPLE EXAMPLE
A team quantizes a 13B model so it runs on a single GPU inside its product.
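A minimal sketch of symmetric per-tensor quantization shows where the memory saving and the small accuracy loss both come from: each float is rounded to a low-bit integer plus one shared scale factor. Real libraries add per-channel scales, calibration, and packed storage, but the core idea is this.

```python
def quantize(weights, bits=4):
    """Map floats to signed integers in [-(2**(bits-1)-1), 2**(bits-1)-1]
    with a single scale factor (symmetric per-tensor quantization)."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

weights = [0.12, -0.53, 0.70, -0.07]
q, scale = quantize(weights, bits=4)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now needs 4 bits instead of 16, and the rounding error is bounded by half the scale step.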
R5 TERMS
RAG & Retrieval
RAG
RAG, or retrieval-augmented generation, is a pattern where a language model retrieves relevant documents at query time and uses them to ground its answer. It combines a search system with an LLM so responses are based on current, owned data.
WHY IT MATTERS
RAG is the standard pattern for accurate, source-cited AI answers over private content.
SIMPLE EXAMPLE
A support bot uses RAG to answer from the latest help articles instead of stale training data.
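The whole RAG pattern fits in two functions: retrieve relevant text, then build a grounded prompt around it. This sketch scores documents by naive word overlap purely for illustration; a production retriever would use embeddings, but the pipeline shape is identical.

```python
def retrieve(query, docs, k=1):
    """Score docs by word overlap with the query; real systems
    use embeddings, but the shape of the pipeline is the same."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\n\nQuestion: {query}")

help_articles = [
    "Refunds are issued within 5 business days of approval.",
    "You can change your billing email in account settings.",
]
prompt = build_prompt("how long do refunds take", help_articles)
```

The LLM then answers from the retrieved context rather than from whatever its training data happened to say.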
AI Search & SEO
Ranking Model
A ranking model orders a set of candidate items, such as search results or recommendations, by predicted relevance to a user or query. In AI search, ranking models often re-score candidates returned by retrieval before sending them to the LLM.
WHY IT MATTERS
Strong ranking is what separates AI search that feels precise from AI search that feels noisy.
SIMPLE EXAMPLE
A retrieval step returns 50 docs, and a ranking model picks the top 5 to pass to the LLM.
Model Training
Reinforcement Learning
Reinforcement learning is a training method where a model learns by taking actions and receiving rewards or penalties. Over many trials, it adjusts its behavior to maximize cumulative reward in an environment.
WHY IT MATTERS
Reinforcement learning is a key tool for aligning LLMs and training agents that act in the world.
SIMPLE EXAMPLE
A robot learns to grasp objects through reinforcement learning by trying and being rewarded for success.
Model Training
RLHF
RLHF, or reinforcement learning from human feedback, fine-tunes a model using preferences collected from human raters. People compare model outputs, a reward model learns those preferences, and the LLM is then optimized to match them.
WHY IT MATTERS
RLHF is the main reason modern chat models feel helpful, polite, and on-policy.
SIMPLE EXAMPLE
Raters pick the better of two answers, and the model is trained to produce the preferred style.
RAG & Retrieval
Retriever
A retriever is the component in a RAG system that fetches candidate documents or chunks for a given query. It can use keyword search, vector search, or both, and its output is what the LLM reads before answering.
WHY IT MATTERS
Retriever quality is the single biggest determinant of RAG accuracy.
SIMPLE EXAMPLE
A retriever pulls the top five matching policy paragraphs before the LLM writes an answer.
S5 TERMS
AI Search & SEO
Semantic Search
Semantic search uses embeddings to match queries and documents by meaning rather than exact keywords. It returns relevant results even when wording differs and is the foundation of modern AI search and RAG.
WHY IT MATTERS
Semantic search is what makes AI search feel like it actually understands the question.
SIMPLE EXAMPLE
A search for 'reset password' returns 'recover account access' even with no shared keywords.
Model Training
Self-Attention
Self-attention is the mechanism in a transformer that lets each token attend to every other token in the input. It produces context-aware representations and is what allows LLMs to handle long, complex passages coherently.
WHY IT MATTERS
Self-attention is the architectural innovation behind modern LLMs.
SIMPLE EXAMPLE
When summarizing a long email, self-attention links the closing question back to the opening request.
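Scaled dot-product attention can be written in a few lines of plain Python. This toy version collapses queries, keys, and values into the same vectors to keep it short; real transformers use separate learned projections for each, plus multiple heads.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def self_attention(tokens):
    """Each token's new representation is a softmax-weighted mix of
    all tokens, weighted by dot-product similarity (queries = keys =
    values here; real transformers use learned projections)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        w = softmax(scores)
        out.append([sum(w[j] * tokens[j][i] for j in range(len(tokens)))
                    for i in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens)
```

Every output vector is a blend of all input vectors, which is how a token "sees" the rest of the passage.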
LLM Basics
Small Language Model
A small language model, or SLM, is a language model with far fewer parameters than frontier LLMs, optimized to run cheaply on smaller hardware. SLMs trade some general capability for speed, cost, and the ability to run on-device.
WHY IT MATTERS
SLMs are the right choice for high-volume, low-latency, or on-device AI features.
SIMPLE EXAMPLE
A mobile app runs an SLM on the phone for offline writing suggestions.
Prompt Engineering
System Prompt
A system prompt is a hidden instruction the developer passes to the model that defines its role, tone, allowed topics, and output format. It applies to every user message in a session and is the primary way to control assistant behavior.
WHY IT MATTERS
The system prompt is the most leveraged piece of text in any AI product.
SIMPLE EXAMPLE
A retail bot's system prompt restricts answers to product, shipping, and returns topics.
Model Training
Synthetic Data
Synthetic data is artificial data generated by a model or simulation, used to train, fine-tune, or evaluate other models. It can fill gaps where real data is scarce, sensitive, or expensive to label.
WHY IT MATTERS
Synthetic data lets teams train models when real-world data is limited or restricted.
SIMPLE EXAMPLE
A team generates synthetic support tickets to balance rare intent classes in its training set.
T5 TERMS
Prompt Engineering
Temperature
Temperature is a setting that controls randomness in a model's output. Low values make outputs more deterministic and focused, while high values make them more diverse and creative, at the cost of some consistency and accuracy.
WHY IT MATTERS
Temperature is a fast lever to tune AI features for either reliability or creativity.
SIMPLE EXAMPLE
Email drafts use temperature 0.7 for variety, while data extraction uses 0 for stability.
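Mechanically, temperature divides the model's logits before the softmax that turns them into token probabilities. The toy logits below are made up, but the effect is real: a low temperature concentrates nearly all probability on the top token, a high one spreads it out.

```python
import math

def sample_probs(logits, temperature):
    """Divide logits by temperature before softmax: low T sharpens
    the distribution, high T flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]          # model scores for three candidate tokens
cold = sample_probs(logits, 0.2)  # near-deterministic
hot = sample_probs(logits, 2.0)   # much more uniform
```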
LLM Basics
Token
A token is the basic unit of text a language model reads and generates, usually a short string of characters or part of a word. Pricing, context limits, and output length are all measured in tokens, not characters.
WHY IT MATTERS
Tokens are the unit of cost, latency, and context in every LLM-based feature.
SIMPLE EXAMPLE
A 1,000-word article is roughly 1,300 tokens for most English text.
Model Training
Tokenization
Tokenization is the process of splitting text into tokens that a model can process. The tokenizer is paired with the model and decides how words, punctuation, code, and non-English text are broken into pieces.
WHY IT MATTERS
Tokenization choices affect cost, context length, and how well models handle different languages.
SIMPLE EXAMPLE
A long German compound word may consume more tokens than its English equivalent.
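A deliberately naive tokenizer makes the splitting step concrete. Real tokenizers (BPE, SentencePiece) learn subword merges from data instead of using a fixed rule, which is why unfamiliar compounds fragment into more pieces.

```python
import re

def naive_tokenize(text):
    """Split on word runs and punctuation. Production tokenizers
    (BPE, SentencePiece) learn subword merges from data instead."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = naive_tokenize("Donaudampfschiff docks at 9am!")
```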
AI Agents
Tool Calling
Tool calling lets a language model invoke external functions, APIs, or plugins to do things it cannot do alone, such as fetch data, run calculations, or update systems. The model returns structured arguments and then uses the results in its next response.
WHY IT MATTERS
Tool calling turns LLMs from text generators into agents that can act in real systems.
SIMPLE EXAMPLE
A research agent uses tool calling to run a web search and then summarize the results.
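The control flow is simple: if the model's turn is a structured call, execute the named tool with its arguments and feed the result back; otherwise the turn is the final answer. The JSON message shape and the `web_search` stub below are illustrative assumptions, not any particular vendor's API.

```python
import json

def web_search(query: str) -> str:
    """Stub tool; a real implementation would call a search API."""
    return f"Top results for '{query}': ..."

TOOLS = {"web_search": web_search}

def handle_model_turn(model_message: str) -> str:
    """If the model emits a structured tool call, execute it and
    return the result for the next turn; otherwise it's the answer."""
    try:
        call = json.loads(model_message)
    except json.JSONDecodeError:
        return model_message  # plain-text final answer
    fn = TOOLS[call["tool"]]
    return fn(**call["arguments"])

# The model "decides" to search by emitting structured arguments:
result = handle_model_turn(
    '{"tool": "web_search", "arguments": {"query": "LLM news"}}'
)
```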
Model Training
Transformer Architecture
The transformer architecture is the neural network design behind modern LLMs. It uses self-attention to process all tokens in parallel and capture long-range relationships, replacing earlier recurrent designs and enabling much larger, more capable models.
WHY IT MATTERS
The transformer is the architectural reason today's AI capabilities exist.
SIMPLE EXAMPLE
Most major LLMs, from GPT to open-source models, are variations of the transformer architecture.
U2 TERMS
RAG & Retrieval
Unstructured Data
Unstructured data is information that does not fit neatly into rows and columns, such as documents, emails, chat transcripts, images, audio, and video. LLMs and embeddings make this data searchable and useful without forcing it into a schema.
WHY IT MATTERS
Most enterprise knowledge is unstructured, and AI is what finally makes it usable.
SIMPLE EXAMPLE
A team uses an LLM to extract action items from thousands of unstructured meeting transcripts.
Prompt Engineering
User Prompt
A user prompt is the message sent by the end user in a conversation with an AI assistant. It sits alongside the system prompt and any retrieved context and is the part the developer has the least direct control over.
WHY IT MATTERS
Designing for messy real-world user prompts is what separates demos from production AI.
SIMPLE EXAMPLE
A user types 'fix this' with no context, and the assistant must ask a clarifying question.
V4 TERMS
RAG & Retrieval
Vector Database
A vector database is a system designed to store embeddings and run fast similarity search over them. It indexes high-dimensional vectors and supports filters, hybrid search, and metadata so retrieval at scale stays fast and accurate.
WHY IT MATTERS
A vector database is the storage layer that makes production RAG and AI search practical.
SIMPLE EXAMPLE
An ecommerce site stores product embeddings in a vector database for instant semantic search.
RAG & Retrieval
Vector Embedding
A vector embedding is a numerical representation of a piece of content in a high-dimensional space. Items with similar meaning have nearby vectors, which lets systems search, cluster, and recommend by semantic similarity.
WHY IT MATTERS
Vector embeddings are the data structure behind every modern semantic search system.
SIMPLE EXAMPLE
A blog post is converted into a 1,536-dimension vector embedding for search.
RAG & Retrieval
Vector Index
A vector index is a data structure inside a vector database that enables fast nearest-neighbor search across millions of embeddings. Common index types like HNSW or IVF trade off recall, latency, and memory for different workloads.
WHY IT MATTERS
Index choice and tuning directly control AI search latency and cost at scale.
SIMPLE EXAMPLE
Switching to an HNSW index lets a docs search return results in 30 ms instead of 300 ms.
AI Search & SEO
Vector Search
Vector search ranks documents by the distance between their embeddings and a query embedding. It returns results based on meaning rather than keywords and is the core retrieval method behind RAG and AI-powered search experiences.
WHY IT MATTERS
Vector search is the technical foundation of AI search, citations, and grounded answers.
SIMPLE EXAMPLE
A help center uses vector search so 'forgot login' returns the password reset article.
W3 TERMS
LLM Basics
Weights
Weights are the trained numerical values inside a neural network that determine how it transforms inputs into outputs. Together they define the model and are what gets shipped, loaded, fine-tuned, and quantized.
WHY IT MATTERS
Access to weights determines whether a team can self-host, customize, or only use an API.
SIMPLE EXAMPLE
An open-weights model can be downloaded and fine-tuned in-house on private data.
Model Training
Weak Supervision
Weak supervision is a training approach where labels are generated automatically from rules, heuristics, or other models instead of full human annotation. The labels are noisier but cheap and abundant, and models can still learn useful patterns from them.
WHY IT MATTERS
Weak supervision unlocks training when high-quality labeled data is too expensive to collect.
SIMPLE EXAMPLE
A team labels millions of tickets as urgent based on keyword rules to bootstrap a triage model.
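The keyword-rule approach is just a labeling function applied in bulk. The marker list below is made up for illustration; the labels it produces are noisy, but a model trained on enough of them can generalize beyond the literal keywords.

```python
def weak_label(ticket: str) -> str:
    """Rule-based labeling function: noisy but cheap. A model trained
    on these labels can generalize beyond the keywords themselves."""
    urgent_markers = ("outage", "down", "urgent", "asap", "cannot log in")
    text = ticket.lower()
    return "urgent" if any(m in text for m in urgent_markers) else "normal"

labels = [weak_label(t) for t in [
    "Production site is DOWN since 3am",
    "Feature request: dark mode",
]]
```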
Automation
Workflow Automation
Workflow automation uses software, often combined with AI, to execute a sequence of steps across systems with little or no human input. AI-powered workflows add reasoning, content generation, and decision-making on top of traditional triggers and actions.
WHY IT MATTERS
AI-powered automation is the main path to real productivity gains from generative AI.
SIMPLE EXAMPLE
A workflow auto-drafts a proposal, attaches pricing, and notifies the sales rep when a lead qualifies.
X1 TERM
AI Safety
XAI
XAI, short for explainable AI, refers to methods and tools that make the reasoning of AI systems understandable to humans. Approaches include feature attributions, rationales, surfaced sources, and visualizations that show why a model produced a given output.
WHY IT MATTERS
XAI supports trust, debugging, and compliance for high-stakes AI systems.
SIMPLE EXAMPLE
An AI underwriting tool shows applicants which factors most influenced its decision.
Y1 TERM
Prompt Engineering
YAML Prompting
YAML prompting is a style of prompt design that uses YAML structure to organize roles, instructions, examples, constraints, and expected output formats. The structured layout makes complex prompts easier to read, version, and maintain.
WHY IT MATTERS
YAML-style prompts scale better than long paragraphs as prompts grow more complex.
SIMPLE EXAMPLE
A content prompt uses YAML keys for role, audience, tone, structure, and forbidden phrases.
Z2 TERMS
Model Training
Zero-Shot Learning
Zero-shot learning is a model's ability to perform a task it was not explicitly trained on, using only a description of the task. The model leverages general knowledge from pretraining to generalize to new categories or instructions.
WHY IT MATTERS
Zero-shot capability is what makes modern LLMs useful out of the box across many tasks.
SIMPLE EXAMPLE
An LLM classifies support tickets into new categories described only in plain English.
Prompt Engineering
Zero-Shot Prompting
Zero-shot prompting asks a language model to do a task with only an instruction and no examples. The model relies on its pretraining and instruction tuning to interpret the request and produce a usable answer.
WHY IT MATTERS
Zero-shot prompting is the fastest way to test whether an LLM can handle a task at all.
SIMPLE EXAMPLE
A team asks a model to 'classify this email as billing, support, or sales' with no examples first.
AI Search
Want your Brand to be Found?
Get a tailored plan to make your brand findable, citable, and chosen across ChatGPT, Gemini, Perplexity, and Google AI Overviews.
Book Strategy Call →