LLM & Generative AI Glossary
109 Key Terms Explained (A–Z)
A practical reference covering AI, large language models, prompt engineering, retrieval, automation, and AI search — written for marketers, founders, and product teams. Each term has a plain-language definition, why it matters, and a real-world example you can use today.
Built for marketers, SEO teams, founders, and AI practitioners
109 terms
A · 11 TERMS
AI Agents
Agentic AI
Agentic AI describes systems where a language model plans steps, calls tools, and acts toward a goal with limited human input. The model decides what to do next, executes actions, observes results, and adjusts its plan until the task is finished.
WHY IT MATTERS
Agentic AI turns chatbots into workers that can complete multi-step business tasks end to end.
SIMPLE EXAMPLE
A marketing agent researches competitors, drafts a brief, and schedules a blog post without manual handoffs.
AI Agents
AI Agent
An AI agent is software that uses a large language model as its reasoning core and connects to tools, memory, and APIs to complete tasks. It interprets a goal, chooses actions, runs them, and returns results to a user or another system.
WHY IT MATTERS
AI agents replace fragmented automations with one system that can plan and act across tools.
SIMPLE EXAMPLE
An SEO agent audits a site, finds broken pages, and opens fix tickets in Jira automatically.
AI Safety
AI Alignment
AI alignment is the practice of training and constraining AI systems so their behavior matches human goals, values, and safety expectations. It combines data choices, fine-tuning, reinforcement learning from human feedback, and policy rules that shape model outputs.
WHY IT MATTERS
Alignment determines whether an AI helps users safely or produces harmful, biased, or off-brand answers.
SIMPLE EXAMPLE
A support copilot is aligned to refuse legal advice and always escalate refund disputes to a human.
LLM Basics
AI Copilot
An AI copilot is an assistant embedded inside a product or workflow that helps a user complete tasks faster. It uses a language model plus context from the host app to suggest text, code, queries, or actions the user can accept, edit, or reject.
WHY IT MATTERS
Copilots increase output per employee without forcing teams to switch tools or learn new interfaces.
SIMPLE EXAMPLE
A sales copilot drafts follow-up emails inside the CRM using the latest call notes.
AI Safety
AI Governance
AI governance is the set of policies, roles, and controls that decide how an organization builds, deploys, and monitors AI systems. It covers data use, model selection, risk reviews, human oversight, auditing, and compliance with internal and external regulations.
WHY IT MATTERS
Governance lets teams scale AI usage without exposing the business to legal, brand, or security risk.
SIMPLE EXAMPLE
A bank requires every customer-facing prompt and dataset to be reviewed before production use.
AI Safety
AI Hallucination
An AI hallucination is a confident output from a language model that is factually wrong, fabricated, or unsupported by its sources. It happens because models predict likely text patterns, not verified facts, and may invent names, numbers, citations, or events.
WHY IT MATTERS
Hallucinations damage trust and can cause real legal, financial, or brand harm in customer-facing tools.
SIMPLE EXAMPLE
A chatbot invents a refund policy that does not exist in the company knowledge base.
LLM Basics
AI Model
An AI model is a trained mathematical system that maps inputs to outputs after learning patterns from data. In generative AI, the model is usually a neural network that has learned to produce text, images, audio, code, or structured data from a prompt.
WHY IT MATTERS
The model you choose sets the ceiling for quality, cost, latency, and safety in any AI feature.
SIMPLE EXAMPLE
A team picks a smaller model for autocomplete and a larger model for long-form report generation.
Automation
AI Orchestration
AI orchestration is the layer that coordinates models, prompts, tools, data sources, and steps inside a single workflow. It manages routing, retries, memory, and handoffs so a complex task runs reliably across multiple components instead of one isolated prompt.
WHY IT MATTERS
Orchestration is what turns demos into production AI systems that scale and stay debuggable.
SIMPLE EXAMPLE
An onboarding workflow pulls user data, calls a model, sends an email, and logs the result.
AI Safety
AI Safety
AI safety is the field focused on preventing AI systems from causing harm to people, organizations, or society. It includes technical work on alignment, robustness, and evaluation as well as operational practices like red teaming, monitoring, and incident response.
WHY IT MATTERS
Safety practices decide whether an AI feature is shipped, gated, or pulled after launch.
SIMPLE EXAMPLE
A team red-teams a new chatbot to find prompts that leak personal data before release.
LLM Basics
Artificial General Intelligence
Artificial general intelligence, or AGI, refers to a hypothetical AI system that can perform any intellectual task a human can, across domains, without retraining. Today's models are narrow and capable in specific tasks but do not yet meet a widely accepted definition of AGI.
WHY IT MATTERS
AGI shapes long-term strategy, regulation, and investment even though current systems are far from it.
SIMPLE EXAMPLE
Executives plan AI roadmaps assuming current narrow models, not speculative AGI capabilities.
Model Training
Attention Mechanism
The attention mechanism is the part of a transformer that lets the model weigh which input tokens matter most when predicting the next token. It assigns scores between tokens so context, not just position, drives the output.
WHY IT MATTERS
Attention is the core innovation that made modern LLMs possible and powerful at long-context tasks.
SIMPLE EXAMPLE
When summarizing a contract, attention helps the model focus on clauses about payment terms.
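The score-and-weigh step can be sketched with toy two-number vectors; real models use learned, high-dimensional query and key vectors, and the token labels here are only illustrative:

```python
import math

# Toy attention scoring: dot the query with each token's key vector,
# then softmax the scores into weights that sum to 1.
def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

query = [1.0, 0.0]                                 # what the model is "looking for"
keys = {"payment": [0.9, 0.1], "weather": [0.0, 1.0]}

scores = [sum(q * k for q, k in zip(query, kv)) for kv in keys.values()]
weights = softmax(scores)
print(dict(zip(keys, [round(w, 2) for w in weights])))  # payment gets most weight
```

The "payment" token scores higher against this query, so it receives most of the attention weight.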
B · 5 TERMS
Model Training
Backpropagation
Backpropagation is the algorithm used to train neural networks by sending the prediction error backward through the network and updating the weights. It calculates how much each weight contributed to the error and adjusts it to reduce future errors.
WHY IT MATTERS
Backpropagation is what makes large models learnable in practice on huge datasets.
SIMPLE EXAMPLE
During training, a model improves its next-word predictions after each batch using backpropagation.
LLM Basics
Batch Inference
Batch inference runs many model predictions at once instead of one request at a time. It groups inputs, sends them through the model in parallel, and returns results together, which lowers cost per request and increases throughput for non-interactive workloads.
WHY IT MATTERS
Batch inference is how teams keep AI features affordable at scale.
SIMPLE EXAMPLE
An ecommerce team scores ten thousand product descriptions for SEO quality overnight in one batch.
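The grouping itself is simple; this sketch shows only the batching step, with the model call omitted (`make_batches` is an illustrative helper, not a library function):

```python
# Batch the inputs so a model endpoint scores many items per call
# instead of issuing one request per item.
def make_batches(items, batch_size):
    """Split a list of inputs into fixed-size batches."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

descriptions = [f"Product {n} description" for n in range(10)]
batches = make_batches(descriptions, batch_size=4)
print([len(b) for b in batches])  # [4, 4, 2] -> 3 calls instead of 10
```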
Model Training
Benchmark Evaluation
Benchmark evaluation tests a model on standardized tasks and datasets to compare its performance against other models. Common benchmarks measure reasoning, coding, math, knowledge, safety, and instruction following with public scores teams use to choose models.
WHY IT MATTERS
Benchmarks help teams pick models on evidence rather than marketing claims.
SIMPLE EXAMPLE
A team compares two LLMs on a coding benchmark before choosing one for a developer copilot.
AI Safety
Bias in AI
Bias in AI is a systematic error in model outputs that unfairly favors or disadvantages certain groups, topics, or viewpoints. It usually comes from skewed training data, labeling choices, or alignment decisions and can appear in language, ranking, or recommendations.
WHY IT MATTERS
Bias creates legal, ethical, and brand risk and degrades the quality of AI-driven decisions.
SIMPLE EXAMPLE
A hiring screener trained on past resumes downgrades candidates from underrepresented schools.
LLM Basics
Big Language Model
A big language model is a colloquial term for a large language model with billions of parameters trained on huge text corpora. The terms are used interchangeably to describe general-purpose models that can read, write, summarize, and reason over text.
WHY IT MATTERS
Naming aside, model size strongly influences quality, cost, and latency tradeoffs.
SIMPLE EXAMPLE
A founder asks whether to use a big language model or a small fine-tuned model for support.
C · 8 TERMS
Prompt Engineering
Chain-of-Thought Prompting
Chain-of-thought prompting asks a model to show its reasoning step by step before giving a final answer. It improves accuracy on math, logic, and multi-step tasks because the model commits to intermediate conclusions instead of guessing the result in one jump.
WHY IT MATTERS
Chain-of-thought prompts often raise quality without changing the model or training data.
SIMPLE EXAMPLE
A pricing prompt instructs the model to list assumptions first, then compute the final quote.
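A minimal sketch of such a prompt; the wording is illustrative, not a required template, and the model call itself is omitted:

```python
# Chain-of-thought prompt builder: the instruction asks for intermediate
# reasoning before the final answer.
def cot_prompt(question: str) -> str:
    return (
        "Answer the question below.\n"
        "First, list your assumptions and reasoning step by step.\n"
        "Then give the final answer on a line starting with 'Answer:'.\n\n"
        f"Question: {question}"
    )

print(cot_prompt("A plan costs $40/month with a 15% annual discount. "
                 "What is the yearly price?"))
```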
RAG & Retrieval
Chunking
Chunking is the process of splitting long documents into smaller pieces so they can be embedded and retrieved by a language model. Good chunking preserves meaning by respecting headings, sentences, or semantic boundaries instead of cutting text by raw character count.
WHY IT MATTERS
Chunk quality directly controls how well retrieval-augmented generation answers a question.
SIMPLE EXAMPLE
A help center splits articles by H2 sections so the model retrieves the exact relevant paragraph.
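A minimal sketch of heading-aware chunking, assuming Markdown articles with H2 sections (production pipelines usually add overlap and size limits on top of this):

```python
import re

# Split a Markdown article at H2 headings so each chunk keeps one
# self-contained section, rather than cutting at a raw character count.
def chunk_by_h2(markdown: str) -> list[str]:
    parts = re.split(r"(?m)^(?=## )", markdown)   # zero-width split before each "## "
    return [p.strip() for p in parts if p.strip()]

article = """## Refunds
Refunds are issued within 14 days.

## Shipping
Orders ship in 2-3 business days.
"""
chunks = chunk_by_h2(article)
for chunk in chunks:
    print(chunk.splitlines()[0])  # "## Refunds", then "## Shipping"
```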
LLM Basics
Completion
A completion is the text a language model generates in response to a prompt. The model continues the input by predicting the most likely next tokens until it reaches a stop condition, a length limit, or an end-of-message signal.
WHY IT MATTERS
Completion is the fundamental output unit billed and measured in most LLM APIs.
SIMPLE EXAMPLE
Sending a product brief returns a completion containing a draft landing-page headline.
Prompt Engineering
Context Engineering
Context engineering is the discipline of designing what information goes into a model's context window for each request. It includes choosing system prompts, examples, retrieved documents, tool outputs, and user history so the model has exactly what it needs and nothing else.
WHY IT MATTERS
Strong context engineering often beats prompt tweaks for accuracy, cost, and latency.
SIMPLE EXAMPLE
A support agent injects only the customer's plan and last three tickets, not the full account history.
LLM Basics
Context Window
The context window is the maximum amount of text, measured in tokens, that a model can consider at once. It includes the system prompt, user input, retrieved documents, prior turns, and the response. Anything outside the window is invisible to the model.
WHY IT MATTERS
The context window limits how much knowledge you can pass to a model in one call.
SIMPLE EXAMPLE
A long sales call transcript must be summarized in chunks because it exceeds the model's context window.
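A sketch of budget-aware truncation. Real systems count tokens with the model's own tokenizer; whitespace-separated words are only a crude stand-in here, and the budget number is arbitrary:

```python
# Drop the oldest conversation turns until the whole request fits
# inside a fixed "token" budget.
def fit_to_window(system: str, history: list[str], user: str, budget: int) -> list[str]:
    def words(text):
        return len(text.split())   # crude stand-in for a real tokenizer
    fixed = words(system) + words(user)   # system prompt and user input always stay
    kept = list(history)
    while kept and fixed + sum(words(t) for t in kept) > budget:
        kept.pop(0)                       # oldest turn is the first to go
    return kept

history = ["turn one is here", "turn two is here", "turn three is here"]
print(fit_to_window("be brief", history, "final question", budget=12))
```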
LLM Basics
Conversational AI
Conversational AI is software that holds multi-turn dialogues with users in natural language. It typically combines a language model, memory of recent turns, intent handling, and access to tools or knowledge so the conversation stays useful across many messages.
WHY IT MATTERS
Conversational AI is the main interface layer for support, sales, and internal copilots.
SIMPLE EXAMPLE
A bank chatbot remembers your earlier balance question when you next ask about transfer fees.
Model Training
Corpus
A corpus is a structured collection of text used to train, fine-tune, or evaluate a language model. Corpora may include books, web pages, code, conversations, or domain documents and are usually cleaned, deduplicated, and filtered before training begins.
WHY IT MATTERS
The corpus shapes what a model knows, how it writes, and where it has blind spots.
SIMPLE EXAMPLE
A legal LLM is fine-tuned on a corpus of contracts and case law instead of general web text.
LLM Basics
Custom GPT
A custom GPT is a configured version of a general LLM with a specific system prompt, instructions, knowledge files, and optional tools. It behaves like a specialized assistant for a single use case without retraining the underlying model.
WHY IT MATTERS
Custom GPTs let non-engineers ship branded AI assistants in hours, not weeks.
SIMPLE EXAMPLE
A marketing team builds a custom GPT that writes only in their tone-of-voice guidelines.
D · 6 TERMS
Model Training
Data Annotation
Data annotation is the process of labeling raw data so models can learn from it. Humans or tools tag text, images, or audio with categories, spans, ratings, or relationships, producing the ground truth used in training and evaluation.
WHY IT MATTERS
High-quality annotation often matters more than model choice for task performance.
SIMPLE EXAMPLE
Reviewers label support tickets as billing, technical, or sales to train an intent classifier.
Model Training
Data Augmentation
Data augmentation expands a training dataset by creating modified versions of existing examples. Techniques include paraphrasing text, translating and back-translating, or generating synthetic variations so the model sees more diversity without collecting new raw data.
WHY IT MATTERS
Augmentation reduces overfitting and improves performance when labeled data is scarce.
SIMPLE EXAMPLE
A team paraphrases customer questions to give an intent classifier more training variety.
Model Training
Dataset
A dataset is a curated collection of examples used to train, fine-tune, or evaluate a model. It typically includes inputs, labels, and metadata, organized into training, validation, and test splits so model performance can be measured fairly.
WHY IT MATTERS
Dataset quality, size, and split design control whether evaluation results are trustworthy.
SIMPLE EXAMPLE
A team holds out 10 percent of labeled support tickets as a test set for the new classifier.
Model Training
Deep Learning
Deep learning is a branch of machine learning that uses multi-layer neural networks to learn patterns directly from raw data. The depth of the network lets the system learn hierarchical features, which is why it powers modern vision, speech, and language models.
WHY IT MATTERS
Deep learning is the foundation under every current generative AI system.
SIMPLE EXAMPLE
A computer vision model uses deep learning to detect product defects on a factory line.
LLM Basics
Diffusion Model
A diffusion model is a generative model that learns to reverse a noise process. It starts from random noise and gradually denoises it into a coherent output, which is how most modern image, video, and audio generators work.
WHY IT MATTERS
Diffusion models drive the visual side of generative AI, alongside LLMs for text.
SIMPLE EXAMPLE
A marketing team generates campaign hero images from a brief using a diffusion model.
Model Training
Distillation
Distillation is a training technique where a smaller student model learns to imitate a larger teacher model. The student keeps most of the teacher's quality while running faster and cheaper, making it suitable for production and on-device use.
WHY IT MATTERS
Distillation is one of the main ways to ship LLM features at low latency and cost.
SIMPLE EXAMPLE
A company distills a large model into a small one that runs autocomplete inside its mobile app.
E · 4 TERMS
RAG & Retrieval
Embedding
An embedding is a vector of numbers that represents the meaning of a piece of text, image, or other data. Items with similar meaning sit close together in vector space, which lets systems compare, search, and cluster content by semantic similarity.
WHY IT MATTERS
Embeddings power semantic search, RAG, recommendations, and most modern AI search features.
SIMPLE EXAMPLE
A help center embeds every article so user questions retrieve the closest matching answers.
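The "close together in vector space" comparison is usually cosine similarity. A sketch with toy three-number vectors (real embeddings have hundreds or thousands of dimensions, but the math is the same):

```python
import math

# Cosine similarity: 1.0 means same direction (same meaning),
# values near 0 mean unrelated.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

refund_q   = [0.9, 0.1, 0.0]  # toy embedding of "how do I get my money back"
refund_doc = [0.8, 0.2, 0.1]  # toy embedding of the refund policy article
ship_doc   = [0.1, 0.1, 0.9]  # toy embedding of the shipping times article

print(cosine(refund_q, refund_doc) > cosine(refund_q, ship_doc))  # True
```

The refund question lands closest to the refund article, so that is the chunk retrieval returns.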
RAG & Retrieval
Embedding Model
An embedding model is a neural network trained to turn text or other inputs into embeddings. Different embedding models trade off dimensionality, language coverage, domain quality, and cost, and the choice strongly affects retrieval relevance.
WHY IT MATTERS
Picking the right embedding model often improves RAG quality more than swapping the LLM.
SIMPLE EXAMPLE
An ecommerce team tests three embedding models to find the one that ranks product matches best.
Model Training
Evaluation Dataset
An evaluation dataset is a held-out set of labeled examples used to measure how well a model performs on a task. It is kept separate from training data so reported metrics reflect real generalization, not memorization.
WHY IT MATTERS
Without a clean evaluation set, teams cannot tell if a new prompt or model is actually better.
SIMPLE EXAMPLE
A team measures support classification accuracy on 1,000 unseen tickets after every prompt change.
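The measurement itself is straightforward once the held-out labels exist; this sketch uses made-up labels and predictions:

```python
# Accuracy on a held-out evaluation set: compare gold labels the model
# never trained on against its predictions.
gold = ["billing", "technical", "billing", "sales"]
pred = ["billing", "technical", "sales",   "sales"]

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(accuracy)  # 0.75 -> 3 of 4 correct
```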
AI Safety
Explainable AI
Explainable AI, or XAI, is the set of methods that make AI decisions understandable to humans. Techniques include feature attributions, surfaced sources, decision rules, and natural language rationales that show why a model produced a given output.
WHY IT MATTERS
Explainability is required for trust, debugging, and many regulated AI use cases.
SIMPLE EXAMPLE
A credit model shows which factors raised or lowered an applicant's risk score.
F · 4 TERMS
Prompt Engineering
Few-Shot Prompting
Few-shot prompting includes a small number of input-output examples in the prompt to show the model the desired pattern. The model uses those examples to infer format, tone, and reasoning style for a new input without any fine-tuning.
WHY IT MATTERS
Few-shot prompting is the cheapest way to lift quality on structured or stylistic tasks.
SIMPLE EXAMPLE
A prompt shows three examples of well-written meta descriptions before asking for a new one.
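A sketch of the prompt layout; the example pairs are invented, and real few-shot prompts often use three to five of them:

```python
# Few-shot prompt: example input-output pairs teach the model the format
# before the real input. No fine-tuning, just examples in the prompt.
EXAMPLES = [
    ("Running shoes for trail racing",
     "Lightweight trail racers with aggressive grip. Shop the collection."),
    ("Waterproof hiking jacket",
     "Stay dry on any summit. Explore waterproof jackets built for hikers."),
]

def few_shot_prompt(product: str) -> str:
    shots = "\n\n".join(f"Product: {p}\nMeta description: {m}" for p, m in EXAMPLES)
    return f"{shots}\n\nProduct: {product}\nMeta description:"

print(few_shot_prompt("Insulated winter boots"))
```

The prompt ends mid-pattern, so the model's most likely continuation is a meta description in the demonstrated style.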
Model Training
Fine-Tuning
Fine-tuning continues training a pretrained model on a smaller, task-specific dataset. The model adjusts its weights to specialize in a domain, format, or style while keeping the broad language ability it learned during pretraining.
WHY IT MATTERS
Fine-tuning is the right choice when prompting and retrieval cannot reach the quality bar.
SIMPLE EXAMPLE
A legal team fine-tunes a model on its contract templates to draft consistent NDAs.
LLM Basics
Foundation Model
A foundation model is a large model trained on broad data that can be adapted to many downstream tasks. It serves as a general base on top of which teams build features through prompting, retrieval, fine-tuning, or tool use.
WHY IT MATTERS
Choosing a foundation model is a strategic decision that shapes every AI feature on top of it.
SIMPLE EXAMPLE
A startup standardizes on one foundation model for chat, search, and content generation.
AI Agents
Function Calling
Function calling is a feature that lets a language model output structured arguments for a developer-defined function instead of plain text. The application then executes the function and feeds the result back, letting the model use real tools and data.
WHY IT MATTERS
Function calling is the bridge between language models and the rest of your software stack.
SIMPLE EXAMPLE
A travel assistant calls a search-flights function with parsed dates and cities from the user's message.
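Exact schema shapes differ by vendor, but the loop is the same everywhere: the model emits a function name plus JSON arguments, and the application executes it. A generic sketch with a hypothetical `search_flights` tool and a hard-coded model response:

```python
import json

# The app-side half of function calling: look up the named tool,
# execute it with the model's arguments, and keep the result to feed back.
TOOLS = {
    "search_flights": lambda origin, dest, date:
        f"3 flights {origin}->{dest} on {date}",
}

# Pretend the model returned this structured call for
# "Find me flights from NYC to Austin on March 3".
model_output = ('{"name": "search_flights", '
                '"arguments": {"origin": "NYC", "dest": "AUS", "date": "2025-03-03"}}')

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # this string goes back to the model as the tool result
```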
G · 4 TERMS
LLM Basics
Generative AI
Generative AI is a category of models that produce new content such as text, images, audio, video, code, or structured data. It contrasts with predictive AI, which classifies or scores existing inputs rather than creating new outputs.
WHY IT MATTERS
Generative AI is the technology layer behind most current AI products and copilots.
SIMPLE EXAMPLE
A marketing team uses generative AI to draft ad variants, then a human selects the strongest ones.
RAG & Retrieval
Grounding
Grounding is the practice of forcing a language model to base its answer on a specific source rather than its parametric memory. Sources can be retrieved documents, database rows, tool outputs, or live API data passed into the prompt.
WHY IT MATTERS
Grounding is the most reliable way to reduce hallucinations in production AI systems.
SIMPLE EXAMPLE
A support bot answers refund questions only from the official policy document, not training data.
AI Safety
Guardrails
Guardrails are rules and filters that constrain what a language model can accept as input or return as output. They cover topics, formats, PII, brand tone, and safety policies, and can be enforced by prompts, classifiers, or rule engines.
WHY IT MATTERS
Guardrails keep AI features on-policy and on-brand at scale without manual review of every output.
SIMPLE EXAMPLE
A chatbot blocks competitor mentions and rewrites any output that contains personal data.
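One small guardrail layer can be sketched as an output filter; production systems stack many such checks (classifiers, topic rules, prompt-level policies) rather than relying on a single regex:

```python
import re

# Output guardrail sketch: redact email addresses before a model
# response is shown to the user.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(text: str) -> str:
    return EMAIL.sub("[redacted email]", text)

print(redact("Contact dana@example.com for refunds."))
```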
Model Training
Gradient Descent
Gradient descent is an optimization algorithm used to train neural networks. It iteratively adjusts model weights in the direction that most reduces a loss function, gradually moving the model toward better predictions over many training steps.
WHY IT MATTERS
Gradient descent is the math engine that turns raw data and a model architecture into a useful model.
SIMPLE EXAMPLE
During training, weights are updated by gradient descent after each batch of examples.
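The mechanics fit in a few lines for a one-parameter "model": minimize the loss (w − 3)², whose gradient is 2(w − 3), by repeatedly stepping against the gradient:

```python
# Gradient descent on a single weight: each step moves w a little
# in the direction that reduces the loss (w - 3)^2.
w = 0.0
learning_rate = 0.1
for step in range(100):
    gradient = 2 * (w - 3)       # derivative of the loss at the current w
    w -= learning_rate * gradient

print(round(w, 4))  # converges toward the minimum at w = 3
```

Real training does exactly this, but over billions of weights at once, with the gradients supplied by backpropagation.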
H · 3 TERMS
AI Safety
Hallucination
A hallucination is an LLM output that sounds confident but is factually wrong, made up, or unsupported. Hallucinations include fake citations, invented product features, fabricated quotes, and incorrect numbers presented in fluent prose.
WHY IT MATTERS
Hallucinations are the top reason AI features fail user trust reviews and audits.
SIMPLE EXAMPLE
A research assistant invents a study and fake authors when asked for evidence.
AI Safety
Human-in-the-Loop AI
Human-in-the-loop AI is a workflow where a person reviews, edits, or approves model outputs before they reach a user or downstream system. The human acts as a quality gate and as a source of feedback the model can later learn from.
WHY IT MATTERS
Human-in-the-loop is the safest path to deploying AI in regulated or high-stakes workflows.
SIMPLE EXAMPLE
An insurer uses AI to draft claims decisions, but an adjuster signs off before payment.
RAG & Retrieval
Hybrid Search
Hybrid search combines keyword search and vector-based semantic search in one query. Keyword matches catch exact terms, codes, and names, while vector matches catch meaning and paraphrases, and the results are merged and re-ranked.
WHY IT MATTERS
Hybrid search consistently outperforms pure keyword or pure vector search in real RAG systems.
SIMPLE EXAMPLE
A docs site uses hybrid search so a query for error code E42 returns both exact matches and related guides.
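One common way to merge the two ranked lists is reciprocal rank fusion (RRF); a sketch with invented document names:

```python
# Reciprocal rank fusion: each ranking contributes 1/(k + rank) per
# document, so items ranked well by both lists rise to the top.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["error-e42-reference", "changelog"]              # exact-term matches
vector_hits = ["troubleshooting-guide", "error-e42-reference"]   # semantic matches

print(rrf([keyword_hits, vector_hits]))  # error-e42-reference wins: it appears in both
```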
I · 4 TERMS
LLM Basics
Inference
Inference is the process of using a trained model to generate predictions or completions from new inputs. It is the runtime side of AI, separate from training, and is what users pay for per request in most LLM APIs.
WHY IT MATTERS
Inference cost and latency define the unit economics of every AI feature.
SIMPLE EXAMPLE
A search box calls a model at inference time to rewrite the user's query before retrieval.
Model Training
Instruction Tuning
Instruction tuning is a fine-tuning step that trains a model on examples written as instructions paired with desired responses. It teaches the model to follow natural-language commands instead of only continuing text.
WHY IT MATTERS
Instruction tuning is what made base LLMs usable as helpful assistants.
SIMPLE EXAMPLE
A model is instruction-tuned on prompts like 'Summarize this email in two sentences' with ideal answers.
Automation
Intent Classification
Intent classification is a model task that maps a user message to a predefined intent label such as billing, support, or sales. It is widely used in chatbots and routing systems to decide which flow, agent, or tool should handle the request.
WHY IT MATTERS
Accurate intent classification is the difference between a useful router and a frustrating chatbot.
SIMPLE EXAMPLE
A help desk routes tickets to the right team based on the predicted intent of the first message.
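The routing half can be sketched with a trivial keyword matcher standing in for the model; real systems use an LLM or a trained classifier for the mapping, and the intents here are invented:

```python
# Intent routing sketch: map a message to one of a fixed set of intents,
# then downstream code picks the flow, team, or tool for that intent.
INTENT_KEYWORDS = {
    "billing": ["invoice", "charge", "refund", "payment"],
    "technical": ["error", "crash", "bug", "login"],
    "sales": ["pricing", "demo", "upgrade"],
}

def classify(message: str) -> str:
    text = message.lower()
    for intent, words in INTENT_KEYWORDS.items():
        if any(w in text for w in words):
            return intent
    return "general"   # fallback intent when nothing matches

print(classify("I was charged twice this month"))  # billing
```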
AI Safety
Interpretability
Interpretability is the field that studies how to look inside a model and understand why it produced a specific output. Methods range from feature attributions to mechanistic analysis of internal activations and circuits.
WHY IT MATTERS
Interpretability supports debugging, safety reviews, and compliance with explainability requirements.
SIMPLE EXAMPLE
Researchers identify the neurons most responsible for a chatbot's tendency to flatter users.
J · 2 TERMS
Prompt Engineering
JSON Mode
JSON mode is a model setting that forces outputs to be valid JSON, often matching a schema you provide. It removes string parsing errors and makes model responses safe to feed directly into downstream code, databases, or APIs.
WHY IT MATTERS
JSON mode is essential for reliable automations and tool use in production.
SIMPLE EXAMPLE
A lead enrichment workflow asks the model to return a JSON object with name, role, and company.
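Consuming a JSON-mode response then looks like this sketch; the response text is a stand-in for a real model output, and the key check guards against a field the schema did not force:

```python
import json

# Because JSON mode constrains the model to valid JSON, the response
# parses directly; we still verify the fields downstream code relies on.
REQUIRED = {"name", "role", "company"}

response_text = '{"name": "Dana Kim", "role": "VP Marketing", "company": "Acme"}'

lead = json.loads(response_text)        # no string scraping needed
missing = REQUIRED - lead.keys()
if missing:
    raise ValueError(f"model omitted fields: {missing}")
print(lead["company"])  # Acme
```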
AI Safety
Jailbreak Prompt
A jailbreak prompt is an input crafted to make a model bypass its safety rules, system instructions, or content policies. Techniques include role play, hidden instructions, encoded text, or multi-step social engineering.
WHY IT MATTERS
Jailbreaks are a real attack surface for any public AI feature and must be tested for.
SIMPLE EXAMPLE
A red team submits a prompt that pretends to be a developer test and tries to extract the system prompt.
K · 3 TERMS
RAG & Retrieval
Knowledge Base
A knowledge base is a structured collection of documents, articles, or data that an AI system can search and cite. It is the source of truth that grounds model answers and is usually indexed for keyword and vector search.
WHY IT MATTERS
A clean, current knowledge base is the single biggest lever for accurate AI answers.
SIMPLE EXAMPLE
A SaaS company connects its help center as the knowledge base for its customer-facing chatbot.
RAG & Retrieval
Knowledge Graph
A knowledge graph is a structured representation of entities and the relationships between them. It stores facts as nodes and edges, which lets AI systems answer questions that need precise, connected, or multi-hop information.
WHY IT MATTERS
Knowledge graphs improve precision for AI search where keyword and vector retrieval fall short.
SIMPLE EXAMPLE
A B2B platform uses a knowledge graph to answer questions like 'Which customers in EMEA use feature X?'
RAG & Retrieval
Knowledge Retrieval
Knowledge retrieval is the step that fetches relevant documents, snippets, or facts from a data source so a language model can use them in its answer. It usually combines indexing, query rewriting, ranking, and filtering.
WHY IT MATTERS
Retrieval quality sets the ceiling for any RAG or AI search experience.
SIMPLE EXAMPLE
Before answering, a sales copilot retrieves the latest pricing sheet and competitor battle cards.
L · 4 TERMS
LLM Basics
Language Model
A language model is a system that assigns probabilities to sequences of text and can generate the next likely tokens given a prompt. Modern language models are neural networks trained on large text corpora to read, write, and reason in natural language.
WHY IT MATTERS
Language models are the engines behind chat, search, summarization, and most AI copilots.
SIMPLE EXAMPLE
An autocomplete feature uses a language model to suggest the next sentence in an email.
LLM Basics
Large Language Model
A large language model, or LLM, is a language model with billions of parameters trained on massive text corpora. LLMs can follow instructions, summarize, translate, code, and reason across many domains without task-specific training.
WHY IT MATTERS
LLMs are the foundation models behind most generative AI products today.
SIMPLE EXAMPLE
A marketing team uses an LLM to draft, translate, and localize blog posts in one workflow.
Model Training
Latent Space
Latent space is the internal mathematical space where a model represents inputs as vectors of features. Similar inputs cluster together, and moving through latent space changes the meaning of the represented item in a structured way.
WHY IT MATTERS
Latent space is what makes embeddings, similarity search, and generative interpolation possible.
SIMPLE EXAMPLE
Two product descriptions about running shoes sit close together in the model's latent space.
Model Training
LoRA
LoRA, short for Low-Rank Adaptation, is a fine-tuning method that trains small adapter matrices on top of a frozen base model. It captures task-specific behavior with far fewer trainable parameters than full fine-tuning, making customization cheap and fast.
WHY IT MATTERS
LoRA lets teams ship customized models without paying full fine-tuning cost.
SIMPLE EXAMPLE
A studio trains a LoRA so a base image model produces characters in its signature art style.
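The arithmetic behind LoRA fits on tiny matrices: the frozen weight matrix W is adapted by adding a low-rank product B·A, so only B and A are trained. At this 2×2 scale the savings are invisible, but for a d×d layer the adapters hold 2·d·r values instead of d²:

```python
# LoRA math sketch: W stays frozen; the rank-1 adapters B and A are the
# only trained values, and their product is the weight update.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

W = [[1.0, 0.0],
     [0.0, 1.0]]          # frozen base weights
B = [[0.5], [0.0]]        # trained adapter (2x1)
A = [[0.0, 0.2]]          # trained adapter (1x2)

delta = matmul(B, A)      # full-size update built from the small adapters
W_adapted = [[W[i][j] + delta[i][j] for j in range(2)] for i in range(2)]
print(W_adapted)  # [[1.0, 0.1], [0.0, 1.0]]
```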
M · 6 TERMS
Model Training
Machine Learning
Machine learning is the broader field of building systems that learn patterns from data instead of following hand-written rules. It includes classical methods like regression and trees as well as modern deep learning and the LLMs built on top of it.
WHY IT MATTERS
Machine learning is the parent discipline that contains generative AI as one branch.
SIMPLE EXAMPLE
A churn model uses machine learning to predict which subscribers will cancel next month.
Model Training
Model Evaluation
Model evaluation is the structured process of measuring how well a model performs on defined tasks using metrics, test sets, and human review. It compares versions, catches regressions, and informs decisions about deployment.
WHY IT MATTERS
Without evaluation, AI teams ship changes on vibes and discover problems in production.
SIMPLE EXAMPLE
A team runs an evaluation suite of 200 prompts before promoting any new system prompt to production.
LLM Basics
Model Parameters
Model parameters are the numerical values inside a neural network that are learned during training. They determine how the model transforms inputs into outputs, and their count is often used as a rough proxy for model capacity.
WHY IT MATTERS
Parameter count influences quality, cost, and the hardware needed to run a model.
SIMPLE EXAMPLE
A 70-billion-parameter model usually answers complex questions better than a 7-billion-parameter one.
LLM Basics
Model Weights
Model weights are the trained parameters of a neural network, stored as large files that can be loaded for inference. Open weights can be downloaded and run locally, while closed weights stay behind a vendor API.
WHY IT MATTERS
Weight access decides whether a team can self-host, fine-tune freely, or only call an API.
SIMPLE EXAMPLE
A regulated firm chooses an open-weights model so it can run inference inside its own network.
LLM Basics
Multimodal AI
Multimodal AI describes models that handle more than one type of input or output, such as text, images, audio, and video, in a single system. The model can read a screenshot, listen to a clip, or describe an image alongside text.
WHY IT MATTERS
Multimodal models unlock workflows that mix documents, screenshots, and voice in one conversation.
SIMPLE EXAMPLE
A user uploads a chart image and asks a multimodal model to write the takeaway as a tweet.
Model Training
Mixture of Experts
Mixture of experts, or MoE, is an architecture where many specialist sub-networks exist inside one model and a router activates only a few per input. This delivers the quality of a very large model at roughly the per-request cost of a much smaller one.
WHY IT MATTERS
MoE is one of the main ways frontier labs are scaling capability without scaling cost linearly.
SIMPLE EXAMPLE
An MoE LLM routes coding questions to coding experts and legal questions to legal ones internally.
N · 3 TERMS
LLM Basics
Natural Language Processing
Natural language processing, or NLP, is the field that builds systems to understand, generate, and analyze human language. It includes tasks like classification, extraction, summarization, translation, and dialogue, and underpins most modern LLM applications.
WHY IT MATTERS
NLP is the discipline that turns raw text into structured insight and useful product features.
SIMPLE EXAMPLE
An NLP pipeline extracts company names and amounts from contract PDFs for a finance team.
Model Training
Neural Network
A neural network is a model made of layers of connected nodes that transform inputs into outputs through weighted operations. Training adjusts those weights so the network learns to map inputs to the right answers across many examples.
WHY IT MATTERS
Neural networks are the building block of every modern AI model.
SIMPLE EXAMPLE
An image classifier uses a neural network to distinguish defective and healthy parts on a line.
RAG & Retrieval
Neural Retrieval
Neural retrieval uses neural networks to encode queries and documents into embeddings and rank them by semantic similarity. It complements or replaces classical keyword retrieval and powers most modern AI search experiences.
WHY IT MATTERS
Neural retrieval is what makes search feel like it understands meaning, not just words.
SIMPLE EXAMPLE
A docs search returns relevant guides even when the user uses different wording than the title.
O3 TERMS
LLM Basics
Open-Source LLM
An open-source LLM is a large language model whose weights, and often code and training details, are released publicly. Teams can download, run, fine-tune, and deploy it under the model's license without depending on a single vendor.
WHY IT MATTERS
Open-source LLMs give teams control over cost, privacy, and customization.
SIMPLE EXAMPLE
A healthcare startup self-hosts an open-source LLM to keep patient data inside its own infrastructure.
AI Agents
Output Parser
An output parser is a component that converts a model's text response into a structured format such as JSON, a typed object, or a database row. It handles validation, retries, and schema enforcement so downstream code can rely on the result.
WHY IT MATTERS
Output parsers turn unreliable text outputs into safe inputs for the rest of your software.
SIMPLE EXAMPLE
A workflow parses an LLM's answer into a typed Lead object before saving it to the CRM.
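A minimal parser for the Lead example above might look like this. The `Lead` schema and field names are illustrative assumptions; real systems often use a validation library such as Pydantic, but the validate-or-raise pattern is the same, and raising lets the caller retry with a corrective prompt.

```python
import json
from dataclasses import dataclass

@dataclass
class Lead:
    name: str
    email: str
    score: int

def parse_lead(raw: str) -> Lead:
    """Validate the model's text output against the Lead schema.
    Raises ValueError so the caller can retry with a corrective prompt."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"not valid JSON: {e}")
    missing = {"name", "email", "score"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not isinstance(data["score"], int):
        raise ValueError("score must be an integer")
    return Lead(name=data["name"], email=data["email"], score=data["score"])

# A well-formed model response parses into a typed object:
lead = parse_lead('{"name": "Ada", "email": "ada@example.com", "score": 87}')
```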
Model Training
Overfitting
Overfitting is when a model learns the training data too closely, including its noise, and performs worse on new data. It usually appears as high training accuracy and low test accuracy and is fought with more data, regularization, or simpler models.
WHY IT MATTERS
Overfit models look great in development and disappoint in production.
SIMPLE EXAMPLE
A churn model that scores 99 percent on training data drops to 70 percent on next month's customers.
P8 TERMS
LLM Basics
Parameter
A parameter is a single learned numerical value inside a neural network. The collection of all parameters defines the model, and parameter count is often quoted as a rough indicator of model size and capacity.
WHY IT MATTERS
Parameters determine model size, cost, and the hardware required to run inference.
SIMPLE EXAMPLE
A 7B model has seven billion parameters and fits on consumer GPUs, unlike a 70B model.
Model Training
Pretraining
Pretraining is the first, most expensive stage of building a foundation model, where the model learns general language patterns from a huge corpus of text. It usually trains by predicting the next token across trillions of tokens of data.
WHY IT MATTERS
Pretraining decides what a model fundamentally knows before any fine-tuning or prompting.
SIMPLE EXAMPLE
A foundation model is pretrained on web text and code before being instruction-tuned for chat.
Prompt Engineering
Prompt
A prompt is the input you send to a language model to get a response. It can include instructions, context, examples, retrieved data, and a user question, and its structure strongly affects the quality, format, and safety of the output.
WHY IT MATTERS
Prompts are the primary user interface and product surface for most AI features today.
SIMPLE EXAMPLE
A meta-description prompt includes the page title, target keyword, and a length limit.
Prompt Engineering
Prompt Chain
A prompt chain is a sequence of prompts where each step uses the output of the previous one. Chains break complex tasks into smaller, easier subtasks and let teams insert tools, validation, or branching between steps.
WHY IT MATTERS
Prompt chains turn one fragile prompt into a reliable multi-step workflow.
SIMPLE EXAMPLE
A blog workflow chains four steps: outline, draft, fact-check, and meta description.
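That blog workflow can be sketched as ordinary function composition. `call_model` is a stub standing in for a real LLM API call; the point is the shape of the chain, where each step's output becomes the next step's input.

```python
def call_model(prompt: str) -> str:
    """Stand-in for a real LLM API call; here it just echoes a tag
    so the chain's data flow is visible."""
    return f"[model output for: {prompt[:30]}...]"

def blog_chain(topic: str) -> str:
    outline = call_model(f"Write a blog outline about {topic}.")
    draft = call_model(f"Expand this outline into a draft:\n{outline}")
    checked = call_model(f"Fact-check and correct this draft:\n{draft}")
    meta = call_model(f"Write a 155-char meta description for:\n{checked}")
    return meta

result = blog_chain("vector databases")
```

Between any two steps you can insert validation, a tool call, or a branch, which is what makes chains more reliable than one giant prompt.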
Prompt Engineering
Prompt Engineering
Prompt engineering is the practice of designing, testing, and refining the inputs sent to a language model to get reliable, high-quality outputs. It covers structure, examples, constraints, format, and the use of system prompts and retrieved context.
WHY IT MATTERS
Prompt engineering is often the cheapest and fastest way to lift AI feature quality.
SIMPLE EXAMPLE
A team improves answer accuracy by 20 percent just by rewriting the system prompt and adding examples.
AI Safety
Prompt Injection
Prompt injection is an attack where malicious instructions are placed in user input or external content so the model executes them instead of the developer's instructions. It can leak data, override safety rules, or trigger unintended tool calls.
WHY IT MATTERS
Prompt injection is one of the most serious security risks in any agentic or RAG system.
SIMPLE EXAMPLE
A web page contains hidden text telling a browsing agent to send the user's emails to an attacker.
Prompt Engineering
Prompt Library
A prompt library is a managed collection of reusable, version-controlled prompts shared across a team or product. It treats prompts like code, with naming, documentation, evaluation, and rollout controls.
WHY IT MATTERS
Prompt libraries prevent drift, duplicate work, and silent quality regressions.
SIMPLE EXAMPLE
Marketing, support, and sales all pull approved prompts from a single internal prompt library.
Prompt Engineering
Prompt Template
A prompt template is a reusable prompt with placeholders for variables such as user input, retrieved documents, or product fields. The template is filled in at runtime, ensuring every call follows the same proven structure.
WHY IT MATTERS
Templates make prompts maintainable, testable, and consistent across thousands of calls.
SIMPLE EXAMPLE
A meta-description template inserts page title and keyword into a fixed instruction every time.
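The meta-description template above could be implemented with Python's standard-library `string.Template`; the variable names (`title`, `keyword`, `max_chars`) are illustrative, and production systems often use a dedicated templating layer instead.

```python
from string import Template

META_TEMPLATE = Template(
    "Write a meta description for the page titled '$title'.\n"
    "Target keyword: $keyword\n"
    "Hard limit: $max_chars characters. Output only the description."
)

# Filled in at runtime so every call follows the same proven structure:
prompt = META_TEMPLATE.substitute(
    title="LLM Glossary", keyword="llm glossary", max_chars=155
)
```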
Q3 TERMS
AI Search & SEO
Query
A query is the question or input a user sends to a search system or AI assistant. In AI search, queries are often rewritten, expanded, or decomposed before retrieval and answer generation to improve the quality of results.
WHY IT MATTERS
Understanding real queries is the starting point for any AI search or content strategy.
SIMPLE EXAMPLE
A user query 'best CRM for small SaaS' is rewritten by the system into a clearer search query before retrieval.
RAG & Retrieval
Query Embedding
A query embedding is the vector representation of a user query, produced by an embedding model. It is compared against document embeddings to find the most semantically similar content for retrieval.
WHY IT MATTERS
Query embeddings let search match meaning, not just exact words.
SIMPLE EXAMPLE
A search for 'cancel my plan' retrieves the article 'How to end your subscription' via embeddings.
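The comparison step is usually cosine similarity between the query vector and each document vector. This toy sketch uses hand-made 3-dimensional vectors (real embedding models output hundreds or thousands of dimensions), but the ranking logic is the same.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings: the query "cancel my plan" lands near the
# subscription doc even though they share no keywords.
query_vec = [0.9, 0.1, 0.0]
docs = {
    "How to end your subscription": [0.8, 0.2, 0.1],
    "Setting up two-factor auth":   [0.0, 0.1, 0.9],
}
best = max(docs, key=lambda d: cosine(query_vec, docs[d]))
```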
Model Training
Quantization
Quantization reduces the precision of a model's weights, for example from 16-bit to 4-bit, so the model uses less memory and runs faster. Modern quantization techniques keep most of the original quality while shrinking the model significantly.
WHY IT MATTERS
Quantization lets larger models run on smaller, cheaper hardware in production.
SIMPLE EXAMPLE
A team quantizes a 13B model so it runs on a single GPU inside its product.
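A minimal sketch of symmetric per-tensor quantization shows where the memory saving and the small accuracy loss both come from: each float is rounded to a low-bit integer plus one shared scale factor. Real libraries add per-channel scales, calibration, and packed storage, but the core idea is this.

```python
def quantize(weights, bits=4):
    """Map floats to signed integers in [-(2**(bits-1)-1), 2**(bits-1)-1]
    with a single scale factor (symmetric per-tensor quantization)."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

weights = [0.12, -0.53, 0.70, -0.07]
q, scale = quantize(weights, bits=4)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now needs 4 bits instead of 16, and the rounding error is bounded by half the scale step.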
R5 TERMS
RAG & Retrieval
RAG
RAG, or retrieval-augmented generation, is a pattern where a language model retrieves relevant documents at query time and uses them to ground its answer. It combines a search system with an LLM so responses are based on current, owned data.
WHY IT MATTERS
RAG is the standard pattern for accurate, source-cited AI answers over private content.
SIMPLE EXAMPLE
A support bot uses RAG to answer from the latest help articles instead of stale training data.
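The whole RAG pattern fits in two functions: retrieve relevant text, then build a grounded prompt around it. This sketch scores documents by naive word overlap purely for illustration; a production retriever would use embeddings, but the pipeline shape is identical.

```python
def retrieve(query, docs, k=1):
    """Score docs by word overlap with the query; real systems
    use embeddings, but the shape of the pipeline is the same."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\n\nQuestion: {query}")

help_articles = [
    "Refunds are issued within 5 business days of approval.",
    "You can change your billing email in account settings.",
]
prompt = build_prompt("how long do refunds take", help_articles)
```

The LLM then answers from the retrieved context rather than from whatever its training data happened to say.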
AI Search & SEO
Ranking Model
A ranking model orders a set of candidate items, such as search results or recommendations, by predicted relevance to a user or query. In AI search, ranking models often re-score candidates returned by retrieval before sending them to the LLM.
WHY IT MATTERS
Strong ranking is what separates AI search that feels precise from AI search that feels noisy.
SIMPLE EXAMPLE
A retrieval step returns 50 docs, and a ranking model picks the top 5 to pass to the LLM.
Model Training
Reinforcement Learning
Reinforcement learning is a training method where a model learns by taking actions and receiving rewards or penalties. Over many trials, it adjusts its behavior to maximize cumulative reward in an environment.
WHY IT MATTERS
Reinforcement learning is a key tool for aligning LLMs and training agents that act in the world.
SIMPLE EXAMPLE
A robot learns to grasp objects through reinforcement learning by trying and being rewarded for success.
Model Training
RLHF
RLHF, or reinforcement learning from human feedback, fine-tunes a model using preferences collected from human raters. People compare model outputs, a reward model learns those preferences, and the LLM is then optimized to match them.
WHY IT MATTERS
RLHF is the main reason modern chat models feel helpful, polite, and on-policy.
SIMPLE EXAMPLE
Raters pick the better of two answers, and the model is trained to produce the preferred style.
RAG & Retrieval
Retriever
A retriever is the component in a RAG system that fetches candidate documents or chunks for a given query. It can use keyword search, vector search, or both, and its output is what the LLM reads before answering.
WHY IT MATTERS
Retriever quality is the single biggest determinant of RAG accuracy.
SIMPLE EXAMPLE
A retriever pulls the top five matching policy paragraphs before the LLM writes an answer.
S5 TERMS
AI Search & SEO
Semantic Search
Semantic search uses embeddings to match queries and documents by meaning rather than exact keywords. It returns relevant results even when wording differs and is the foundation of modern AI search and RAG.
WHY IT MATTERS
Semantic search is what makes AI search feel like it actually understands the question.
SIMPLE EXAMPLE
A search for 'reset password' returns 'recover account access' even with no shared keywords.
Model Training
Self-Attention
Self-attention is the mechanism in a transformer that lets each token attend to every other token in the input. It produces context-aware representations and is what allows LLMs to handle long, complex passages coherently.
WHY IT MATTERS
Self-attention is the architectural innovation behind modern LLMs.
SIMPLE EXAMPLE
When summarizing a long email, self-attention links the closing question back to the opening request.
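Scaled dot-product attention can be written in a few lines of plain Python. This toy version collapses queries, keys, and values into the same vectors to keep it short; real transformers use separate learned projections for each, plus multiple heads.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def self_attention(tokens):
    """Each token's new representation is a softmax-weighted mix of
    all tokens, weighted by dot-product similarity (queries = keys =
    values here; real transformers use learned projections)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        w = softmax(scores)
        out.append([sum(w[j] * tokens[j][i] for j in range(len(tokens)))
                    for i in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens)
```

Every output vector is a blend of all input vectors, which is how a token "sees" the rest of the passage.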
LLM Basics
Small Language Model
A small language model, or SLM, is a language model with far fewer parameters than frontier LLMs, optimized to run cheaply on smaller hardware. SLMs trade some general capability for speed, cost, and the ability to run on-device.
WHY IT MATTERS
SLMs are the right choice for high-volume, low-latency, or on-device AI features.
SIMPLE EXAMPLE
A mobile app runs an SLM on the phone for offline writing suggestions.
Prompt Engineering
System Prompt
A system prompt is a hidden instruction the developer passes to the model that defines its role, tone, allowed topics, and output format. It applies to every user message in a session and is the primary way to control assistant behavior.
WHY IT MATTERS
The system prompt is the most leveraged piece of text in any AI product.
SIMPLE EXAMPLE
A retail bot's system prompt restricts answers to product, shipping, and returns topics.
Model Training
Synthetic Data
Synthetic data is artificial data generated by a model or simulation, used to train, fine-tune, or evaluate other models. It can fill gaps where real data is scarce, sensitive, or expensive to label.
WHY IT MATTERS
Synthetic data lets teams train models when real-world data is limited or restricted.
SIMPLE EXAMPLE
A team generates synthetic support tickets to balance rare intent classes in its training set.
T5 TERMS
Prompt Engineering
Temperature
Temperature is a setting that controls randomness in a model's output. Low values make outputs more deterministic and focused, while high values make them more diverse and creative, at the cost of some consistency and accuracy.
WHY IT MATTERS
Temperature is a fast lever to tune AI features for either reliability or creativity.
SIMPLE EXAMPLE
Email drafts use temperature 0.7 for variety, while data extraction uses 0 for stability.
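Mechanically, temperature divides the model's logits before the softmax that turns them into token probabilities. The toy logits below are made up, but the effect is real: a low temperature concentrates nearly all probability on the top token, a high one spreads it out.

```python
import math

def sample_probs(logits, temperature):
    """Divide logits by temperature before softmax: low T sharpens
    the distribution, high T flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]          # model scores for three candidate tokens
cold = sample_probs(logits, 0.2)  # near-deterministic
hot = sample_probs(logits, 2.0)   # much more uniform
```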
LLM Basics
Token
A token is the basic unit of text a language model reads and generates, usually a short string of characters or part of a word. Pricing, context limits, and output length are all measured in tokens, not characters.
WHY IT MATTERS
Tokens are the unit of cost, latency, and context in every LLM-based feature.
SIMPLE EXAMPLE
A 1,000-word article is roughly 1,300 tokens for most English text.
Model Training
Tokenization
Tokenization is the process of splitting text into tokens that a model can process. The tokenizer is paired with the model and decides how words, punctuation, code, and non-English text are broken into pieces.
WHY IT MATTERS
Tokenization choices affect cost, context length, and how well models handle different languages.
SIMPLE EXAMPLE
A long German compound word may consume more tokens than its English equivalent.
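A deliberately naive tokenizer makes the splitting step concrete. Real tokenizers (BPE, SentencePiece) learn subword merges from data instead of using a fixed rule, which is why unfamiliar compounds fragment into more pieces.

```python
import re

def naive_tokenize(text):
    """Split on word runs and punctuation. Production tokenizers
    (BPE, SentencePiece) learn subword merges from data instead."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = naive_tokenize("Donaudampfschiff docks at 9am!")
```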
AI Agents
Tool Calling
Tool calling lets a language model invoke external functions, APIs, or plugins to do things it cannot do alone, such as fetch data, run calculations, or update systems. The model returns structured arguments and then uses the results in its next response.
WHY IT MATTERS
Tool calling turns LLMs from text generators into agents that can act in real systems.
SIMPLE EXAMPLE
A research agent uses tool calling to run a web search and then summarize the results.
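The control flow is simple: if the model's turn is a structured call, execute the named tool with its arguments and feed the result back; otherwise the turn is the final answer. The JSON message shape and the `web_search` stub below are illustrative assumptions, not any particular vendor's API.

```python
import json

def web_search(query: str) -> str:
    """Stub tool; a real implementation would call a search API."""
    return f"Top results for '{query}': ..."

TOOLS = {"web_search": web_search}

def handle_model_turn(model_message: str) -> str:
    """If the model emits a structured tool call, execute it and
    return the result for the next turn; otherwise it's the answer."""
    try:
        call = json.loads(model_message)
    except json.JSONDecodeError:
        return model_message  # plain-text final answer
    fn = TOOLS[call["tool"]]
    return fn(**call["arguments"])

# The model "decides" to search by emitting structured arguments:
result = handle_model_turn(
    '{"tool": "web_search", "arguments": {"query": "LLM news"}}'
)
```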
Model Training
Transformer Architecture
The transformer architecture is the neural network design behind modern LLMs. It uses self-attention to process all tokens in parallel and capture long-range relationships, replacing earlier recurrent designs and enabling much larger, more capable models.
WHY IT MATTERS
The transformer is the architectural reason today's AI capabilities exist.
SIMPLE EXAMPLE
Most major LLMs, from GPT to open-source models, are variations of the transformer architecture.
U2 TERMS
RAG & Retrieval
Unstructured Data
Unstructured data is information that does not fit neatly into rows and columns, such as documents, emails, chat transcripts, images, audio, and video. LLMs and embeddings make this data searchable and useful without forcing it into a schema.
WHY IT MATTERS
Most enterprise knowledge is unstructured, and AI is what finally makes it usable.
SIMPLE EXAMPLE
A team uses an LLM to extract action items from thousands of unstructured meeting transcripts.
Prompt Engineering
User Prompt
A user prompt is the message sent by the end user in a conversation with an AI assistant. It sits alongside the system prompt and any retrieved context and is the part the developer has the least direct control over.
WHY IT MATTERS
Designing for messy real-world user prompts is what separates demos from production AI.
SIMPLE EXAMPLE
A user types 'fix this' with no context, and the assistant must ask a clarifying question.
V4 TERMS
RAG & Retrieval
Vector Database
A vector database is a system designed to store embeddings and run fast similarity search over them. It indexes high-dimensional vectors and supports filters, hybrid search, and metadata so retrieval at scale stays fast and accurate.
WHY IT MATTERS
A vector database is the storage layer that makes production RAG and AI search practical.
SIMPLE EXAMPLE
An ecommerce site stores product embeddings in a vector database for instant semantic search.
RAG & Retrieval
Vector Embedding
A vector embedding is a numerical representation of a piece of content in a high-dimensional space. Items with similar meaning have nearby vectors, which lets systems search, cluster, and recommend by semantic similarity.
WHY IT MATTERS
Vector embeddings are the data structure behind every modern semantic search system.
SIMPLE EXAMPLE
A blog post is converted into a 1,536-dimension vector embedding for search.
RAG & Retrieval
Vector Index
A vector index is a data structure inside a vector database that enables fast nearest-neighbor search across millions of embeddings. Common index types like HNSW or IVF trade off recall, latency, and memory for different workloads.
WHY IT MATTERS
Index choice and tuning directly control AI search latency and cost at scale.
SIMPLE EXAMPLE
Switching to an HNSW index lets a docs search return results in 30 ms instead of 300 ms.
AI Search & SEO
Vector Search
Vector search ranks documents by the distance between their embeddings and a query embedding. It returns results based on meaning rather than keywords and is the core retrieval method behind RAG and AI-powered search experiences.
WHY IT MATTERS
Vector search is the technical foundation of AI search, citations, and grounded answers.
SIMPLE EXAMPLE
A help center uses vector search so 'forgot login' returns the password reset article.
W3 TERMS
LLM Basics
Weights
Weights are the trained numerical values inside a neural network that determine how it transforms inputs into outputs. Together they define the model and are what gets shipped, loaded, fine-tuned, and quantized.
WHY IT MATTERS
Access to weights determines whether a team can self-host, customize, or only use an API.
SIMPLE EXAMPLE
An open-weights model can be downloaded and fine-tuned in-house on private data.
Model Training
Weak Supervision
Weak supervision is a training approach where labels are generated automatically from rules, heuristics, or other models instead of full human annotation. The labels are noisier but cheap and abundant, and models can still learn useful patterns from them.
WHY IT MATTERS
Weak supervision unlocks training when high-quality labeled data is too expensive to collect.
SIMPLE EXAMPLE
A team labels millions of tickets as urgent based on keyword rules to bootstrap a triage model.
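The keyword-rule approach is just a labeling function applied in bulk. The marker list below is made up for illustration; the labels it produces are noisy, but a model trained on enough of them can generalize beyond the literal keywords.

```python
def weak_label(ticket: str) -> str:
    """Rule-based labeling function: noisy but cheap. A model trained
    on these labels can generalize beyond the keywords themselves."""
    urgent_markers = ("outage", "down", "urgent", "asap", "cannot log in")
    text = ticket.lower()
    return "urgent" if any(m in text for m in urgent_markers) else "normal"

labels = [weak_label(t) for t in [
    "Production site is DOWN since 3am",
    "Feature request: dark mode",
]]
```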
Automation
Workflow Automation
Workflow automation uses software, often combined with AI, to execute a sequence of steps across systems with little or no human input. AI-powered workflows add reasoning, content generation, and decision-making on top of traditional triggers and actions.
WHY IT MATTERS
AI-powered automation is the main path to real productivity gains from generative AI.
SIMPLE EXAMPLE
A workflow auto-drafts a proposal, attaches pricing, and notifies the sales rep when a lead qualifies.
X1 TERM
AI Safety
XAI
XAI, short for explainable AI, refers to methods and tools that make the reasoning of AI systems understandable to humans. Approaches include feature attributions, rationales, surfaced sources, and visualizations that show why a model produced a given output.
WHY IT MATTERS
XAI supports trust, debugging, and compliance for high-stakes AI systems.
SIMPLE EXAMPLE
An AI underwriting tool shows applicants which factors most influenced its decision.
Y1 TERM
Prompt Engineering
YAML Prompting
YAML prompting is a style of prompt design that uses YAML structure to organize roles, instructions, examples, constraints, and expected output formats. The structured layout makes complex prompts easier to read, version, and maintain.
WHY IT MATTERS
YAML-style prompts scale better than long paragraphs as prompts grow more complex.
SIMPLE EXAMPLE
A content prompt uses YAML keys for role, audience, tone, structure, and forbidden phrases.
Z2 TERMS
Model Training
Zero-Shot Learning
Zero-shot learning is a model's ability to perform a task it was not explicitly trained on, using only a description of the task. The model leverages general knowledge from pretraining to generalize to new categories or instructions.
WHY IT MATTERS
Zero-shot capability is what makes modern LLMs useful out of the box across many tasks.
SIMPLE EXAMPLE
An LLM classifies support tickets into new categories described only in plain English.
Prompt Engineering
Zero-Shot Prompting
Zero-shot prompting asks a language model to do a task with only an instruction and no examples. The model relies on its pretraining and instruction tuning to interpret the request and produce a usable answer.
WHY IT MATTERS
Zero-shot prompting is the fastest way to test whether an LLM can handle a task at all.
SIMPLE EXAMPLE
A team asks a model to 'classify this email as billing, support, or sales' with no examples first.
AI Search
Want your Brand to be Found?
Get a tailored plan to make your brand findable, citable, and chosen across ChatGPT, Gemini, Perplexity, and Google AI Overviews.
Book Strategy Call →