# When policy exists—but failure still ships
A team rolls out an LLM assistant for customer service agents. Governance is “in place”: there’s a steering committee, a review board, and a risk tier. Then a real incident happens: the assistant suggests a fee waiver that violates policy, an agent copy-pastes it, and the customer escalates. Security asks, “Why didn’t monitoring catch abnormal outputs?” Legal asks, “What control prevented this from reaching a customer?” The product lead asks, “What exactly do we do differently on Monday?”
That’s the gap this lesson closes. Governance tells you who decides and what evidence is required. Control design tells you how risk is concretely reduced in the product and operating process—in a way that is repeatable, testable, and auditable.
The practical goal: for any AI use case, you can point to a short set of preventive, detective, and corrective controls that match its risk tier and autonomy—so outcomes don’t depend on heroics or “best effort.”
## The control design vocabulary that makes governance real
A control is a specific measure that reduces risk by changing what can happen, what you can notice, or how fast you can recover. Controls turn principles like “protect privacy” into implementation realities like “PII is redacted before logging” and “retention is capped by policy.”
Key terms (used consistently in risk and audit functions):
- Preventive controls: stop an unwanted event from happening (or make it unlikely). Example: access restrictions to sensitive data; a rule that blocks external sharing of model outputs containing personal data.
- Detective controls: surface an unwanted event quickly. Example: monitoring that flags hallucination spikes; review sampling that catches policy violations.
- Corrective controls: restore a safe state after something goes wrong. Example: rollback, incident response playbooks, retraining, prompt or policy changes.
Two principles keep control design aligned with the governance foundations you already have:
- Govern the decision, not the model: controls must match how outputs are used (assist vs decide), impacts, and data sensitivity. A “low-risk” drafting assistant and a “high-impact” churn flagger should not share the same control set.
- Evidence over intent: “We trained it to be safe” is not a control. A control produces evidence—logs, approvals, evaluation results, monitoring alerts, or sign-offs—that someone can review and challenge.
A helpful analogy is finance: budgets (prevent), reconciliations (detect), and chargeback/exception handling (correct) keep spending aligned. AI needs the same operating discipline—just tuned to model behavior, data flows, and automation risk.
## Designing controls that actually work: Prevent, Detect, Correct
### Preventive controls: reduce exposure by shaping what the system can do
Preventive controls are the fastest way to reduce risk, but they’re also the easiest to get wrong because teams confuse guidance (“don’t hallucinate”) with an enforceable mechanism (“the system must ground answers in approved sources and block unsupported claims”). The best preventive controls constrain the system at the points where damage becomes possible: data entry, model interaction, and decision execution.
A practical way to design preventive controls is to align them to three levers: data, model behavior, and decision rights. On the data side, prevention includes proven practices like least-privilege access, approved data sources, and retention rules that match legal obligations. On the behavior side, prevention includes guardrails like retrieval from approved knowledge, redaction before prompts/logging, and UX patterns that keep humans in control when impact is high. On the decision side, prevention includes approvals, autonomy caps, and requirements that certain outputs must never be executed automatically.
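The “redact before prompts/logging” guardrail mentioned above can be made concrete in a few lines. This is a minimal sketch, assuming simple regex-based detection; the pattern names and placeholder format are illustrative, and a production system would typically call a dedicated PII-detection service instead of hand-rolled regexes.

```python
import re

# Illustrative patterns only; real deployments maintain these with privacy
# and compliance teams, usually backed by a PII-detection service.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with a labeled placeholder before the text
    reaches the model prompt or the log store."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text

def log_interaction(log: list, prompt: str) -> None:
    # Preventive control: only the redacted form is ever persisted,
    # so retention rules apply to data that is already sanitized.
    log.append(redact(prompt))
```

The point is not the regexes; it is that redaction happens at a choke point the application cannot skip, which makes the control testable and auditable.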
Best practices to make prevention proportional (and not bureaucratic):
- Treat autonomy as a multiplier: “recommend” can tolerate more freedom than “auto-execute,” so prevention strengthens as autonomy increases.
- Make constraints explicit and testable: “No financial advice” becomes “block outputs containing restricted advice patterns unless routed through an approved scripted flow.”
- Design for real workflow: if agents or analysts can bypass the control to hit a KPI, assume they will under pressure.
Common pitfalls and misconceptions:
- Misconception: “If the committee approved the use case, we’re safe.” Approval is not a control; it’s a decision point. Prevention must live in product and process.
- Pitfall: relying on a single barrier (e.g., a policy statement) instead of layered constraints. One gap often becomes the path of least resistance.
- Pitfall: over-controlling low-risk tools, which drives shadow AI. Risk-tiering exists so prevention is graduated, not blanket.
### Detective controls: notice drift, misuse, and failures before customers do
Detective controls accept a reality of AI systems: even with strong prevention, failures still happen—because data shifts, prompts change, users improvise, and edge cases emerge only at scale. Detection is how you avoid “two weeks later, complaints spike” moments by creating early visibility and escalation paths.
Effective detection goes beyond model performance metrics. You need a portfolio of signals across model behavior, user behavior, and business outcomes. Model behavior signals might include spikes in refusal rates, unusual output lengths, increased “unsupported claim” flags, or retrieval misses. User behavior signals include override rates (how often humans delete or rewrite suggestions), escalation tagging, and repeated re-prompts that indicate the system is not meeting needs. Business outcome signals include complaint categories, anomaly rates in downstream decisions, or sudden swings in key metrics that align with deployments.
Best practices that make detection actionable:
- Define thresholds and owners: an alert with no accountable responder is noise, not a control.
- Monitor what matters to the decision: for a churn flagging tool, detection must include false-positive/false-negative sampling, not just accuracy on a stale test set.
- Build feedback loops into UX: users need a fast, low-friction way to report “wrong,” “unsafe,” or “policy-violating” outputs, and those reports must be triaged.
Common pitfalls and misconceptions:
- Misconception: “Monitoring is just model metrics.” The governance foundations already warn against this: real monitoring includes complaints, overrides, and incidents.
- Pitfall: dashboards without triggers. If everything is “informational,” nothing is controlled.
- Pitfall: sampling that isn’t risk-tiered. High-impact decisions need more frequent review and stronger signals than internal drafting tools.
### Corrective controls: recover fast and keep fixes from being one-offs
Corrective controls are your recovery system: how you contain harm, restore a safe operating state, and reduce repeat incidents. Without corrective controls, organisations end up in a cycle of urgent patches, unclear ownership, and untraceable risk acceptance—exactly what governance aims to prevent.
Strong corrective control design starts with pre-planned playbooks and decision rights for action under pressure. Who can disable a feature? Who can roll back a model? When do you escalate from first line (product/ML/engineering) to second line (risk/privacy/security) to leadership? This is where the “three lines” structure becomes operational: first line executes fixes, second line challenges root-cause closure and residual risk, and third line can later test whether the operating model worked.
Corrective controls should also produce durable artifacts:
- Incident records that capture what happened, why, impact, and remediation.
- Post-incident control improvements (e.g., new prevention constraints, better detection thresholds).
- Exception decisions when the business chooses to accept residual risk—documented at the right level.
Common pitfalls and misconceptions:
- Misconception: “Corrective means retrain the model.” Sometimes the best correction is changing the workflow, tightening autonomy, or fixing a data feed.
- Pitfall: no rollback path. “We can’t turn it off” is a governance failure in itself.
- Pitfall: treating incidents as isolated. Corrective controls should feed back into tiering, gates, and controls so the organisation gets safer over time.
## How Prevent/Detect/Correct compare (and why you need all three)
You rarely choose one control type; you layer them. Prevention reduces probability, detection reduces time-to-notice, and correction reduces time-to-recover and recurrence.
| Dimension | Prevent | Detect | Correct |
|---|---|---|---|
| Primary goal | Stop or constrain harmful events before they occur | Notice harmful events quickly and reliably | Restore safe operation and reduce repeat incidents |
| Typical mechanisms | Access restrictions, autonomy caps, approval gates, redaction, grounded retrieval, UX constraints | Monitoring thresholds, sampling reviews, override/escalation tracking, complaint tagging | Rollback/kill switch, incident playbooks, patching prompts/policies, retraining or workflow change |
| Best fit | High-impact decisions, sensitive data handling, high autonomy | Systems exposed to drift, broad users, ambiguous edge cases | Any system with customer impact or operational dependency |
| Evidence produced | Configurations, approvals, documented constraints, test results | Alerts, dashboards with triggers, review logs, user feedback records | Incident reports, change records, sign-offs, residual risk decisions |
| Common failure mode | Becomes “policy” instead of enforceable constraints | Becomes “observability” without owners or thresholds | Becomes firefighting without root-cause closure |
[[flowchart-placeholder]]
## Two control designs in practice (step-by-step)
### Example 1: Bank LLM assistant for customer service agents (assistive, higher sensitivity)
Start with the governance classification already implied by the use case: it touches account data, can mislead customers, and affects outcomes. The control design begins by translating those risks into a layered set of controls that fit an assistive tool where the agent remains the decision-maker.
Step-by-step control design:
- Prevent: Constrain the assistant to approved knowledge and safe data handling. Use retrieval from vetted policy content, redact sensitive data before prompts/logging, and enforce retention rules for transcripts and model interactions. Add UX constraints: the assistant drafts, but the agent must confirm before sending, and the UI clearly marks text as machine-generated.
- Prevent: Add “no-go zones” aligned to policy: if the model drifts into restricted advice or binding commitments, block the response or route it to scripted flows. This is prevention tied directly to decision impact, not generic “safety wording.”
- Detect: Monitor for failure modes that matter: complaint tags like “misleading info,” spikes in escalation, and override rates where agents delete drafts. Include periodic sampling targeted at known high-risk topics (fees, eligibility, adverse actions), not random sampling that misses concentrated harms.
- Correct: Establish rapid rollback and root-cause routines: disable the feature or restrict topics if thresholds are breached, then update safeguards (retrieval corpus, blocklists/routing, agent guidance). Second line reviews whether the revised controls are sufficient and documents any accepted residual risk.
Impact, benefits, limitations:
- Benefit: the bank ships productivity gains while keeping humans as the final decision authority and producing evidence (constraints, monitoring, incident records) that committees can review.
- Limitation: operational load is real—sampling, triage, and threshold tuning require sustained ownership. If monitoring becomes “best effort,” the system quietly accumulates risk as products and policies change.
### Example 2: Retailer dynamic pricing recommendations (impactful, constrained autonomy)
Even though pricing feels “internal,” the governance context treats it as impactful because of consumer protection, reputation, and fairness perceptions. Control design focuses on preventing runaway automation, detecting anomalies early, and correcting fast when data feeds or market shocks break assumptions.
Step-by-step control design:
- Prevent: Set autonomy caps by category. Low-risk categories may allow automated execution within strict bounds (e.g., capped percentage changes), while sensitive categories require approval from a pricing manager. This is “govern the decision”: the same model can be used differently depending on impact.
- Prevent: Require reason codes so humans can challenge recommendations: drivers like inventory signals, seasonality, or competitor index. If the system cannot produce a coherent reason code (or confidence falls below a threshold), route to manual review rather than auto-apply.
- Detect: Monitor governance-relevant signals: frequency of cap hits (system constantly tries to exceed limits), human override rates, and sudden regional anomalies that might indicate data feed errors. Pair those with business outcomes like margin swings and stockouts to catch “technically fine” models that are operationally harmful.
- Correct: Implement rollback and containment: revert to last-known-good pricing rules, freeze automated execution for affected categories, and run a root-cause review that updates controls (tighter caps, better feed validation, revised thresholds). Escalate residual risk decisions when leadership wants to keep more autonomy despite recurring anomalies.
Impact, benefits, limitations:
- Benefit: speed where safe, friction where consequences justify it. Teams can scale automation without betting the brand on perfect data.
- Limitation: overly conservative caps can erase value, while overly permissive autonomy turns minor feed errors into major customer-facing incidents. Control design is a tuning exercise, not a one-time checkbox.
## Closing: the control design mindset you can reuse
Control design is how governance stops being meetings and documents and becomes an operating system for safe, scalable AI. The pattern is simple, but the execution must be specific: prevent what you can, detect what you can’t fully prevent, and correct quickly with durable fixes and traceable decisions.
Key takeaways:
- Controls must match the decision context: impact, sensitivity, and autonomy determine how strong and how layered your controls should be.
- Evidence beats intent: a control is only real if it produces something reviewable—configurations, thresholds, logs, approvals, incident records.
- Layering wins: prevention reduces probability, detection reduces time-to-notice, correction reduces time-to-recover and recurrence.
Now that the foundation is in place, we'll move into Monitoring, Auditability & Evidence [30 minutes].