When “top 5 use cases” turns into a traffic jam
A leadership team approves a set of promising AI initiatives: a genAI support assistant, automated invoice matching, churn prediction, and a fraud model refresh. Everyone leaves the meeting aligned—until execution starts. The same data engineers are pulled into every project, the legal team becomes a queue, model deployments wait on logging decisions, and teams discover late that two “separate” use cases need the same customer identity table that doesn’t yet exist.
This is where many AI strategies quietly fail: not because the use cases were wrong, but because the roadmap was a list, not a plan. In AI portfolios, execution is constrained by shared dependencies (data, integration, governance controls) and the work is socio-technical (workflow adoption, monitoring, accountability), not just model building.
Roadmap design matters now because AI is easier to start than to operate safely at scale. A strong roadmap turns prioritised initiatives into an executable sequence that grows capability over time, prevents avoidable rework, and keeps risk controls “built in” rather than bolted on.
The roadmap vocabulary that keeps portfolios honest
A roadmap is not a Gantt chart; it’s a decision tool that explains why this, why now, and what must be true first. To make it practical, you need a few shared terms.
Horizons are time-and-capability bands that describe how the organisation will deliver value while building the foundations for harder work. A useful mental model is: early horizons prioritise speed-to-learning and repeatability, later horizons unlock bigger bets that require deeper data, integration, and governance maturity. Horizons are not just “near/mid/far”; they are a commitment to what kind of work is realistic given operating constraints.
Dependencies are prerequisites without which a use case cannot be delivered or governed responsibly. In AI, dependencies aren’t only technical. They include data rights, workflow instrumentation, integration paths, monitoring/on-call ownership, and risk controls like logging, redaction, human oversight, and auditability. If these aren’t mapped early, teams end up “discovering governance” during rollout—when changes are most expensive.
This lesson connects directly to the earlier prioritisation lens of value, feasibility, and risk. Prioritisation ranks initiatives; roadmap design turns those rankings into an execution sequence that respects feasibility constraints and embeds risk controls from day one. The outcome-chain mindset also carries over: each horizon should deliver measurable operational and business outcomes, with counter-metrics that prevent “activity as impact.”
Designing horizons that build capability (not just deliver projects)
A practical horizon model for AI roadmaps uses three layers: H1 (prove and standardise), H2 (scale and reuse), H3 (transform and differentiate). What matters is not the labels, but the logic: each horizon should create assets—data pipelines, monitoring patterns, governance routines, change management muscle—that make the next horizon cheaper and safer.
In H1, you target initiatives with high feasibility and controlled risk that can ship into production and be measured cleanly. The goal isn’t “quick demos”; it’s repeatable delivery with decision-grade metrics: baseline, comparison method, and counter-metrics. H1 should also establish minimum governance instrumentation: decision logging, incident tracking, model/data health monitoring, and named operational ownership. If you can’t monitor, you can’t scale; if you can’t measure, you can’t prioritise intelligently later.
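To make “decision-grade metrics” and minimum governance instrumentation concrete, here is a minimal sketch in Python (all names are illustrative, not a prescribed tool) of an H1 readiness check that surfaces missing baselines, counter-metrics, or ownership before go-live:

```python
from dataclasses import dataclass

@dataclass
class MetricSpec:
    """One outcome metric plus the evidence needed to trust it."""
    name: str
    baseline: float | None       # measured pre-launch value, not a guess
    comparison_method: str       # e.g. "A/B test", "pre/post with control team"
    counter_metrics: list[str]   # metrics that catch "activity as impact"

@dataclass
class H1LaunchChecklist:
    """Minimum instrumentation an H1 initiative must have before go-live."""
    initiative: str
    metrics: list[MetricSpec]
    decision_logging: bool
    incident_tracking: bool
    health_monitoring: bool
    operational_owner: str | None  # a named person or team, not "TBD"

    def gaps(self) -> list[str]:
        """Return the reasons this initiative is not ready to ship."""
        issues = []
        for m in self.metrics:
            if m.baseline is None:
                issues.append(f"{m.name}: no measured baseline")
            if not m.counter_metrics:
                issues.append(f"{m.name}: no counter-metrics defined")
        if not (self.decision_logging and self.incident_tracking
                and self.health_monitoring):
            issues.append("governance instrumentation incomplete")
        if not self.operational_owner:
            issues.append("no named operational owner")
        return issues

# Illustrative usage: an initiative with two readiness gaps.
checklist = H1LaunchChecklist(
    initiative="support-assistant",
    metrics=[MetricSpec("time_to_first_response", baseline=None,
                        comparison_method="pre/post with control team",
                        counter_metrics=["repeat_contact_7d"])],
    decision_logging=True, incident_tracking=True,
    health_monitoring=False, operational_owner=None,
)
print(checklist.gaps())
```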
In H2, you scale the patterns that worked and invest in reusable components. This is where a portfolio stops being a set of bespoke builds and starts compounding. Typical H2 work includes shared feature stores or semantic layers, standard evaluation harnesses for genAI, platform integrations, and routine risk reviews that don’t require heroics each time. Importantly, H2 is where you reduce the “hidden tax” that prioritisation surfaced: the same scarce skills (data engineering, security review, legal) become less of a bottleneck when controls and templates are standardised.
In H3, you tackle initiatives that were high value but previously gated by dependencies or risk-control cost—like models embedded in core decisioning, end-to-end workflow redesign, or cross-domain AI products. The misconception here is that H3 is only about time; it’s about maturity. If H1 and H2 didn’t build strong measurement integrity, monitoring, and accountable ownership, H3 efforts often produce the most impressive pilots—and the most painful operational surprises.
A helpful analogy is building a factory line, not a single prototype. H1 proves you can build and run one station safely; H2 standardises parts and quality checks; H3 adds high-throughput automation where failures are expensive.
Dependency mapping: turning “feels blocked” into explicit prerequisites
Dependencies become manageable when you treat them as first-class roadmap objects, not as footnotes. In AI work, the portfolio stalls when dependencies remain implicit: “data might be available,” “legal will review,” “we’ll add monitoring later,” or “ops will adopt it somehow.” A dependency map makes these assumptions testable.
Start by categorising dependencies into a small set you can actually govern:
- Data dependencies: access rights, quality, timeliness, lineage, and whether you can define a stable unit of analysis for measurement. This ties directly to the earlier metric discipline: without baselines and proper logging, you can’t prove causality.
- Workflow dependencies: where the AI output lands, who acts on it, and what adoption looks like (acceptance rate, overrides, rework). Many AI initiatives “work” technically but fail because the workflow is not instrumented to show whether behavior changed.
- Integration dependencies: APIs, latency constraints, identity resolution, ticketing/CRM hooks, and the path to production support. Integration determines feasibility far more than most early scoring discussions admit.
- Governance/control dependencies: logging, human oversight rules, redaction, retrieval constraints for genAI, auditability, and incident response. Risk cannot be “added later” if the controls require design-time decisions.
- Operating model dependencies: named owners, on-call support, model versioning, retraining triggers, and budget for ongoing evaluation. If nobody owns the model in production, feasibility was overstated.
Best practice is to express each dependency as a “must be true” statement with an owner and evidence. For example: “We have the right to use customer emails for retrieval,” “All AI-assisted responses are logged with citations,” or “Override reasons are captured in the underwriting UI.” This removes ambiguity and prevents the common pitfall of treating governance as a gate at the end of delivery.
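As a sketch of how a team might keep such a register testable, the snippet below (Python; all names and entries are illustrative) stores each “must be true” statement with its owner and treats missing evidence as a block:

```python
from dataclasses import dataclass

@dataclass
class Dependency:
    """A 'must be true' prerequisite, expressed so it can be tested."""
    statement: str   # e.g. "We have the right to use customer emails for retrieval"
    category: str    # data | workflow | integration | governance | operating_model
    owner: str       # a named person/team accountable for making it true
    evidence: str | None = None  # link or artefact proving it; None = unresolved

def unresolved(deps: list[Dependency]) -> list[Dependency]:
    """Dependencies still blocking delivery: no evidence means not true yet."""
    return [d for d in deps if d.evidence is None]

# Example entries taken from the lesson text:
deps = [
    Dependency("We have the right to use customer emails for retrieval",
               "data", "Legal (privacy counsel)"),
    Dependency("All AI-assisted responses are logged with citations",
               "governance", "Platform team", evidence="logging-design-doc"),
    Dependency("Override reasons are captured in the underwriting UI",
               "workflow", "Underwriting ops"),
]
for d in unresolved(deps):
    print(f"BLOCKED [{d.category}] {d.statement} -> owner: {d.owner}")
```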
A typical misconception is that dependency mapping equals “more bureaucracy.” In reality, it reduces bureaucracy by preventing late-stage rework and avoiding meetings that exist only to resolve surprises. When dependencies are explicit, you can sequence work rationally: build shared logging once, unblock multiple use cases; resolve data rights early, avoid pausing rollouts midstream.
A dependency map you can use in exec conversations
| Dependency type | What it looks like in practice | Signals it’s missing (common stall points) | Roadmap move |
|---|---|---|---|
| Data rights & access | Clear permission to use data; access paths documented; PII handling defined; lineage understood enough for audit. | Legal review becomes a queue; teams build with synthetic/limited data; late discovery of prohibited fields. | Treat as a gate: resolve in H1 for repeatable patterns; don’t start H2/H3 use cases that need it. |
| Workflow instrumentation | You can observe behavior change: acceptance, overrides, rework, escalations, time saved, error rates. | “We shipped it but can’t prove impact”; adoption disputes; counter-metrics missing. | Add instrumentation as a dependency deliverable, not “nice to have.” |
| Integration & production support | Known integration path; latency and failure modes defined; on-call ownership and incident process set. | Pilots stuck in sandboxes; brittle manual steps; outages create distrust. | Sequence integration foundations early, or keep scope constrained until done. |
| Governance controls | Logging, audit trail, redaction, human oversight, evaluation/monitoring plan ready from day one. | Risk discovered after pilot; inability to explain decisions; model misuse. | Build control “templates” in H1/H2 so H3 isn’t reinventing governance. |
| Operating model ownership | Clear RACI for model, data, and workflow; retraining and rollback process; run costs planned. | “No one owns it”; monitoring ignored; gradual degradation. | Make ownership a go/no-go criterion for scale. |
[[flowchart-placeholder]]
Sequencing the roadmap: gates, parallel tracks, and avoiding shared bottlenecks
Once horizons and dependencies are explicit, sequencing becomes an exercise in constraint management. The main failure mode is selecting the “top initiatives” and starting them simultaneously even though they share the same scarce resources: data engineering, legal, security, or platform deployment capacity. A high-scoring portfolio can still be operationally impossible if everything queues behind a single dependency.
A robust approach is to combine gates with parallel tracks. Gates are non-negotiables: if you cannot log decisions, define counter-metrics, or comply with data rights, the use case stays in a pre-delivery state. Parallel tracks allow progress without pretending everything can ship at once. For example, you can run a “data & controls foundation” track alongside a “low-risk production wins” track. The portfolio stays productive while the organisation removes the real bottlenecks.
Sequencing also benefits from designing for capability compounding. If two use cases require the same retrieval layer, entity resolution, or monitoring pattern, sequence the roadmap so the first project builds the reusable component explicitly, not as an accidental byproduct. This is where roadmaps become strategic: you aren’t only delivering use cases; you’re building the organisation’s ability to deliver the next ten.
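One way to see how gates and shared-resource caps interact: the sketch below (hypothetical initiative names and capacities) starts an initiative only when its gate dependencies are resolved and none of the constrained functions it needs is already at its in-flight cap:

```python
# Hypothetical sequencing sketch: dependency gates plus in-flight caps
# per constrained function (data engineering, security, legal, ...).
initiatives = [
    # (name, unresolved gate dependencies, constrained functions it consumes)
    ("support-assistant", set(),                 {"data_eng", "security"}),
    ("invoice-matching",  set(),                 {"data_eng"}),
    ("churn-prediction",  {"customer-identity"}, {"data_eng", "legal"}),
    ("fraud-refresh",     {"customer-identity"}, {"security", "legal"}),
]
caps = {"data_eng": 1, "security": 1, "legal": 1}  # max in-flight per function

in_flight = {fn: 0 for fn in caps}
started, waiting = [], []

for name, gates, functions in initiatives:
    over_cap = any(in_flight[fn] + 1 > caps[fn] for fn in functions)
    if gates:
        waiting.append((name, f"gated on {sorted(gates)}"))
    elif over_cap:
        waiting.append((name, "shared function at cap"))
    else:
        started.append(name)
        for fn in functions:
            in_flight[fn] += 1

print("start now:", started)  # ['support-assistant']
print("hold:", waiting)       # invoice-matching queues behind data_eng;
                              # the identity table gates the other two
```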
Common pitfalls show up predictably:
- Pitfall: “Everything is priority 1.”
  - Why it happens: leaders approve value narratives without confronting shared constraints.
  - How to avoid it: cap in-flight initiatives per constrained function; force trade-offs using dependency gates.
- Pitfall: Treating risk controls as end-stage compliance.
  - Why it happens: pilots optimise for demo speed, not operational safety.
  - How to avoid it: tie controls to horizon entry criteria; include control build time in feasibility.
- Pitfall: Confusing platform work with “no value.”
  - Why it happens: foundations don’t produce a single KPI jump immediately.
  - How to avoid it: connect foundation items to the number of use cases unblocked and the measurement/monitoring integrity gained.
Applied example 1: GenAI customer support—sequencing for speed and safety
Imagine you prioritised a genAI assistant that drafts customer support replies and retrieves policy content. The value story is clear: reduce drafting time and improve time-to-first-response. But roadmap design determines whether this becomes a scalable capability or a one-off tool that creates risk.
In H1, scope to a constrained domain (e.g., a subset of ticket categories) and implement the minimum dependencies that make outcomes measurable and risks observable. That means instrumenting acceptance rate, edit distance (how much agents rewrite), escalation rate, and repeat contact within 7 days as counter-metrics. It also means governance controls are designed up front: PII redaction in prompts, retrieval with citations, and logging of the prompt/output plus source documents for audit and incident review. If those pieces aren’t in place, the team may ship faster—but leadership won’t know if speed gains are offset by rework or policy misstatements.
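For illustration, assuming the H1 instrumentation writes one event per AI-drafted reply, the counter-metrics above reduce to simple aggregations; this sketch uses difflib similarity as a rough stand-in for edit distance:

```python
import difflib

# Hypothetical event log: one record per AI-drafted reply, as the
# H1 instrumentation described above would capture it.
events = [
    {"draft": "Your refund was issued on May 2.",
     "sent":  "Your refund was issued on May 2.",
     "accepted": True, "escalated": False, "repeat_within_7d": False},
    {"draft": "Please reset your password via the app.",
     "sent":  "You can reset your password from Settings in the app.",
     "accepted": True, "escalated": False, "repeat_within_7d": True},
    {"draft": "Our policy does not cover this.",
     "sent":  "",  # agent discarded the draft entirely
     "accepted": False, "escalated": True, "repeat_within_7d": False},
]

def edit_similarity(draft: str, sent: str) -> float:
    """Rough proxy for 'how much did the agent rewrite': 1.0 = sent as-is."""
    return difflib.SequenceMatcher(None, draft, sent).ratio()

n = len(events)
acceptance_rate = sum(e["accepted"] for e in events) / n
escalation_rate = sum(e["escalated"] for e in events) / n
repeat_contact_rate = sum(e["repeat_within_7d"] for e in events) / n  # counter-metric
accepted = [e for e in events if e["accepted"]]
avg_similarity = sum(edit_similarity(e["draft"], e["sent"])
                     for e in accepted) / max(len(accepted), 1)

print(f"acceptance: {acceptance_rate:.0%}, escalation: {escalation_rate:.0%}")
print(f"repeat contacts within 7d: {repeat_contact_rate:.0%}")
print(f"avg draft->sent similarity (accepted only): {avg_similarity:.2f}")
```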
In H2, reuse becomes the focus. You standardise evaluation and monitoring patterns: sampled hallucination checks, category-specific refusal rules, and workflow triggers for when the assistant must escalate. You also expand integration robustness—latency thresholds, fallback behavior when retrieval fails, and incident playbooks—so the tool doesn’t become a liability during peak volumes. This is where feasibility improves across the portfolio: other genAI use cases can inherit the same logging, redaction, and evaluation harness.
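A minimal sketch of one such fallback pattern, where retrieve, generate, and escalate are placeholders for your own components: the assistant never answers without citable sources.

```python
# Hypothetical guardrail: answer only when retrieval produced citable
# sources; otherwise fall back to a human queue. Sources are assumed
# here to be dicts with an "id" field.
def answer_or_escalate(query: str, retrieve, generate, escalate,
                       min_sources: int = 1):
    try:
        sources = retrieve(query)
    except TimeoutError:
        return escalate(query, reason="retrieval timeout")
    if len(sources) < min_sources:
        return escalate(query, reason="insufficient grounding")
    draft = generate(query, sources)
    return {"draft": draft, "citations": [s["id"] for s in sources]}
```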
A limitation to call out explicitly is that genAI “quality” is not a single model metric. If the roadmap doesn’t include continuous evaluation and counter-metrics, the organisation risks optimising for faster replies that quietly degrade trust, compliance posture, and customer outcomes. The roadmap’s job is to prevent that trade-off from happening invisibly.
Applied example 2: ML credit decisioning—gating on auditability and operating readiness
Consider an ML model to support credit decisions. The value can be high, but the dependency burden is heavier and the consequences of failure are more severe. A roadmap that treats this like a typical software feature often ends up stuck between proof-of-concept and production.
In H1, the smartest move is usually not full automation. Instead, design a constrained rollout with strong measurement and controls: log every model input and output, capture override rates (how often underwriters reject recommendations), and define leading indicators (like early delinquency proxies) while acknowledging that true loss outcomes lag. Governance dependencies are central: versioning, traceability, and a clear process for adverse-action consistency where relevant. If you cannot produce a decision trail, you are not “not ready yet”—you are fundamentally unable to operate the model within an auditable risk posture.
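A decision trail can start as something very simple; the sketch below (an illustrative schema, not a full lineage system) appends every model input, output, and override to an audit log, from which override rates fall out directly:

```python
import hashlib
import json
import time

def log_decision(model_version: str, features: dict, score: float,
                 recommendation: str, override_reason: str | None,
                 path: str = "decision_trail.jsonl") -> None:
    """Append-only decision trail: every input, output, and human override."""
    record = {
        "ts": time.time(),
        "model_version": model_version,
        "features_hash": hashlib.sha256(
            json.dumps(features, sort_keys=True).encode()).hexdigest(),
        "features": features,  # or a pointer to an immutable feature snapshot
        "score": score,
        "recommendation": recommendation,
        "override_reason": override_reason,  # None = underwriter accepted
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

# Override rate is then a one-line aggregation over the trail:
# records with an override_reason / total records.
```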
In H2, you scale only after operational signals show readiness: stable data pipelines, drift monitoring with defined go/no-go thresholds, and an operating model that includes retraining triggers and rollback procedures. This is where feasibility becomes real: a model that cannot be monitored and maintained is not feasible, regardless of initial performance. You also reduce portfolio friction by standardising reviews and templates for high-impact models, so each new credit-related use case doesn’t restart governance from scratch.
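As one concrete example of a drift gate, the Population Stability Index (PSI) is a common choice for monitoring score or feature distributions; the sketch below implements it with an illustrative go/no-go threshold:

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a recent sample.
    Common rule of thumb: < 0.1 stable, 0.1-0.25 watch, > 0.25 investigate.
    A sketch; production monitoring would handle bin edges, smoothing, etc."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def histogram(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = max(min(int((v - lo) / width), bins - 1), 0)
            counts[idx] += 1
        return [max(c / len(values), 1e-6) for c in counts]  # avoid log(0)

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Go/no-go gate for scaling in H2 (values and threshold are illustrative):
DRIFT_THRESHOLD = 0.25
baseline_scores = [0.2, 0.3, 0.35, 0.4, 0.5, 0.55, 0.6, 0.7]
recent_scores   = [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85]
if psi(baseline_scores, recent_scores) > DRIFT_THRESHOLD:
    print("HOLD: input drift exceeds threshold; trigger review/retraining")
```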
The limitation here is that value evidence arrives slowly. A roadmap must balance leadership patience with rigorous controls: you can’t “wait for perfect data” forever, but you also can’t claim success based on approval rate alone if risk outcomes will change later. The horizon approach helps: deliver measurable progress (decision time, override reduction, calibration stability) while preserving the integrity of long-term outcome evaluation.
The roadmap mindset to carry into execution
A strong AI roadmap does three things at once:
- It sequences work by horizons so early delivery builds the capabilities needed for later, higher-value bets.
- It makes dependencies explicit—data rights, workflow instrumentation, integration, governance controls, and operational ownership—so you don’t discover blockers at the point of launch.
- It treats measurement and risk as design constraints, not after-the-fact reporting, staying consistent with outcome chains, baselines, and counter-metrics.
If you can explain your roadmap as “we will ship these outcomes in H1 while building these reusable dependencies to unlock H2 and H3,” you’re no longer running an AI project list—you’re executing an AI strategy.
Next, we'll build on this by exploring Executive Alignment and Change Impacts [25 minutes].