When AI delivery scales—and work stops fitting inside one team
A retailer launches a GenAI support copilot, a forecasting model, and an invoice-matching automation across three business units. Each initiative has a capable team, a motivated sponsor, and a deadline. Within weeks, the friction shows up in familiar ways: product teams want fast iterations, security wants consistent controls, data owners want fewer ad-hoc requests, and engineers want shared tooling so they’re not rebuilding pipelines from scratch.
The problem isn’t talent or intent—it’s that AI work crosses boundaries by default. Models and prompts change after launch, data access must be governed continuously, and incident response can’t be “someone else’s job.” Without a clear way of working, organisations either centralise everything (and create queues) or decentralise everything (and create fragmentation and uneven risk).
This lesson makes the “ways of working” concrete: how product teams, platform teams, and squads collaborate to deliver AI outcomes repeatedly, with speed and control.
The basic vocabulary: product, platform, and squads (and why decision rights matter here)
A product is a customer- or employee-facing capability with outcomes, adoption, and ongoing operations. In AI, the “product” often includes model behaviour, UI/UX, workflow integration, evaluation, monitoring, and change management. A platform is shared infrastructure and standards that make delivery teams faster and safer—think “paved roads” like logging defaults, model/prompt registries, evaluation templates, deployment patterns, and incident tooling.
A squad is a durable, cross-functional delivery unit aligned to an outcome. In practice, squads often sit in a domain (support, operations, fraud) and include a product owner, engineering, data/ML, and access to risk and security partners. The point of squads isn’t org-chart aesthetics; it’s reducing handoffs so the people who build can also run and improve what they ship.
Decision rights are the connective tissue that keeps these ways of working from turning into governance-by-meeting. The earlier framing—guardrails that scale with risk—only works if teams know who approves data access, what counts as a material change (prompt edits, new retrieval sources, model swaps), who can deploy, and who can roll back. In other words: ways of working are where operating model patterns become daily execution.
Two grounding principles keep this practical. First, centralise what must be consistent (risk policy, non-negotiable controls, shared telemetry), and decentralise what must be fast and contextual (domain UX, workflow adoption, local iteration). Second, make the compliant path the easiest path: platform defaults plus lightweight, auditable artifacts (evaluation reports, change logs, model/prompt cards) reduce both risk and bureaucracy.
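To make tiered sign-offs and default artifacts concrete, here is a minimal sketch in Python. The tier names, artifact lists, and approver roles are illustrative assumptions, not a standard; in practice the tiers come from enterprise risk policy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TierPolicy:
    """What a risk tier demands before a change ships (illustrative)."""
    required_artifacts: tuple[str, ...]   # evidence the platform captures by default
    required_approvers: tuple[str, ...]   # roles whose sign-off is needed
    reevaluate_on_material_change: bool

# Hypothetical three-tier scheme for illustration only.
RISK_TIERS: dict[str, TierPolicy] = {
    "low": TierPolicy(("change_log",), ("tech_lead",), False),
    "medium": TierPolicy(("change_log", "evaluation_report"),
                         ("tech_lead", "product_owner"), True),
    "high": TierPolicy(("change_log", "evaluation_report", "model_card"),
                       ("tech_lead", "product_owner", "risk_owner"), True),
}

def missing_evidence(tier: str, captured: set[str]) -> list[str]:
    """List the artifacts a team still owes for its risk tier."""
    return [a for a in RISK_TIERS[tier].required_artifacts if a not in captured]

print(missing_evidence("high", {"change_log"}))
# ['evaluation_report', 'model_card']
```

When the platform captures these artifacts by default, the check above is a formality rather than a queue.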
Three ways of working—and how to choose without creating bottlenecks
Product-led delivery: outcomes first, with AI as a living feature
In a product-led way of working, the primary unit of accountability is the product team: it owns the business outcome, the user experience, and the “run” responsibilities after launch. This is essential for avoiding the common failure mode of “central builds, local suffers,” where a central AI team ships something but nobody owns adoption, monitoring, or incident response. AI capabilities only create value when they are embedded into workflows; product-led delivery ensures someone is accountable for rollout pacing, training, and the real-world metrics that matter (handle time, conversion, defect rate), not just model metrics.
Product-led does not mean “anything goes.” The product team works inside enterprise guardrails: approved vendors, minimum logging and monitoring, defined evaluation readiness, and tiered sign-offs based on risk. A practical split: product teams decide what “good” means for the workflow and adoption; risk owners decide which failure modes are unacceptable and what fallback is mandatory; technical leads decide how to implement (model choice, retrieval strategy, prompt patterns) as long as they meet the non-negotiables.
The biggest misconception in product-led AI is thinking the work ends at launch. AI systems are live: data drifts, user behaviour changes, vendor models update, and prompt edits can materially change outputs. So product-led ways of working must include operational routines: monitoring dashboards with thresholds, on-call or escalation paths, and explicit change control for “material changes.” Without these, iteration speed becomes silent risk accumulation, especially in GenAI where small prompt or retrieval tweaks can shift behaviour dramatically.
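As a sketch of what “monitoring dashboards with thresholds” means operationally, the snippet below ties each metric to a runbook action. The metric names, limits, and actions are assumptions for illustration; real thresholds derive from the evaluation baseline agreed at launch.

```python
from dataclasses import dataclass

@dataclass
class Threshold:
    metric: str
    max_value: float
    action: str  # the runbook step triggered when the limit is breached

# Illustrative limits; revisit them whenever the evaluation baseline moves.
THRESHOLDS = [
    Threshold("hallucination_rate",  max_value=0.02, action="page_service_owner"),
    Threshold("pii_leak_rate",       max_value=0.0,  action="disable_feature"),
    Threshold("p95_latency_seconds", max_value=4.0,  action="open_ticket"),
]

def evaluate(window_metrics: dict[str, float]) -> list[tuple[str, str]]:
    """Compare the latest monitoring window to the thresholds and return
    (metric, action) pairs so escalation is explicit, not improvised."""
    return [(t.metric, t.action)
            for t in THRESHOLDS
            if window_metrics.get(t.metric, 0.0) > t.max_value]

print(evaluate({"hallucination_rate": 0.05, "p95_latency_seconds": 2.1}))
# [('hallucination_rate', 'page_service_owner')]
```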
Common pitfalls show up when product-led is adopted without platform support. Teams create their own evaluation scripts, logging formats, and deployment practices; results become incomparable, incidents are harder to triage, and audits become painful. The fix is not to pull work back into a central queue, but to standardise the minimum evidence and telemetry—so autonomy exists inside a consistent system.
Platform-led enablement: paved roads, not a central delivery factory
A platform-led way of working treats shared capabilities as a product in their own right. The platform team’s job is to reduce cognitive load and rework for delivery teams by providing standard ingestion patterns, deployment templates, evaluation harnesses, model/prompt registries, and incident tooling. When this is done well, delivery teams move faster because they are not reinventing pipelines—and risk is better controlled because evidence capture and guardrails are built-in.
The key design choice is decision rights: platform teams should own non-negotiables (logging schema, minimum monitoring, approved deployment patterns, required artifacts like change logs and evaluation reports). Delivery teams should own implementation decisions inside that paved road. Exceptions must exist, but they should be explicit: time-bound approvals, documented rationale, and a plan to return to paved roads. That prevents “temporary” deviations from becoming permanent fragmentation.
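A minimal sketch of a time-bound exception record follows, assuming hypothetical field names; the point is that every deviation carries an owner, a rationale, an expiry, and a route back to the paved road.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class PavedRoadException:
    """A deviation from platform standards that cannot silently persist."""
    team: str
    control_bypassed: str
    rationale: str
    approved_by: str
    expires_on: date
    remediation_plan: str

    def is_expired(self, today: date | None = None) -> bool:
        return (today or date.today()) > self.expires_on

# Illustrative example, not real data:
exc = PavedRoadException(
    team="claims-squad",
    control_bypassed="standard deployment template",
    rationale="legacy scheduler cannot run the template until the Q3 upgrade",
    approved_by="platform_lead",
    expires_on=date(2025, 9, 30),
    remediation_plan="migrate to the paved-road template after the scheduler upgrade",
)
print(exc.is_expired(today=date(2025, 10, 1)))  # True: escalate or re-approve
```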
A frequent misconception is that “platform” means “central approval.” In reality, the platform is most effective when it removes the need for approvals by making the safe path automatic. For example, if the platform standardises redaction before external calls, enforces model/prompt version tracking, and provides an evaluation template aligned to risk tiers, teams can ship quickly without re-litigating the same controls. Platform is how governance becomes operational rather than a document or committee.
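The snippet below sketches how a platform can make redaction automatic rather than optional. The regex patterns and the `call_external_model` wrapper are simplified stand-ins; production redaction would use a vetted PII-detection service and the approved vendor client.

```python
import re

# Illustrative patterns only; a real platform would not rely on regexes alone.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with typed placeholders before any external call."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def call_external_model(prompt: str, send) -> str:
    """Paved-road wrapper: redaction happens here, so delivery teams cannot
    forget it. `send` stands in for whatever vendor client is approved."""
    return send(redact(prompt))

# Demo with a stand-in transport that just echoes the outbound payload:
print(call_external_model(
    "Contact jane@example.com about card 4111 1111 1111 1111",
    send=lambda payload: payload,
))
# Contact [email] about card [card_number]
```

Because the wrapper owns the control, there is nothing for delivery teams to approve or remember; the safe path is the only path.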
The common pitfall is building a platform that is hard to use, slow to evolve, or disconnected from product realities. When platform adoption is painful, teams bypass it—and the organisation ends up with tool sprawl and uneven controls. The platform team must behave like a service provider with clear SLAs, usability focus, and close feedback loops with squads. Otherwise, “platform-led” becomes “platform says no,” and the organisation drifts back to either shadow IT or central bottlenecks.
Squads and the hub-and-spoke rhythm: autonomy with standards
Squads are how many organisations operationalise a federated or hybrid operating model: a small central “hub” sets standards and enables, while domain “spokes” (squads) deliver. The hub often includes platform, architecture, risk policy, and shared AI specialists; squads sit close to the business processes and own day-to-day delivery. This arrangement aligns with the earlier idea that decision rights should be closest to impact unless enterprise risk forces centralisation.
What makes squads work is not just cross-functionality, but a repeating cadence tied to the AI lifecycle: ideation and use-case selection, data access approvals, evaluation readiness, deployment with change records, monitoring with runbooks, incident response, and retirement. Squads need explicit service ownership so incidents don’t turn into blame loops across product, data, security, and legal. They also need a clear definition of “material change” so iteration remains fast without becoming uncontrolled change in production.
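One way to make that cadence auditable is to record, for each lifecycle moment, exactly one accountable decider and the evidence the decision leaves behind. The roles and artifact names below are illustrative assumptions.

```python
# Lifecycle moments mapped to a single accountable decider and the evidence
# the decision leaves behind. Role and artifact names are illustrative.
LIFECYCLE_DECISIONS = {
    "use_case_selection":   {"decider": "product_owner", "evidence": "intake_record"},
    "data_access":          {"decider": "data_owner",    "evidence": "access_approval"},
    "evaluation_readiness": {"decider": "risk_owner",    "evidence": "evaluation_report"},
    "deployment":           {"decider": "tech_lead",     "evidence": "change_record"},
    "material_change":      {"decider": "risk_owner",    "evidence": "re_evaluation"},
    "incident_response":    {"decider": "service_owner", "evidence": "postmortem"},
    "retirement":           {"decider": "product_owner", "evidence": "decommission_note"},
}

def who_decides(moment: str) -> str:
    entry = LIFECYCLE_DECISIONS.get(moment)
    if entry is None:
        raise ValueError(f"no decision right defined for '{moment}'; "
                         "close the gap rather than improvising under pressure")
    return entry["decider"]

print(who_decides("material_change"))  # risk_owner
```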
A helpful way to think about squads is: they own the “thin waist” between business and technology. They translate desired outcomes into working systems, and they keep those systems healthy over time. The hub’s role is to make that translation repeatable by providing templates, non-negotiables, and shared evidence capture. This is how you avoid the extremes: “everyone can ship anything” versus “a committee must approve everything.”
The biggest pitfall in squads is ambiguity at the seams—especially between platform and product teams. If squads assume the platform owns operations, you get orphaned services. If the platform assumes squads will implement controls, you get uneven governance. The fix is a crisp contract: platform provides paved roads and guardrails; squads own outcomes, integration, and run. Risk and security partners embed through tiered engagement: lightweight for low-risk work, deeper and more formal for high-risk systems.
Comparing the models in practice
| Dimension | Product-led | Platform-led | Squad-based (hub-and-spoke) |
|---|---|---|---|
| Primary goal | Ship and improve a specific capability with strong adoption and operational ownership. | Make delivery faster and safer through reusable “paved roads” and defaults. | Balance domain speed with enterprise consistency through federated delivery and shared standards. |
| Where decisions sit | Product owner and tech lead decide most delivery and iteration choices within guardrails; risk sign-off depends on tier. | Platform team owns non-negotiables (telemetry, registries, deployment patterns); teams decide everything else. | Hub sets policy, minimum controls, and exception process; squads decide domain implementation and run responsibilities. |
| Strengths | Strong alignment to real outcomes; fewer handoffs; clearer ownership for monitoring and incidents. | Reduces duplication; improves auditability via built-in evidence capture; enables consistent incident response tooling. | Scales across multiple domains; avoids central bottlenecks while preventing fragmentation with standards. |
| Failure modes | Tool sprawl and inconsistent controls if guardrails/platform are weak; “done at launch” thinking. | “Platform says no” bottleneck; low adoption if platform is slow or hard to use. | Seam confusion; uneven risk execution; committees appear when decision rights are unclear. |
| Best fit | When workflow integration and adoption are the main challenge, and fast iteration is needed. | When many teams need shared capabilities (logging, eval, deployment), and consistency/auditability matter. | When scaling across business units with varied maturity and risk profiles. |
[[flowchart-placeholder]]
Applied example 1: Retail bank support copilot—product, platform, and risk working as one system
A retail bank builds a contact-center GenAI assistant that summarises calls, drafts responses, and suggests next-best actions. The bank chooses a federated delivery approach: a central enabling hub plus a contact-center squad that owns the product. The first step is to define the non-negotiables centrally: approved model providers, redaction rules, required logging, and a standard evaluation suite (groundedness checks, toxicity screening, privacy leakage tests). This prevents each squad from renegotiating vendor and control choices and ensures evidence is comparable across use cases.
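A sketch of what such a standard evaluation suite might look like follows, with crude heuristics standing in for real groundedness, toxicity, and privacy-leakage evaluators; the function names, word lists, and thresholds are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    response: str
    source_passages: list[str]  # retrieval context the answer must be grounded in

def grounded(case: EvalCase) -> bool:
    """Crude proxy: the response should reuse words from the retrieved passages."""
    source_words = set(" ".join(case.source_passages).lower().split())
    response_words = set(case.response.lower().split())
    return len(response_words & source_words) / max(len(response_words), 1) >= 0.5

BANNED = {"idiot", "stupid"}  # stand-in for a real toxicity classifier

def non_toxic(case: EvalCase) -> bool:
    return not (BANNED & set(case.response.lower().split()))

def no_privacy_leak(case: EvalCase) -> bool:
    """Stand-in: flag anything that looks like an email address."""
    return "@" not in case.response

CHECKS: dict[str, Callable[[EvalCase], bool]] = {
    "groundedness": grounded,
    "toxicity": non_toxic,
    "privacy_leakage": no_privacy_leak,
}

def run_suite(cases: list[EvalCase]) -> dict[str, float]:
    """Pass rate per check; the output becomes the auditable evaluation report."""
    return {name: sum(check(c) for c in cases) / len(cases)
            for name, check in CHECKS.items()}

demo = [EvalCase(
    prompt="What is the overdraft fee?",
    response="The overdraft fee is 15 GBP per occurrence.",
    source_passages=["Overdraft fee: 15 GBP per occurrence, capped monthly."],
)]
print(run_suite(demo))
# {'groundedness': 1.0, 'toxicity': 1.0, 'privacy_leakage': 1.0}
```

Because every squad runs the same suite, a pass rate in one use case means the same thing in another, which is what makes evidence comparable across the portfolio.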
Next, the contact-center squad handles product realities: team workflows, UI, training, and adoption metrics like average handle time and escalation rate. The product owner defines “good” in business terms and controls rollout pacing. The delivery tech lead selects retrieval strategy and prompt structure, but stays inside guardrails (for example, no unredacted PII sent to external endpoints). The risk owner defines unacceptable failure modes—hallucinated financial advice, leakage of sensitive content—and requires human fallback for specific actions.
Then change control makes iteration safe. The bank explicitly classifies “material changes”: prompt edits that affect advice language or grounding sources require re-evaluation and risk sign-off; cosmetic formatting changes do not. When a prompt update is proposed to improve tone, the squad runs the standard evaluation suite, logs the change in the registry, and ships behind a staged rollout. The impact is speed with control: weekly iteration remains possible, incidents have a clear service owner, and the enterprise can audit decisions without turning every change into a meeting. The limitation is coordination overhead, which the bank manages by keeping artifacts lightweight and partially automated through platform defaults.
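The flow reads naturally as a small decision function. The trigger tags and rollout percentages below are illustrative; the actual definition of “material” is whatever the squad agreed with risk.

```python
from enum import Enum

class ChangeKind(Enum):
    COSMETIC = "cosmetic"   # formatting, typo fixes in UI copy
    MATERIAL = "material"   # anything that can shift model behaviour

# Illustrative trigger tags; the real list is agreed with the risk owner.
MATERIAL_TRIGGERS = {
    "prompt_wording_affecting_advice",
    "new_retrieval_source",
    "model_version_swap",
    "decoding_parameter_change",
}

def classify(change_tags: set[str]) -> ChangeKind:
    return ChangeKind.MATERIAL if change_tags & MATERIAL_TRIGGERS else ChangeKind.COSMETIC

def ship(change_tags: set[str], eval_passed: bool, risk_signed_off: bool) -> str:
    """Gate mirroring the bank's flow: material changes need re-evaluation and
    sign-off, then roll out in stages instead of all at once."""
    if classify(change_tags) is ChangeKind.COSMETIC:
        return "deploy directly"
    if not eval_passed:
        return "blocked: run the standard evaluation suite"
    if not risk_signed_off:
        return "blocked: awaiting risk sign-off"
    return "staged rollout: 5% -> 25% -> 100%, monitoring between stages"

print(ship({"new_retrieval_source"}, eval_passed=True, risk_signed_off=False))
# blocked: awaiting risk sign-off
```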
Applied example 2: Predictive maintenance in 40 plants—platform leverage without local chaos
A manufacturer deploys predictive maintenance models across 40 plants, each with different sensors, equipment, and data quality. The organisation sets up an AI platform team to own the paved road: standard ingestion patterns, model registry usage, monitoring defaults, and deployment templates. The platform team holds decision rights for the non-negotiables—logging format, minimum dashboards, incident tooling—so performance and reliability can be compared across plants and incidents can be triaged consistently.
Regional analytics squads then tailor models to local conditions. They own feature choices, model selection, and integration with maintenance workflows, but must meet minimum evidence requirements before deployment: documented evaluation results, data quality thresholds, and rollback plans. Data decision rights are intentionally split: local plant data owners approve operational datasets; enterprise security sets access rules and retention; a governance function resolves semantic disputes so “failure event” and related metrics mean the same thing across sites.
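A sketch of that pre-deployment gate, assuming hypothetical field names and an illustrative data-quality threshold set by the hub:

```python
from dataclasses import dataclass

@dataclass
class DeploymentRequest:
    plant: str
    eval_report_uri: str | None   # documented evaluation results
    data_quality_score: float     # output of the standard ingestion checks
    rollback_plan_uri: str | None

MIN_DATA_QUALITY = 0.9  # illustrative minimum set by the hub

def gate(req: DeploymentRequest) -> list[str]:
    """Return blocking reasons; an empty list means the paved road is satisfied."""
    blockers = []
    if not req.eval_report_uri:
        blockers.append("missing evaluation report")
    if req.data_quality_score < MIN_DATA_QUALITY:
        blockers.append(f"data quality {req.data_quality_score:.2f} below {MIN_DATA_QUALITY}")
    if not req.rollback_plan_uri:
        blockers.append("missing rollback plan")
    return blockers

print(gate(DeploymentRequest("plant-07", "s3://evidence/eval-123", 0.84, None)))
# ['data quality 0.84 below 0.9', 'missing rollback plan']
```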
When one plant’s sensor feed becomes unstable, monitoring triggers an incident. The local service owner has authority to degrade gracefully: disable automated scheduling suggestions or switch to conservative thresholds while the issue is diagnosed. The platform team supports by providing standard telemetry, alerting, and runbook templates so response is fast and consistent. The benefit is reuse: build once, deploy many, and leadership gets comparable indicators across plants. The limitation is adoption risk: if the platform is slow or hard to use, plants bypass it. The organisation mitigates this by allowing exceptions only with documented rationale and time-bound remediation—preventing permanent fragmentation.
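The degradation policy itself can be as simple as a mode switch owned by the local service owner. The cutoff values and mode names below are illustrative assumptions.

```python
from enum import Enum

class Mode(Enum):
    NORMAL = "normal"                    # full automated suggestions
    CONSERVATIVE = "conservative"        # wider thresholds, fewer suggestions
    SUGGESTIONS_OFF = "suggestions_off"  # monitoring only, humans schedule

def choose_mode(sensor_uptime: float, drift_score: float) -> Mode:
    """Local degradation policy; cutoffs are illustrative. The decision needs
    no central approval: the service owner has standing authority to degrade,
    and the platform telemetry makes the switch visible to everyone."""
    if sensor_uptime < 0.5 or drift_score > 0.8:
        return Mode.SUGGESTIONS_OFF
    if sensor_uptime < 0.9 or drift_score > 0.4:
        return Mode.CONSERVATIVE
    return Mode.NORMAL

print(choose_mode(sensor_uptime=0.72, drift_score=0.55))  # Mode.CONSERVATIVE
```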
Making it real: a few operating rules teams can actually follow
A way of working becomes durable when it is easy to explain and hard to misunderstand. Three practical rules show up repeatedly in organisations that scale AI without freezing delivery.
First, tie ways of working to lifecycle moments, not org charts. Data access, evaluation readiness, deployment, monitoring, material change control, and retirement are where confusion becomes risk. If you define “who decides what” at those moments, teams stop improvising governance under pressure.
Second, separate guardrails from design freedom. Guardrails are non-negotiables: approved vendors, minimum logging, incident response process, standard evidence artifacts, and tiered sign-offs. Design freedom is everything inside the guardrails: model choice, prompt strategy, UX, rollout mechanics. This combination avoids both committee-default and autonomy-without-standards.
Third, make ownership explicit after launch. Every AI capability needs a named service owner with authority to roll back, disable features, and run incidents. Without that, the organisation may ship quickly but becomes slow and risky the moment behaviour changes in production—which it will.
Now that the foundation is in place, we'll move into Portfolio & Funding for AI Outcomes [25 minutes].