The Missing Control Layer in AI Systems

Published On: March 23, 2026

Part 2 of a 4-Part Series on AI in Production

In Part 1, I argued that AI doesn’t fail in the demo. It fails the first time you have to trust it.

I got a lot of responses to that post. Most of them weren’t disagreements. They were variations of: “Yes… but I don’t know how to fix that.”

That’s the interesting part. Because once you see the problem, you can’t unsee it.

You build something. It works. You connect it to a couple of tools. Maybe a model or two. Add some logic around it. And for a moment, it feels like you’ve got something real.

Then you ask a simple question: “Why did it do that?”

And the answer is… fuzzy.

A Real Example: Virtual CTO Advisor

The first version of my Virtual CTO Advisor was built as a GPT. It was exactly what you’d expect. Conversational. Responsive. Surprisingly useful.

It felt like magic—because it was.

I could ask it questions about enterprise architecture, strategy, tradeoffs—and it would respond in a way that felt aligned with how I think. From a demo perspective, it was perfect.

But the moment I tried to think about it as a system I could trust, things started to break down.

I couldn’t answer basic questions. Why did it give that answer and not another? What sources influenced it? What data was it allowed to access? What shouldn’t it say?

And more importantly: where do I put control?

There wasn’t a clean answer. Because everything was happening inside the GPT itself.

That question kept bothering me. Not because the system was wrong. But because I couldn’t point to where the decision was actually being made.

Was it the prompt? The model? The system instructions? The retrieval logic?

The answer was: all of them… and none of them.

That’s when it stopped being about my implementation and started looking like a broader pattern.

We’ve built systems where the system decides. The system chooses. The system executes.

And it works. Until you need to control it.

To make that system usable in a real-world context, I had to break it apart. I moved from a single GPT-driven experience to a more explicit, API-driven design. Not because the GPT didn’t work. Because I needed somewhere to put control.

So instead of one system doing everything, I created separation. One component handled input and context. One managed retrieval. One generated responses. One applied constraints and filtering.

That decomposition wasn’t about performance. It was about control.
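As a sketch, that decomposition looks like separate functions with explicit seams between them, so something can be inserted between any two steps. The component names and stubbed logic here are my own illustration, not a specific framework:

```python
# Illustrative decomposition: each stage is its own function with an
# explicit interface. The names and stub bodies are assumptions for
# this sketch, not the actual Virtual CTO Advisor code.

def build_context(user_input: str) -> dict:
    """Normalize input and attach request metadata."""
    return {"query": user_input.strip(), "user": "demo-user"}

def retrieve(context: dict) -> list[str]:
    """Fetch candidate sources for the query (stubbed here)."""
    return [f"doc about {context['query']}"]

def generate(context: dict, sources: list[str]) -> str:
    """Produce a draft answer from context plus sources (stubbed)."""
    return f"Answer to '{context['query']}' citing {len(sources)} source(s)"

def apply_constraints(draft: str, banned: set[str]) -> str:
    """Filter or block the draft before it reaches the user."""
    for term in banned:
        if term in draft.lower():
            return "[blocked by policy]"
    return draft

def answer(user_input: str) -> str:
    ctx = build_context(user_input)
    sources = retrieve(ctx)
    draft = generate(ctx, sources)
    return apply_constraints(draft, banned={"secret"})
```

The point of the seams isn't the stub logic; it's that every arrow between stages is now a place where a rule can live.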

Once the system was decomposed, something important happened. I could see where decisions were being made. I could insert logic between steps, enforce rules before execution, and trace how outputs were generated.

Concretely, that middle step started to look like real checks: Is this tool call allowed for this user? Is the data source properly scoped? Is this within budget? Does the output violate any constraints?
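Those checks can be written as plain predicates evaluated before anything executes. A minimal sketch, assuming a simple role-based policy table (the field names and policy shape are mine, for illustration):

```python
# Illustrative pre-execution checks: tool allowed, source in scope,
# cost within budget. The policy structure is an assumption for this
# sketch, not a specific product's schema.

POLICY = {
    "allowed_tools": {"analyst": {"search", "summarize"}},
    "allowed_sources": {"analyst": {"public-docs"}},
    "budget_usd": {"analyst": 1.00},
}

def evaluate(role: str, tool: str, source: str, cost: float) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed action."""
    if tool not in POLICY["allowed_tools"].get(role, set()):
        return False, f"tool '{tool}' not allowed for role '{role}'"
    if source not in POLICY["allowed_sources"].get(role, set()):
        return False, f"source '{source}' out of scope for role '{role}'"
    if cost > POLICY["budget_usd"].get(role, 0.0):
        return False, f"cost {cost:.2f} exceeds budget for role '{role}'"
    return True, "ok"
```

Notice that every denial carries a reason. That's what makes "Why did it do that?" answerable.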

In other words: I wasn’t just observing the system anymore. I was controlling it.

The Missing Layer

That experience made something clear. In every other part of enterprise architecture, we separate control from execution.

Identity is centralized through IAM and enforced consistently across applications. Network policy is enforced by the network, not delegated to individual services. Security controls are defined once in shared systems and enforced at execution.

We don’t rely on every function call to “do the right thing.” We build systems that define what can happen—and enforce it everywhere.

AI hasn’t done that yet.

Call it a control plane for AI decisions.

A place where policies live, where decisions are evaluated, and where execution is allowed—or blocked—based on rules you define.

The shift I ended up making was simple, but it changed everything. Instead of letting the system act immediately, you force it to pause—not for a human, but for a check.

So the flow becomes:

  • The system proposes what it wants to do
  • A control plane evaluates that proposal
  • Then—and only then—it executes
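Wired together, that flow is a thin loop: nothing runs until the control step says yes. A hedged sketch with hypothetical names:

```python
# Propose -> evaluate -> execute, as a minimal loop. The Proposal
# shape and the stubbed rule are illustrative assumptions.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Proposal:
    """What the system wants to do, stated before it happens."""
    action: str
    args: dict = field(default_factory=dict)

def control_plane(p: Proposal) -> bool:
    """Evaluate the proposal against rules you define (stubbed)."""
    allowed_actions = {"search", "summarize"}
    return p.action in allowed_actions

def run(p: Proposal, execute: Callable[[Proposal], str]) -> str:
    # The system proposes; the control plane evaluates;
    # then, and only then, does anything execute.
    if not control_plane(p):
        return f"blocked: {p.action}"
    return execute(p)
```

The executor never sees a proposal the control plane rejected, which is exactly the separation between what the system wants to do and what it's allowed to do.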

That middle step is the missing piece. Not more intelligence. Not a better model. A place to decide: “Should this happen at all?”

This is where people get stuck. They hear “control” and think: we’re going to put humans in every loop.

That doesn’t scale. And it’s not the goal.

What you actually want is policies enforced automatically, decisions evaluated consistently, and boundaries that don’t depend on prompts behaving correctly.

Most orchestration platforms today are really good at getting systems to run. They help you build pipelines, orchestrate workflows, and connect models and tools.

But they don’t cleanly separate what the system wants to do from what the system is allowed to do.

So you end up with something that works… but is hard to reason about.

Because eventually someone asks whether you can guarantee it won’t do X, explain why it did Y, or limit what it spends.

And the honest answer is: “Not cleanly.”

Most teams are still asking: “How do we build smarter systems?”

The better question is: “Where do we enforce control over those systems?”

Once you ask that question, the architecture starts to look very different.

Part 3: Why Most AI Architectures Collapse Under Governance

This is where good demos go to die.
