Vivek Haldar

The Three Abstractions That Make AI Agents Real

Over the last few months, a number of abstractions have shipped from model vendors that help you build systems on top of LLMs—specifically, agents that automate end-to-end workflows. Individually, each abstraction solves a specific problem. But what’s coming into focus now is how they click together to enable something much larger: agents that can execute long, multi-step business processes from start to finish.

I want to lay out how I see these pieces fitting together. But first, a crucial distinction.

For each of these capabilities, there are multiple competing standards and implementations. It’s important to separate the abstract idea (what capability does this provide?) from the concrete standard (which protocol implements it?). The abstract ideas are what matter for architectural thinking. The concrete implementations will evolve and compete.

Here are the three key abstractions:

1. MCP: Standardized Access to Data and APIs

The first foundational abstraction is MCP (Model Context Protocol). Its key function: standardize how models and agents access data and APIs, without building a forest of bespoke integrations.

Before MCP, every agent needed custom code to talk to every data source. MCP gives you a standard interface—a universal adapter between agents and the world’s APIs and databases. It’s been widely adopted and is, by now, a proven success.
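
To make that concrete, here’s a minimal sketch of an MCP server using the official TypeScript SDK (`@modelcontextprotocol/sdk`). The server name, tool name, and sales-database shape are hypothetical; the registration pattern is the standard one:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical MCP server wrapping an internal sales database.
const server = new McpServer({ name: "sales-db", version: "0.1.0" });

// Expose one tool. Any MCP-speaking agent can discover and call it,
// with no bespoke integration code on either side.
server.tool(
  "query_sales",
  "Return daily sales totals for a product over the last N days",
  { productId: z.string(), days: z.number().int().positive() },
  async ({ productId, days }) => {
    // A real server would run a parameterized SQL query here.
    const rows = [{ productId, day: "2025-11-01", units: 42 }]; // placeholder
    return { content: [{ type: "text", text: JSON.stringify(rows) }] };
  }
);

// Serve over stdio; the SDK also provides HTTP transports.
await server.connect(new StdioServerTransport());
```

Once this is running, any MCP-aware agent can discover `query_sales` and call it, without integration code specific to this database.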

2. Skills: Encoding Domain Knowledge for Agents

The second abstraction is skills. At a high level, skills let you encode instructions, SOPs (standard operating procedures), or domain knowledge in a form that agents can access and follow.

This is critical because not every agent task should involve dynamic planning from scratch. Many workflows are already well-understood. You know the right sequence of steps. You don’t want the agent to improvise—you want it to execute a known procedure reliably.

Skills are the abstraction that captures this. (For a deeper dive into how skills, commands, and sub-agents relate in practice, see this breakdown.) And importantly, skills can use MCPs to access data and APIs, but do so at a higher level of abstraction. For example, a skill might say: “Get the last 30 days of sales for Product X.” That’s a high-level, abstract instruction. Underneath, it gets translated into: which database, which table, which columns, what SQL query—and eventually, an MCP call that sits between the agent and your database.

The separation of concerns is clean: skills encode the what, MCPs handle the how.
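
To sketch what that separation looks like on disk, here’s a hypothetical skill in the SKILL.md format Anthropic’s Agent Skills use (YAML frontmatter plus markdown instructions). The procedure, and the query_sales tool it defers to, are assumptions carried over from the MCP sketch above:

```markdown
---
name: sales-report
description: Produce a sales summary for a product over a time window.
---

1. If the product or time window is missing, ask the user for it.
2. Call the query_sales tool on the sales-db MCP server to fetch
   daily totals for that product and window.
3. Compute the week-over-week change and flag outlier days.
4. Return a short written summary plus the underlying table.
```

Note that the skill never names a database, table, or SQL dialect. Swap out the backing MCP server and the how changes while the what stays put.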

3. Generative UI: The Last Mile for Humans

The third and most recent abstraction is putting UIs on top of agents. There are a couple of competing standards here—OpenAI’s Apps SDK and MCP Apps (an extension of MCP supported by multiple vendors)—but the abstract capability is the same: AI agents or MCP servers can now serve up a UI, not just text or tool responses.

UI is a last-mile problem. Agents themselves are far more effective working through APIs, MCPs, and command-line interfaces. UI exists for humans—to consume the final output of an agent’s work. And if UI is just the last-mile output for human consumption, then a necessary implication is that UI must be generative.

Think about it: if an agent executes a bespoke query or a multi-step skill, the result contains a unique mix of personal data, query-dependent information, and even query-independent signals (as PageRank was for Google search results). A static, pre-designed UI can’t anticipate this. The UI needs to be generated dynamically to fit the specific intent of the user’s request.

This is analogous to how search results pages or LLM chat interfaces work—every response is different, and the presentation adapts to the content. Handcrafted, static UIs designed ahead of time simply don’t work when the output is this dynamic.
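
At the MCP layer, both standards share the same rough shape: the server registers an HTML template as a resource under a ui:// URI, and a tool points at that template in its metadata; the host renders the template and feeds it the tool’s structured output. Here’s a sketch of that shape. The caveats: the metadata key differs between the Apps SDK and MCP Apps, so "ui/outputTemplate" below is a placeholder, and accepting `_meta` at tool registration is an assumption about recent SDK versions:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "sales-db", version: "0.1.0" });

// Register the UI template as an ordinary MCP resource under a ui:// URI.
server.registerResource(
  "sales-chart",
  "ui://sales-db/sales-chart.html",
  { mimeType: "text/html" },
  async (uri) => ({
    contents: [{
      uri: uri.href,
      mimeType: "text/html",
      // The host renders this in a sandboxed frame and passes the
      // tool's structured output into it.
      text: "<html><body><div id='chart'></div><script>/* render */</script></body></html>",
    }],
  })
);

// The tool declares which template should render its results.
server.registerTool(
  "query_sales",
  {
    description: "Return daily sales totals for a product",
    inputSchema: { productId: z.string(), days: z.number() },
    // Placeholder key; the real key name varies by standard.
    _meta: { "ui/outputTemplate": "ui://sales-db/sales-chart.html" },
  },
  async ({ productId, days }) => ({
    content: [{ type: "text", text: `sales for ${productId}, last ${days} days` }],
    structuredContent: { productId, days, rows: [] }, // consumed by the template
  })
);
```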

How They Click Together

Here’s the full picture:

  1. Skills encode the workflow—the high-level instructions and domain knowledge.
  2. MCP provides standardized access to the data and APIs that skills need.
  3. Generative UI renders the final output for human consumption.

A concrete example: An agent executes a skill that says “Get last 30 days of sales for Product X.” The skill translates this into MCP calls to query the right database. The results come back, and an MCP App generates a UI—maybe a chart, a summary table, and some key insights—tailored to this specific result.
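
In code, the agent-side wiring for this example might look like the sketch below, reusing the hypothetical sales-db server from earlier (the server command path is made up):

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// 1. The agent, following the skill's instructions, connects to the
//    hypothetical sales-db MCP server.
const client = new Client({ name: "sales-agent", version: "0.1.0" });
await client.connect(
  new StdioClientTransport({ command: "node", args: ["sales-db-server.js"] })
);

// 2. The skill's high-level step ("get last 30 days of sales for
//    Product X") bottoms out in a concrete MCP tool call.
const result = await client.callTool({
  name: "query_sales",
  arguments: { productId: "product-x", days: 30 },
});

// 3. The host hands the structured result to the tool's declared UI
//    template, which renders the chart, table, and insights for the human.
console.log(result.content);
```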

Until recently, each of these abstractions was a missing piece you had to cobble together from scratch. Now they exist as standardized capabilities. Together they enable agents that can execute real, end-to-end business processes.