AI Delegation > AI Agents

Why autonomous AI agents will flop and structured delegation is the future.

TL;DR

  • Fully autonomous AI agents face too many risks for enterprise adoption
  • Delegation with human checkpoints offers the sweet spot between automation and control
  • The key is designing smart handoff points in workflows
  • Engineers must still design the process—AI handles execution
  • Combining probabilistic (AI) and deterministic (rules-based) approaches creates robust systems

There is a lot of shit that can go wrong with agentic AI. Data breaches, data poisoning, cascading failures, unintended consequences, misalignment, and losing track of what's happening. With tool calling hooking directly into databases and company systems, AI agents are the opposite of pure functions—they have side effects everywhere.

I do not expect agentic AI (completely autonomous systems) to be successful, especially in the enterprise. There are too many risks. But I do expect delegation to become more prevalent as test-time compute grows. We delegate longer-running tasks to multi-step, pre-defined workflows, with LLMs making decisions in the control flow or acting as workers in the nodes. These workflows are specialized for certain actions, and we wait until they report back.
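A rough sketch of what such a workflow can look like (`call_llm` here is a placeholder for whatever model provider you use, not a real API):

```python
# A rough sketch of a pre-defined workflow with LLM nodes. `call_llm` is a
# placeholder for your model provider's API, not a real library call.

def call_llm(prompt: str) -> str:
    """Placeholder: swap in your provider's completion call."""
    raise NotImplementedError

def summarize_step(document: str) -> str:
    # Worker node: the LLM does bounded work inside a fixed step.
    return call_llm(f"Summarize the key risks in:\n{document}")

def route_step(summary: str) -> str:
    # Control-flow node: the LLM picks a branch, but only from a fixed menu.
    answer = call_llm(f"Answer 'legal' or 'finance'. Who should review?\n{summary}")
    return answer if answer in ("legal", "finance") else "finance"  # safe default

def run_workflow(document: str) -> dict:
    """Multi-step, pre-defined workflow: the structure is code, the judgment is the LLM."""
    summary = summarize_step(document)
    owner = route_step(summary)
    # The workflow reports back instead of acting on its own.
    return {"summary": summary, "route_to": owner, "status": "awaiting sign-off"}
```

The structure matters: the LLM never invents steps, it only fills in the judgment calls inside steps an engineer already defined.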

AI delegation has given rise to a new form of automation. Unlike traditional automation, which relies on predefined rules and workflows, agentic automation leverages AI to handle more complex and unstructured processes. These agents can adapt to changing situations and make decisions in real time, leading to more efficient and flexible solutions.

However, the future likely lies in a combination of both probabilistic (agentic) and deterministic (rules-based) technologies. This blended approach allows organizations to leverage the strengths of each, creating robust and adaptable automation solutions.
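As a toy example of the blend, deterministic checks can decide whether a probabilistic output is trusted at all; the fields and thresholds below are invented for illustration:

```python
# A toy blend of the two: deterministic rules gate a probabilistic
# extraction. Field names and thresholds are illustrative only.

ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}

def validate(extraction: dict) -> list[str]:
    """Deterministic checks over a probabilistic extraction."""
    problems = []
    if extraction.get("currency") not in ALLOWED_CURRENCIES:
        problems.append("unknown currency")
    if not (0 < extraction.get("amount", -1) < 1_000_000):
        problems.append("amount out of plausible range")
    return problems

def process(extraction: dict) -> str:
    problems = validate(extraction)
    # Rules route the result: clean output flows through, anything odd
    # goes to a human.
    return "auto-approve" if not problems else f"human review: {', '.join(problems)}"

print(process({"currency": "USD", "amount": 420.0}))  # auto-approve
print(process({"currency": "???", "amount": 420.0}))  # human review: unknown currency
```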

The framework is straightforward.

  • Task Setup: Humans frame the goal. They set limits on what the AI can do. Like setting guardrails on a road.
  • AI Processing: The AI crunches data, spots patterns, and flags key findings. But it doesn't act on them.
  • Human Review: A person checks the AI's work. They can accept it as is, ask for more detail, change direction, or stop the process.
  • Feedback Loop: Each round of work helps tune the system. The AI learns what outputs are useful. Humans learn what tasks to delegate.
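A minimal sketch of that loop in code, with `ai_process` standing in for the model call and a console prompt standing in for a real review UI:

```python
# A minimal sketch of the four-step loop. `ai_process` is a placeholder
# for the model call, and input() stands in for a real review UI.

def ai_process(task: str, limits: dict) -> dict:
    """AI Processing: analyze, spot patterns, flag findings. Never act."""
    raise NotImplementedError("model call goes here")

def delegate(task: str, limits: dict, max_rounds: int = 3):
    """Run the delegation loop; the human keeps the final say."""
    notes = ""
    for _ in range(max_rounds):
        findings = ai_process(task + notes, limits)            # AI Processing
        verdict = input(f"{findings}\naccept / stop / or type feedback: ")
        if verdict == "accept":                                # Human Review
            return findings
        if verdict == "stop":
            return None
        notes = f"\nReviewer notes: {verdict}"                 # Feedback Loop
    return None

# Task Setup: humans frame the goal and set limits before anything runs.
# result = delegate("Summarize Q3 churn drivers", limits={"tools": ["read_only"]})
```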

Think of it like teaching a new employee. First you show them basic tasks. Then you check their work. Over time, you trust them with more. But you keep the final say.

Here's how AI delegation works in practice.

Data Entry: AI scans invoices and pulls key fields. Human quickly checks extracted data. AI flags odd entries. Human fixes flags and approves batches. AI learns from corrections.

Financial Research: Human picks topics. AI pulls data from reports, news, and filings. AI spots trends and flags key points. Human reviews findings and asks new questions. AI digs deeper on specific areas. Human builds final insight from AI's work.
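The data entry case might look something like this sketch, where `extract_fields` is a placeholder for an OCR-plus-LLM pipeline and the sanity checks are made up for illustration:

```python
# An illustrative data-entry loop: extraction is probabilistic, approval
# is human. `extract_fields` stands in for a real OCR + LLM pipeline.

def extract_fields(invoice_pdf: str) -> dict:
    """Placeholder for the real extraction pipeline."""
    raise NotImplementedError

def flag_odd_entries(fields: dict) -> list[str]:
    # Simple deterministic sanity checks; real ones would be richer.
    flags = []
    if fields.get("total", 0) <= 0:
        flags.append("total")
    if not fields.get("vendor"):
        flags.append("vendor")
    return flags

corrections_log = []  # fuel for the "AI learns from corrections" step

def process_invoice(invoice_pdf: str) -> dict:
    fields = extract_fields(invoice_pdf)
    flags = flag_odd_entries(fields)
    if flags:
        # Flagged fields go to a human; their fixes are logged so prompts
        # or models can be tuned later.
        fixed = {**fields, "needs_review": flags}
        corrections_log.append(fixed)
        return fixed
    return fields  # clean entry: human spot-checks and approves the batch
```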

Coding Tasks: Human outlines feature needs. AI suggests code structure. Human picks approach. AI writes basic code. Human reviews and tweaks architecture. AI handles routine parts. Human focuses on core logic. AI flags potential bugs. Human makes final call on changes.

The pattern? Humans guide while AI handles the grunt work. Each knows its role. The work flows back and forth, getting better each round.

This still means we need people to design these systems. Engineers figure out the steps, the workflow, and the points where humans should intervene or sign off. The design work includes mapping the process, finding natural break points in workflows, spotting where things often go wrong, marking where expert judgment matters most, and testing different stopping points.
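One way to make that design work explicit is a checkpoint map the engineer writes and revises while testing different stopping points; a small sketch, with step names and policies that are purely illustrative:

```python
# An illustrative checkpoint map: the engineer marks where the workflow
# pauses. Step names and policies are invented for the sketch.

CHECKPOINTS = {
    "extract_fields":  "none",        # routine step, runs straight through
    "classify_vendor": "spot_check",  # known trouble spot: sample for review
    "post_to_ledger":  "hard_stop",   # expert judgment: always needs sign-off
}

def policy_for(step: str) -> str:
    # Unmapped steps default to a hard stop: fail safe, not fast.
    return CHECKPOINTS.get(step, "hard_stop")
```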

Setting checkpoints means picking key moments for human review, building in safety stops, adding ways to roll back changes, and creating override options before expensive operations, like writing to a database, or before operations worth reviewing, like searching a vector database.
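A sketch of one such checkpoint, with illustrative tool names and a console prompt standing in for a real approval queue; everything that isn't known to be read-only waits for sign-off:

```python
# A sketch of an override checkpoint: read-only tools run freely,
# everything else waits for a human. Tool names and the approval hook
# are illustrative, not a real library's API.

SAFE_TOOLS = {"search_vector_db"}  # read-only: auto-run, review later

def request_approval(tool: str, args: dict) -> bool:
    """Placeholder: in practice this posts to a review queue or UI."""
    return input(f"Allow {tool}({args})? [y/N] ").strip().lower() == "y"

def call_tool(tool: str, args: dict, registry: dict) -> dict:
    if tool in SAFE_TOOLS:                 # cheap reads pass straight through
        return registry[tool](**args)
    if not request_approval(tool, args):   # side effects need sign-off first
        return {"status": "blocked"}       # safety stop: nothing ran
    return registry[tool](**args)

# Dummy registry so the sketch runs end to end:
registry = {
    "search_vector_db": lambda query: {"status": "ok", "hits": []},
    "write_db": lambda row: {"status": "ok", "written": row},
}
```

Defaulting to "ask a human" for unknown tools is the cautious choice here; it trades speed for the guarantee that no unreviewed write slips through.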

Handling edge cases requires planning for weird data, adding escape hatches for tough calls, building in ways to flag problems, and making manual override easy.

Making it learn involves tracking what humans change, noting which flags help, watching for patterns in mistakes, and building in feedback loops.
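A small sketch of the tracking half, with made-up field names; the point is just to log AI flags next to human edits so you can later measure which flags earn their keep:

```python
# A sketch of the feedback loop's logging side. Field names are invented;
# offline analysis mines the log for patterns in mistakes.

import json
import time

def log_review(record_id: str, ai_flags: list[str], human_edits: dict,
               path: str = "reviews.jsonl") -> None:
    """Append one review outcome for later analysis."""
    entry = {
        "ts": time.time(),
        "record": record_id,
        "ai_flags": ai_flags,        # fields the AI flagged
        "human_edits": human_edits,  # fields the human actually changed
        # A flag "helped" if the human edited a field the AI flagged.
        "flag_helped": any(field in ai_flags for field in human_edits),
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

# log_review("inv-001", ai_flags=["total"], human_edits={"total": 412.50})
```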

The hard part? Knowing where to draw the lines. Too many checks and the system is slow. Too few and things break. The key is the handoff points. AI does the heavy lifting, but humans control when and how it happens. This hybrid approach is the future—not fully autonomous agents that we can't trust, and not purely manual work that doesn't scale.