The Architecture of Agentic Systems
In the evolution of Large Language Models (LLMs), a critical distinction has emerged between simple "chatbot" interactions and Agentic Workflows. As defined by researchers at Anthropic and DeepLearning.AI, an Agent is not merely a predictor of the next token; it is a system where the LLM's output determines the control flow of the application.
This essay explores the structural theory of these systems. We will examine the five core design patterns that turn a system from a static query engine into a dynamic problem solver: Tool Use, Prompt Chaining, Routing, Orchestration, and the Evaluator-Optimizer loop.
1. The Agentic Definition
Non-agentic systems are single-pass: Input $\rightarrow$ Output. An agentic system introduces a loop: the model can decide to pause generation, call an external tool (a calculator, a web search, an API), observe the result, and then continue.
Figure 1: In a complex query, the LLM stops reasoning to delegate a task to a deterministic tool, creating a multi-step workflow.
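The loop described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `fake_model`, the `TOOLS` registry, and the `CALL` / `ANSWER` / `OBSERVATION:` text protocol are all hypothetical stand-ins for a real LLM call and a real tool-calling format.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry; the tool name and behavior are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    # Adds integers written as "a+b+c".
    "calculator": lambda expr: str(sum(int(part) for part in expr.split("+"))),
}


def fake_model(transcript: List[str]) -> str:
    """Stand-in for an LLM call: requests a tool once, then answers."""
    if not any(line.startswith("OBSERVATION:") for line in transcript):
        return "CALL calculator 2+3"
    observation = transcript[-1].split(":", 1)[1].strip()
    return f"ANSWER {observation}"


def run_agent(question: str, max_steps: int = 5) -> str:
    """Let the model's output decide the control flow: call a tool or stop."""
    transcript = [f"USER: {question}"]
    for _ in range(max_steps):
        output = fake_model(transcript)
        if output.startswith("CALL "):
            # The model paused generation to delegate to a deterministic tool.
            _, tool_name, argument = output.split(" ", 2)
            result = TOOLS[tool_name](argument)
            transcript.append(f"OBSERVATION: {result}")
        else:
            # The model produced a final answer; the loop ends.
            return output.split(" ", 1)[1]
    raise RuntimeError("step budget exhausted without a final answer")
```

The `max_steps` budget is a design choice assumed here: because the model controls the loop, a hard cap keeps a confused model from iterating forever.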
2. Workflow Pattern: Prompt Chaining
The most robust pattern is Prompt Chaining, which decomposes a complex task into a fixed sequence of subtasks. Its topology is deterministic: the output of step $n$ becomes the input of step $n+1$.
Crucially, chains often employ a Gate. A Gate is a classification step that acts as a circuit breaker. It decides if the flow should continue or halt (e.g., checking if a generated email is polite before sending).
Figure 2: The Gate acts as a boolean classifier. If $P(\text{Safe}) < \text{Threshold}$, the chain terminates early to prevent harm.
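As a rough sketch of a gated chain, assume three fixed steps: draft, gate, finalize. The function names, the keyword-based `safety_score`, and the default threshold of 0.5 are assumptions made for illustration; in practice the draft and the gate would each be an LLM or classifier call.

```python
from typing import Optional


def draft_email(request: str) -> str:
    """Hypothetical step 1: stand-in for an LLM drafting call."""
    return f"Dear customer, regarding '{request}': we are looking into it."


def safety_score(text: str) -> float:
    """Hypothetical gate: a toy keyword check returning an estimate of P(Safe)."""
    rude_words = {"stupid", "idiot"}
    return 0.1 if any(word in text.lower() for word in rude_words) else 0.9


def add_signature(text: str) -> str:
    """Hypothetical step 2: runs only if the gate lets the draft through."""
    return text + "\n\nBest regards,\nSupport"


def run_chain(request: str, threshold: float = 0.5) -> Optional[str]:
    """Fixed topology: each step's output is the next step's input."""
    draft = draft_email(request)
    if safety_score(draft) < threshold:
        # Gate trips: terminate the chain early instead of sending.
        return None
    return add_signature(draft)
```

Returning `None` on a tripped gate makes the early exit explicit to the caller, which can then log the draft or hand it to a human reviewer.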
3. Routing
Routing introduces non-linear topology. A "Router" LLM classifies the user's intent and directs the flow to a specialized agent. This is essential for systems handling diverse tasks (e.g., a customer service bot handling both refunds and technical support).
Mathematically, the Router acts as a function $f: x \mapsto y$ with $y \in \{A, B, \dots\}$, where $x$ is the prompt and each label identifies a specialized downstream agent.
Figure 3: The Router optimizes cost and accuracy by sending queries only to the specific model or prompt context required.
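A router can be sketched as a classifier plus a dispatch table. Everything named below (`classify_intent`, the route labels, the handlers) is hypothetical; the keyword matching merely stands in for an LLM classification call.

```python
from typing import Callable, Dict


def classify_intent(prompt: str) -> str:
    """Hypothetical router f(x): keyword matching in place of an LLM classifier."""
    text = prompt.lower()
    if "refund" in text or "charge" in text:
        return "refunds"
    if "error" in text or "crash" in text:
        return "tech_support"
    return "general"


# Hypothetical specialized agents, one per route label.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "refunds": lambda prompt: "[refunds agent] Starting a refund review.",
    "tech_support": lambda prompt: "[support agent] Collecting diagnostics.",
    "general": lambda prompt: "[general agent] How can I help?",
}


def route(prompt: str) -> str:
    """Send the prompt only to the handler selected by the router."""
    return HANDLERS[classify_intent(prompt)](prompt)
```

A fallback label such as `general` is one way to keep the function total, so that every prompt maps to some handler.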
4. Parallelization vs. Orchestration
When a task is too large for a single context window, we split it.
- Parallelization (Map-Reduce): A Coordinator splits a task (e.g., a long document) into fixed chunks and runs them concurrently. It is fast, but each chunk is processed without knowledge of the others.
- Orchestrator-Worker: A dynamic LLM ("Orchestrator") analyzes the task, decides how to split it based on its content, and assigns the pieces to sub-agents. It is slower but context-aware.
Figure 4: Parallelization splits data blindly. The Orchestrator "thinks" before splitting, creating semantic sub-tasks.
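The contrast between the two splitting strategies can be illustrated as follows. This is a sketch under strong simplifying assumptions: `summarize` stands in for an LLM worker, and `split_semantic` approximates an orchestrator's decision by splitting on paragraph breaks, whereas a real orchestrator would be an LLM planning the sub-tasks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def summarize(text: str) -> str:
    """Hypothetical worker: stand-in for an LLM call; keeps the first five words."""
    return " ".join(text.split()[:5])


def split_fixed(document: str, chunk_size: int) -> List[str]:
    """Parallelization: fixed-size chunks that ignore content boundaries."""
    return [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]


def split_semantic(document: str) -> List[str]:
    """Orchestrator stand-in: split on blank lines so each chunk is a paragraph."""
    return [part.strip() for part in document.split("\n\n") if part.strip()]


def run_workers(chunks: List[str]) -> List[str]:
    """Map step: process chunks concurrently; results keep the input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(summarize, chunks))


document = "Refund policy applies to all orders.\n\nShipping takes five days."
blind_chunks = split_fixed(document, 20)     # may cut through words and sentences
semantic_chunks = split_semantic(document)   # one chunk per paragraph
summaries = run_workers(semantic_chunks)
```

With the fixed split, a worker may receive a fragment that starts mid-sentence, which is the loss of context the list above refers to.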
5. The Evaluator-Optimizer Loop
Perhaps the most powerful pattern is the Evaluator-Optimizer. This mimics the human refinement process. One LLM generates a solution, and another (the Evaluator) critiques it. If the critique is negative, the feedback is looped back to the generator.
This trades latency for accuracy: raising the quality threshold forces more iterations before a result is accepted.
Figure 5: The system loops until $\text{Quality}_{\text{current}} \ge Q_{\min}$. High thresholds significantly increase execution time.
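The loop can be sketched with toy stand-ins for both models. The scoring rule here (quality rises by 0.25 per sentence) is an invented assumption chosen only to make the threshold effect visible; `generate`, `evaluate`, and `refine` are hypothetical names, and both roles would be LLM calls in practice.

```python
from typing import Tuple


def generate(task: str, draft: str, feedback: str) -> str:
    """Hypothetical generator: writes a first draft, then appends per feedback."""
    if not draft:
        return f"Draft for {task}."
    return f"{draft} {feedback}"


def evaluate(draft: str) -> Tuple[float, str]:
    """Hypothetical evaluator: toy score of 0.25 per sentence, capped at 1.0."""
    quality = min(1.0, 0.25 * draft.count("."))
    return quality, "Add more detail."


def refine(task: str, q_min: float, max_iterations: int = 10) -> Tuple[str, int]:
    """Loop until the evaluator's score reaches q_min or the budget runs out."""
    draft, feedback = "", ""
    for iteration in range(1, max_iterations + 1):
        draft = generate(task, draft, feedback)
        quality, feedback = evaluate(draft)
        if quality >= q_min:
            return draft, iteration
    return draft, max_iterations


_, iterations_low = refine("summary", q_min=0.5)
_, iterations_high = refine("summary", q_min=1.0)
```

Under this toy scoring, the lower threshold is met after two iterations and the higher one after four, which mirrors the latency cost of a stricter evaluator. The `max_iterations` cap is an added safeguard for the case where the threshold is never reached.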