The Architecture of Agentic Systems

In the evolution of Large Language Models (LLMs), a critical distinction has emerged between simple "chatbot" interactions and Agentic Workflows. As defined by researchers at Anthropic and DeepLearning.AI, an Agent is not merely a predictor of the next token; it is a system where the LLM's output determines the control flow of the application.

This essay explores the structural theory of these systems interactively. We will examine the five core design patterns that transition a system from a static query engine to a dynamic problem solver: Tool Use, Prompt Chaining, Routing, Orchestration, and the Evaluator-Optimizer loop.

1. The Agentic Definition

Non-agentic systems are zero-shot: Input $\rightarrow$ Output. An Agentic System introduces a loop. The model can decide to pause generation, call an external tool (a calculator, a web search, an API), observe the result, and then continue.

Query Type: Simple (Knowledge) Complex (Calculation)

User Input Idle

→

LLM Core Waiting

→

Tool / Calc Inactive

// System Ready... Select a query type.

Figure 1: In a complex query, the LLM stops reasoning to delegate a task to a deterministic tool, creating a multi-step workflow.

2. Workflow Pattern: Prompt Chaining

The most robust pattern is Prompt Chaining. This decomposes a complex task into fixed subtasks. It is deterministic topology: the output of step $n$ becomes the input of step $n+1$.

Crucially, chains often employ a Gate. A Gate is a classification step that acts as a circuit breaker. It decides if the flow should continue or halt (e.g., checking if a generated email is polite before sending).

Input Topic:

LLM 1 Ideate

→

Safety Gate Check

→

LLM 2 Draft

// Waiting for topic...

Figure 2: The Gate acts as a boolean classifier. If $P(Safe) < Threshold$, the chain terminates early to prevent harm.

3. Routing

Routing introduces non-linear topology. A "Router" LLM classifies the user's intent and directs the flow to a specialized agent. This is essential for systems handling diverse tasks (e.g., a customer service bot handling both refunds and technical support).

Mathematically, the Router acts as a function $f(x) \rightarrow \{A, B, \dots\}$ where $x$ is the prompt.

Router Classify Intent

Agent A Support

Agent B Analytics

// Enter a request above (e.g., "I want a refund")...

Figure 3: The Router optimizes cost and accuracy by sending queries only to the specific model or prompt context required.

4. Parallelization vs. Orchestration

When a task is too large for a single context window, we split it.

Parallelization (Map-Reduce): A Coordinator splits a task (e.g., a long document) into fixed chunks and runs them concurrently. It is fast but lacks context.
Orchestrator-Worker: A dynamic LLM ("Orchestrator") analyzes the task, decides how to split it based on logic, and assigns sub-agents. It is slower but context-aware.

Mode: Parallel (Fixed) Orchestrator (Dynamic)

Select a mode to visualize topology...

// Select a mode...

Figure 4: Parallelization splits data blindly. The Orchestrator "thinks" before splitting, creating semantic sub-tasks.

5. The Evaluator-Optimizer Loop

Perhaps the most powerful pattern is the Evaluator-Optimizer. This mimics the human refinement process. One LLM generates a solution, and another (the Evaluator) critiques it. If the critique is negative, the feedback is looped back to the generator.

This trades latency for accuracy. In the simulation below, increasing the threshold forces more iterations.

Quality Threshold ($Q_{min}$): 85%

Generator Drafting

→

← Feedback

Evaluator Review

→

Output Pending

// Set threshold and generate...

Figure 5: The system loops until $Quality_{current} \ge Q_{min}$. High thresholds significantly increase execution time.