The shift from single AI agents to agentic systems represents a fundamental change in how we architect intelligent applications. Single agents work well for narrow tasks — classify this document, summarize this text, generate this code. Complex workflows demand coordination, state management, and structured decision-making that single agents cannot provide. Agentic systems solve this by decomposing intelligence into discrete, coordinated components. The question becomes: how do you structure these components?
This article presents a systematic framework drawn from production experience building a whitelabel agentic platform with 8+ microservices, an AI gateway routing across 20+ LLM providers, and an ecosystem registry managing 69+ integrations. The patterns described here are not academic abstractions. They emerged from shipping multi-agent systems that run at scale and fail gracefully when individual components misbehave.
The Five Functional Subsystems
Research from early 2026 establishes a system-theoretic framework that deconstructs agentic AI into five core functional subsystems. This decomposition maps directly to implementation decisions that every team building serious autonomous systems will face.
The first subsystem is Reasoning and World Model. This is where your system maintains context, generates plans, and decides what to do next. In practice, this subsystem manages the agent's internal representation of the task state and environment. A common mistake is conflating reasoning with LLM inference. Reasoning is the structural process of selecting among alternatives. The LLM may provide the raw material for that selection, but the reasoning framework imposes constraints, validates options, and enforces consistency across decisions.
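To make the distinction concrete, here is a minimal sketch of reasoning as constrained selection rather than raw inference. The `Option` type, `preconditions_met` flag, and fallback rule are illustrative assumptions, not part of any specific framework: the LLM's suggestion is treated as one input, and the reasoning layer enforces validity.

```python
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    preconditions_met: bool

def constrained_select(llm_choice: str, options: list) -> str:
    """The LLM suggestion is raw material; the reasoning layer validates it
    against the set of options whose preconditions actually hold."""
    valid = {o.name for o in options if o.preconditions_met}
    if llm_choice in valid:
        return llm_choice
    if not valid:
        raise ValueError("no option satisfies its preconditions")
    # Fall back deterministically rather than trusting an invalid suggestion.
    return sorted(valid)[0]
```

The point is not the selection rule itself but the separation: the model proposes, the framework disposes.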
The second subsystem is Perception and Grounding. Your system's ability to interpret signals from the environment and map them to meaningful internal representations. In the context of a platform like Landi, perception includes parsing user inputs, interpreting API responses, recognizing document structures, and extracting intent from ambiguous natural language queries. Grounding ensures these interpretations connect to the real world rather than hallucinated constructs. When an agent receives a customer support ticket, perception extracts the relevant entities and intent. Grounding verifies those extractions against known schemas, databases, and previous interactions.
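The extraction-then-verification step can be sketched as follows. The field names, the schema shape, and the `customer_id` check are hypothetical; the idea is that grounding rejects anything the perception layer produced that does not match a known schema or resolve to a known record.

```python
def ground_extraction(extracted: dict, schema: dict, known_ids: set) -> dict:
    """Keep only fields the schema defines with matching types, and flag
    entity references that don't resolve to known records."""
    grounded, issues = {}, []
    for field, value in extracted.items():
        expected = schema.get(field)
        if expected is None or not isinstance(value, expected):
            issues.append(field)          # unknown field or wrong type
            continue
        grounded[field] = value
    # Grounding step: an extracted ID must resolve to a real record.
    if grounded.get("customer_id") not in known_ids:
        issues.append("customer_id:unresolved")
        grounded.pop("customer_id", None)
    return {"fields": grounded, "issues": issues}
```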
The third subsystem is Action Execution. The doing layer — calling APIs, modifying state, sending messages, writing files, triggering deployments. Action execution seems straightforward until you deal with the reality of distributed systems: network failures, partial writes, idempotency requirements, and rollback semantics. The action execution subsystem must treat every external operation as potentially failing and design for recovery at every step.
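A minimal sketch of that design stance, assuming a generic `transport` callable standing in for any external API: retries are bounded, and an idempotency key ensures a repeated call cannot apply the same effect twice.

```python
class ActionExecutor:
    """Every external call is assumed capable of failing; successful results
    are cached by idempotency key so retries never double-apply an effect."""
    def __init__(self, transport, max_retries: int = 3):
        self.transport = transport      # callable(key, payload) -> result
        self.max_retries = max_retries
        self._applied = {}              # idempotency_key -> cached result

    def execute(self, key: str, payload: dict):
        if key in self._applied:        # already succeeded: no double-write
            return self._applied[key]
        last_err = None
        for _ in range(self.max_retries):
            try:
                result = self.transport(key, payload)
                self._applied[key] = result
                return result
            except ConnectionError as err:
                last_err = err          # transient failure: retry
        raise RuntimeError(f"action {key!r} failed after retries") from last_err
```

Real systems would add backoff, rollback hooks, and durable storage for the applied-key set, but the invariant is the same: an action either took effect exactly once or observably failed.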
The fourth subsystem is Learning and Adaptation. Mechanisms for improvement over time, whether through explicit feedback or implicit signals. In production, this ranges from simple prompt refinement based on user corrections to sophisticated reinforcement learning from human feedback (RLHF) pipelines. The critical design decision is the feedback loop latency: how quickly can the system incorporate new information? Batch learning (hourly or daily retraining) works for slowly-evolving domains. Real-time adaptation is necessary for environments where the distribution shifts within a single session.
The fifth subsystem is Inter-Agent Communication. The protocols and patterns that enable collaboration across multiple agents. This subsystem defines how agents share state, delegate tasks, resolve conflicts, and coordinate toward shared objectives. In a multi-agent platform, communication design determines whether the system behaves as a coherent team or a collection of isolated components that happen to share infrastructure.
The 12 Design Patterns
These five subsystems give rise to twelve reusable design patterns, organized into four categories. Understanding these patterns provides the vocabulary to reason about agentic system architecture and make deliberate design choices rather than ad hoc decisions.
Foundational Patterns
Pattern 1: Tool Use. Agents extend their capabilities through external tools rather than embedding all functionality internally. The agent acts as an orchestrator, selecting and invoking appropriate tools based on the task. This pattern decouples agent reasoning from execution, enabling extensibility without retraining. In practice, the tool use pattern manifests as a registry of available tools, each with a schema describing its inputs, outputs, and preconditions. The agent's role is to match task requirements to tool capabilities and construct valid invocations.
```python
class ToolUsingAgent:
    def __init__(self, tools: list[Tool]):
        self.tools = {t.name: t for t in tools}
        self.context = {}

    def execute(self, task: str) -> Result:
        tool_name = self.select_tool(task)
        tool = self.tools[tool_name]
        return tool.run(task, context=self.context)
```

The tool use pattern is deceptively simple in concept but requires careful implementation. Tool selection must handle ambiguity (multiple tools could satisfy the requirement), composition (a task may require sequential tool invocations), and failure (tools may be unavailable, return errors, or produce incorrect results). A robust implementation maintains a tool health registry, tracks historical success rates per tool, and implements fallback chains when primary tools fail.
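A fallback chain of the sort just described can be sketched like this. The health registry and success-rate map are illustrative stand-ins for whatever telemetry store a real system maintains.

```python
class FallbackChain:
    """Try healthy tools in descending order of historical success rate;
    fall through to the next tool when one fails."""
    def __init__(self, tools: dict, health: dict, success_rate: dict):
        self.tools = tools                # name -> callable(task) -> result
        self.health = health              # name -> bool (is the tool up?)
        self.success_rate = success_rate  # name -> float in [0, 1]

    def run(self, task: str):
        candidates = sorted(
            (n for n in self.tools if self.health.get(n, False)),
            key=lambda n: self.success_rate.get(n, 0.0),
            reverse=True,
        )
        for name in candidates:
            try:
                return name, self.tools[name](task)
            except Exception:             # tool failed: try the next one
                continue
        raise RuntimeError("all tools in the fallback chain failed")
```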
Pattern 2: Memory. Agents maintain state across interactions through structured memory systems. The three-tier memory model distinguishes between episodic memory (what happened — a sequence of past interactions), semantic memory (what was learned — extracted facts, relationships, and domain knowledge), and procedural memory (how to do things — strategies and patterns that have proven effective). Conflating all three tiers into a single context window works for demos and fails at scale.
```python
class AgentMemory:
    def __init__(self):
        self.episodic = []    # Sequence of interactions
        self.semantic = {}    # Extracted facts and relationships
        self.procedural = {}  # Learned patterns and strategies

    def store_interaction(self, interaction: Interaction):
        self.episodic.append(interaction)
        self.extract_facts(interaction)

    def retrieve_relevant(self, query: str) -> list:
        return self.similarity_search(query)
```

Working memory maps to the current context window. Episodic memory maps to a vector store of past task executions, enabling retrieval-augmented decision-making. Semantic memory maps to a structured database of world-model facts the agent can reference. The anti-pattern is forcing everything into the context window and hoping the model can sort through it. Production agents that interact across thousands of sessions need explicit memory management, including eviction policies, relevance scoring, and memory consolidation routines that compress older memories into higher-level summaries.
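One way to sketch the consolidation routine mentioned above, with `summarize` as a stand-in for an LLM summarization call: when the episodic store exceeds capacity, the oldest entries are compressed into a summary rather than silently dropped.

```python
class ManagedEpisodicMemory:
    """Bounded episodic store with a simple consolidation policy: overflow
    triggers summarization of the oldest half of the entries."""
    def __init__(self, capacity: int, summarize):
        self.capacity = capacity
        self.summarize = summarize    # callable(list) -> str, e.g. an LLM call
        self.entries = []
        self.summaries = []

    def store(self, entry):
        self.entries.append(entry)
        if len(self.entries) > self.capacity:
            cut = self.capacity // 2
            old, self.entries = self.entries[:cut], self.entries[cut:]
            self.summaries.append(self.summarize(old))  # compress, don't drop
```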
Pattern 3: Planning. Complex tasks require decomposition into actionable steps before execution begins. Planning patterns enable agents to reason about sequences, dependencies, and resource constraints before committing to an execution path. The distinction between planning and execution is critical: planning is cheap and reversible; execution is expensive and often irreversible. A planning agent decomposes goals into subgoals, orders them by dependency, identifies potential failure points, and produces an executable plan with explicit success criteria.
```python
class PlanningAgent:
    def plan(self, goal: str) -> Plan:
        subgoals = self.decompose(goal)
        dependencies = self.infer_dependencies(subgoals)
        ordered = self.topological_sort(subgoals, dependencies)
        return Plan(steps=ordered, success_criteria=self.criteria(goal))
```

In production environments, plans must be adaptive. The initial plan is a hypothesis. As execution proceeds and the agent gathers new information, the plan should be revisable. Static plans that cannot incorporate runtime feedback lead to agents that blindly execute steps that no longer make sense. The best planning implementations maintain a plan representation that the agent can query and modify mid-execution, adjusting priorities and inserting new steps as conditions change.
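A revisable plan representation might look like the following minimal sketch. The method names are illustrative; the essential property is that the plan is a mutable object the agent can query and amend mid-execution.

```python
class AdaptivePlan:
    """The plan is a hypothesis: steps can be completed, inserted, or
    reprioritized as runtime observations arrive."""
    def __init__(self, steps: list):
        self.pending = list(steps)
        self.done = []

    def next_step(self):
        return self.pending[0] if self.pending else None

    def complete(self, step):
        self.pending.remove(step)
        self.done.append(step)

    def insert_after_current(self, step):
        # A new requirement surfaced mid-execution: slot it in next.
        self.pending.insert(1, step)

    def reprioritize(self, key):
        self.pending.sort(key=key)
```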
Cognitive and Decisional Patterns
Pattern 4: Reflection. Agents evaluate their own outputs before finalizing them. This self-critique loop catches errors, improves quality, and ensures outputs meet specified criteria. Reflection is the architectural foundation for agentic reliability — the ability of a system to maintain quality outputs even when individual generation steps produce subtly incorrect results.
```python
def reflect_and_refine(self, output: str, task: str) -> str:
    critique = self.evaluate(output, criteria=task)
    if critique.quality_score < self.threshold:
        refined = self.refine(output, feedback=critique.feedback)
        return refined
    return output
```

The reflection pattern can be implemented as self-critique (the same agent evaluates its own output) or as a dedicated verification agent. Both approaches have trade-offs. Self-critique is simpler to implement but susceptible to the same biases that produced the original output. Dedicated verification agents provide independent assessment but add latency and cost. In production, I use a hybrid approach: self-critique for routine tasks and dedicated verification agents for high-stakes outputs where errors carry significant consequences.
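The hybrid routing decision can be made explicit. This sketch assumes `self_critique` and `verifier` are callables standing in for agent invocations, and that each task carries a stakes label or an estimated cost of an undetected error.

```python
def review(output: str, stakes: str, error_cost: float,
           self_critique, verifier, cost_threshold: float = 10.0) -> dict:
    """Route review by expected consequence: independent verification only
    when the stakes justify the added latency and inference cost."""
    if stakes == "high" or error_cost >= cost_threshold:
        return {"reviewer": "verifier", "verdict": verifier(output)}
    return {"reviewer": "self", "verdict": self_critique(output)}
```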
Pattern 5: Debate. Multiple agents argue different positions, with a final arbiter synthesizing the outcome. This pattern surfaces considerations that a single agent might miss and is particularly valuable for decisions with significant consequences or high ambiguity.
```python
class DebateOrchestrator:
    def run(self, proposition: str) -> Decision:
        pro_argument = self.pro_agent.argue(proposition)
        con_argument = self.con_agent.argue(proposition)
        return self.arbiter.decide(
            proposition=proposition,
            pro=pro_argument,
            con=con_argument
        )
```

The debate pattern has an important operational consideration: it multiplies inference costs by the number of debating agents plus the arbiter. It is not appropriate for every decision. Reserve it for decisions where the expected value of better reasoning exceeds the additional compute cost. In practice, this means routing high-stakes decisions through debate while letting routine decisions flow through simpler patterns.
Execution and Interaction Patterns
Pattern 6: ReAct. The reasoning-acting loop interleaves planning with execution, allowing agents to adjust based on real-world feedback. Think, act, observe, repeat. This pattern enables agents to operate in uncertain environments where the outcome of each action provides information that influences subsequent decisions.
```python
def react_loop(self, task: str) -> Result:
    while not self.is_complete(task):
        thought = self.think(task, self.observations)
        action = self.select_action(thought)
        result = self.execute(action)
        observation = self.observe(result)
        self.observations.append(observation)
    return self.synthesize_result()
```

The critical implementation detail for ReAct loops is termination. Without explicit guardrails, an agent can loop indefinitely — burning tokens, accumulating costs, and making no progress. Production implementations must include a maximum step counter, a cost budget, a progress detector (terminate if state hasn't changed in K iterations), and a deadline. These constraints aren't limitations; they're safety requirements.
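All four guardrails can wrap the loop directly. In this sketch, `step` is a stand-in for one think/act/observe iteration returning the new state, the cost incurred, and a completion flag; the budget and threshold values are illustrative defaults.

```python
import time

def guarded_react_loop(step, state, max_steps=20, cost_budget=5.0,
                       stall_limit=3, deadline_s=60.0):
    """ReAct loop with the four termination guardrails: step counter,
    cost budget, progress detector, and wall-clock deadline."""
    start = time.monotonic()
    spent, stalled = 0.0, 0
    for _ in range(max_steps):
        if spent >= cost_budget:
            return state, "budget_exhausted"
        if time.monotonic() - start > deadline_s:
            return state, "deadline"
        new_state, cost, done = step(state)
        spent += cost
        # Progress detector: count consecutive iterations with no state change.
        stalled = stalled + 1 if new_state == state else 0
        state = new_state
        if done:
            return state, "complete"
        if stalled >= stall_limit:
            return state, "no_progress"
    return state, "max_steps"
```

Returning a termination reason alongside the final state matters operationally: "complete" and "budget_exhausted" should be handled very differently downstream.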
Pattern 7: Human-in-the-Loop. Critical decisions require human oversight. This pattern defines escalation points and approval workflows based on action risk, agent confidence, and compliance requirements.
```yaml
human_approval_gate:
  trigger: confidence < 0.85 OR action_type in [delete, modify_production]
  approval_workflow:
    timeout: 3600
    fallback: reject
  notification:
    channels: [slack, email]
    escalation_after: 1800
```

The key insight is that not all actions require the same level of oversight. A risk matrix categorizes actions by potential impact, applying different approval policies to each tier. High-risk actions (deleting production resources, modifying access controls) always require human judgment. Low-risk actions (reading data, generating reports) proceed autonomously with retrospective oversight. Medium-risk actions use confidence thresholds to decide. This graduated approach avoids the paradox of poorly-designed HITL systems: operators drowning in approval requests, leading to rubber-stamping or decision fatigue that defeats the oversight purpose.
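The graduated risk matrix reduces to a small routing function. The action names and the 0.85 threshold here are illustrative, mirroring the config above.

```python
HIGH_RISK = {"delete_production", "modify_access_controls"}
LOW_RISK = {"read_data", "generate_report"}

def approval_policy(action: str, confidence: float, threshold: float = 0.85) -> str:
    """Graduated oversight: high-risk always escalates, low-risk runs
    autonomously, medium-risk is gated on agent confidence."""
    if action in HIGH_RISK:
        return "require_approval"
    if action in LOW_RISK:
        return "autonomous"
    return "autonomous" if confidence >= threshold else "require_approval"
```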
Adaptive and Learning Patterns
Pattern 8: Feedback Learning. Agents improve through explicit feedback on their outputs. Every human correction is training data. The pattern structures the feedback loop, storing corrected outputs as training examples for periodic fine-tuning or few-shot prompt augmentation.
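A minimal sketch of that feedback loop, under the assumption that corrections are stored as (input, corrected output) pairs and replayed as few-shot examples; the formatting is illustrative.

```python
class FeedbackStore:
    """Every human correction becomes a training pair, usable immediately
    as few-shot prompt examples or batched later for fine-tuning."""
    def __init__(self, max_examples: int = 5):
        self.max_examples = max_examples
        self.corrections = []    # list of (prompt, corrected_output)

    def record(self, prompt: str, corrected: str):
        self.corrections.append((prompt, corrected))

    def few_shot_block(self) -> str:
        # Most recent corrections are most likely to reflect current behavior.
        recent = self.corrections[-self.max_examples:]
        return "\n".join(f"Input: {p}\nOutput: {c}" for p, c in recent)
```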
Pattern 9: Context Learning. Agents adapt their behavior based on accumulated context without explicit feedback. Patterns emerge from observation. An agent that repeatedly encounters a certain class of problem develops implicit heuristics for handling that class, even without being explicitly taught.
Pattern 10: Tool Creation. Advanced agents create new tools when existing ones prove insufficient. This meta-capability enables open-ended improvement. When an agent encounters a task that no existing tool can handle, it designs a tool specification, generates an implementation, validates the implementation in a sandbox, and registers it in the tool registry for future use.
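The specify, generate, validate, register flow can be sketched as follows. Here `generate` stands in for an LLM code-generation call and `sandbox_run` for an isolated executor; both are assumptions, and a real sandbox would enforce resource and capability limits.

```python
class ToolFactory:
    """Tool creation flow: generate an implementation from a spec, validate
    it in a sandbox against example cases, register it only if it passes."""
    def __init__(self, generate, sandbox_run):
        self.generate = generate        # callable(spec) -> implementation
        self.sandbox_run = sandbox_run  # callable(impl, input) -> output
        self.registry = {}

    def create_tool(self, spec: dict) -> bool:
        impl = self.generate(spec)
        for case in spec["examples"]:
            if self.sandbox_run(impl, case["input"]) != case["expected"]:
                return False            # reject tools that fail validation
        self.registry[spec["name"]] = impl
        return True
```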
Coordination Patterns
Pattern 11: Hierarchical Coordination. Multi-level agent structures with clear authority and delegation. Strategic-level agents set goals and allocate resources. Tactical-level agents coordinate task routing and resolve conflicts. Operational-level agents execute actions and validate results. This maps directly to how effective human organizations operate.
```yaml
hierarchy:
  strategic_level:
    agents: [orchestrator, planner]
    authority: goal_setting, resource_allocation
  tactical_level:
    agents: [coordinator, resolver]
    authority: task_routing, conflict_resolution
  operational_level:
    agents: [executor, validator]
    authority: action_execution, result_validation
```

Pattern 12: Peer Collaboration. Agents at the same level coordinate through shared state and messaging rather than hierarchical control. Each agent broadcasts its intended action, gathers responses from peers, checks for conflicts or collaboration opportunities, and adjusts its plan based on peer input.
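A minimal peer-collaboration sketch, using a shared in-memory dictionary as a stand-in for whatever message bus or shared state store a real deployment uses; the deferral rule is illustrative.

```python
class PeerAgent:
    """Peer coordination without a hierarchy: each agent publishes its
    intended resource claim, checks peers' claims, and backs off on conflict."""
    def __init__(self, name: str, bus: dict):
        self.name = name
        self.bus = bus    # shared state: agent name -> claimed resource

    def propose(self, resource: str) -> str:
        conflicts = [n for n, r in self.bus.items()
                     if r == resource and n != self.name]
        if conflicts:
            resource = f"{resource}:deferred"   # adjust plan based on peers
        self.bus[self.name] = resource
        return resource
```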
Enterprise Implementation Framework
These twelve patterns compose into a three-tier enterprise framework. The first tier is LLM Agents — task-specific automation with bounded scope. Single-purpose agents for well-defined tasks. The second tier is Agentic AI — adaptive goal-seekers with planning capabilities. Systems that navigate toward objectives in uncertain environments. The third tier is Agentic Communities — organizational frameworks with formal roles, protocols, and governance. Multiple agents coordinated toward complex objectives.
This progression maps directly to organizational maturity. Start with LLM agents for specific tasks, evolve toward agentic AI as requirements grow, and build agentic communities for enterprise-scale operations. Attempting to jump directly to tier three without establishing the foundational patterns is how teams produce systems that are impressive in demos and unreliable in production.
Implementation Considerations
Start with the subsystems. Before selecting patterns, understand which subsystems your application actually needs. Not every system requires all five — simple automation may need only Action Execution and basic Reasoning. Over-engineering the initial architecture creates maintenance burden without delivering proportional value.
Compose patterns deliberately. Each pattern solves a specific problem. Combine them based on your requirements, not because they exist. A system that uses all twelve patterns simultaneously is not sophisticated; it is complex without justification.
Plan for observability from the start. Agentic systems are inherently complex. Structured logging, trace propagation, and decision logging are not optional — they are essential for debugging and improvement. Every state transition, tool invocation, and decision point should emit a structured event. When something goes wrong (and it will), you need the ability to reconstruct exactly what happened and why.
The patterns presented here are not prescriptive. They are starting points. Real systems evolve their own variants. Understanding these foundational patterns provides the vocabulary to reason about that evolution and make decisions that compound rather than conflict.