Mactores Blog

Controlling Non-Determinism in Agentic AI Systems

Written by Bal Heroor | Mar 16, 2026 8:00:00 AM

Your AI agent works perfectly in a demo.

Then it reaches production — and suddenly it takes 47 steps to do a 5-step job.

Nothing is broken.

The model is strong.

The prompts are fine.

Yet no one can answer the most important question:

Why did the agent take this execution path?

This is the core failure mode of modern agentic AI systems. Not a lack of intelligence — a lack of structural control. In a previous post, Centralized vs. Distributed Intelligence: Designing Multi-Agent AI Systems That Scale, we examined how architectural decisions shape agent behavior at scale. This article builds directly on that foundation, focusing on execution control, determinism, and reliability.

Most agents today are built as free-form execution loops: think, act, observe, repeat. These loops are powerful, but they introduce unbounded non-determinism at the system level. The same input can lead to different plans, different tool calls, different step counts — and no reliable way to reproduce or audit the behavior.

For experimentation, this is acceptable.

For production systems — especially enterprise and regulated environments — it is not.

This article argues that non-determinism in agentic AI is fundamentally an architecture problem, not a prompting problem. Sampling controls reduce variance, but they do not constrain execution.

The solution is workflow-based agent architectures: deterministic control flow on the outside, probabilistic intelligence on the inside. LLMs operate at well-defined nodes; execution between them is explicit, bounded, and replayable.

In the sections that follow, we’ll examine where non-determinism comes from, why free-form agents amplify it, and how workflow-based patterns like DAG pipelines, state machines, and plan-and-execute systems enable production-grade agentic AI.

Before we dive in, if you'd prefer to watch rather than read, we've put together a video walkthrough — you can check it out here.

 

Determinism vs Non-Determinism: A Formal and Practical View

Before controlling non-determinism, we must be precise about which determinism matters. In agentic AI, multiple forms are often conflated, leading to ineffective fixes. 

 

Determinism as a System Property (Not a Model Property)

In classical software engineering, determinism is a property of the program, not the hardware or compiler.

Formally, a function f is deterministic if:

∀x : f(x) = y (always the same y)

Given the same input, the output — and the execution path — are invariant.

In practice, this implies:

  • Fixed control flow
  • Explicit state transitions
  • Bounded execution
  • Reproducible side effects

Example (deterministic):

 

def approve(amount):
    if amount < 10_000:
        return "AUTO_APPROVED"
    else:
        return "MANUAL_REVIEW"

Same input → same branch → same output → same trace.

 

What Determinism Looks Like in Traditional Systems

Production software enforces determinism using structure, not intelligence:

  • Control-flow graphs (CFGs)
  • Finite state machines
  • Workflows/pipelines
  • Transactions and idempotency keys

Example: a simple workflow as a control-flow graph

[Validate] → [Enrich] → [Score] → [Decide]

 

Why LLM-Based Systems Break This Model

Large language models are probabilistic programs. Even a single inference call is not strictly deterministic:

P(token | context) ≠ 1.0

Typical decoding introduces randomness:

response = model.generate(
    prompt,
    temperature=0.7,
    top_p=0.9,
)

Even with:

temperature = 0.0

you still face:

  • Floating-point variance
  • Non-deterministic kernels
  • Model version drift
  • Infrastructure-level randomness

But this is not the real problem.

 

Local Non-Determinism vs System Non-Determinism

A single probabilistic function is manageable.

An agent is not a function — it is a program that writes its own next instruction.

Typical free-form agent loop:

while not done:
    thought = llm.think(state)
    action = llm.decide(thought)
    observation = execute(action)
    state = update(state, observation)

What becomes non-deterministic:

  • Number of loop iterations
  • Branching decisions
  • Tool invocation order
  • Termination conditions

This creates execution path entropy, not just output variance.

Two runs, same input:

Run A: 8 steps → tool_x → tool_y → done

Run B: 21 steps → tool_y → reflect → tool_x → retry → done

Both are “correct.”
Neither is predictable.

 

Determinism Spectrum in Agentic Systems

Determinism is not binary — it exists on a spectrum:

  • Fully deterministic
    • Traditional workflows
    • State machines

  • Bounded non-determinism
    • Workflow-based agents
    • Plan-and-execute systems

  • Unbounded non-determinism
    • Free-form ReAct loops
    • Self-reflecting agents with retries

Why Agentic AI Magnifies Non-Determinism

Non-determinism exists in all probabilistic systems. What makes agentic AI dangerous in production is not randomness itself, but the fact that agents are long-lived, stateful, and self-directing. Small stochastic decisions compound over time into large, irreversible execution variance.

A single LLM call behaves like a mostly pure function:

  • Stateless
  • No memory or side effects
  • Variance is local and bounded

An agent, by contrast, is a program:

  • Persistent state accumulates across steps
  • Temporal coupling means early randomness shapes all future behavior
  • Side effects (tool calls) mutate external systems and may be irreversible

This turns variance from a local concern into a system-wide property.

 

Free-Form Agent Loops Are Unbounded Programs

Most agent frameworks use a ReAct-style loop, but the issue is what the loop does not specify. There is no fixed upper bound on:

  • Steps
  • Branches
  • Tool calls
  • Retries or reflection cycles

In classical software, unbounded loops are bugs. In agentic systems, they are often the default.
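Restoring that bound is straightforward. The sketch below is a minimal illustration, assuming a hypothetical `llm` object with a `decide` method; the names, the stub, and the budget value are all illustrative, not a specific framework's API:

```python
MAX_STEPS = 10  # explicit upper bound on the agent loop; value is illustrative

def run_agent(task, llm, execute):
    """Bounded variant of `while not done`: the loop cannot run forever."""
    state = {"task": task}
    for _ in range(MAX_STEPS):
        action = llm.decide(state)          # probabilistic: the LLM picks an action
        if action == "finish":
            return state                    # normal termination
        state = execute(action, state)      # deterministic: the runner applies it
    raise RuntimeError(f"step budget of {MAX_STEPS} exhausted")  # failure is explicit

# Stub LLM that finishes after two tool calls, to show the control flow.
class StubLLM:
    def __init__(self):
        self.calls = 0
    def decide(self, state):
        self.calls += 1
        return "finish" if self.calls > 2 else "tool_x"

result = run_agent("demo", StubLLM(), lambda a, s: {**s, a: True})
# result == {"task": "demo", "tool_x": True}
```

The point is not the budget itself but that exhaustion becomes an explicit, observable failure instead of a silent 47-step run.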

 

Path Explosion Over Time

Each step introduces a branching decision, so execution paths grow exponentially over multiple steps: with just three possible actions per step, ten steps yields 3^10 ≈ 59,000 distinct paths. The result is:

  • Compounding randomness from early decisions
  • Feedback amplification as observations shape future reasoning
  • Emergent behavior across retries, where fixing one failure exposes another

Two runs with identical inputs can diverge completely — not just in output, but in how the system behaves.

 

A Taxonomy of Non-Determinism in Agentic Systems

Non-determinism in agentic AI is not a single failure mode. It emerges from multiple layers of the system, each contributing its own form of variability. These layers interact and compound, which is why non-determinism becomes difficult to control once agents move beyond simple, single-step tasks.

Below is a concise taxonomy of the five primary sources of non-determinism in agentic systems.

 

Model-Level Non-Determinism

Originates inside the model and inference stack: sampling randomness, non-deterministic GPU kernels, and model version drift.

Control-Flow Non-Determinism

Occurs when execution structure is decided at runtime: variable loop counts, emergent branching, and unpredictable tool-call ordering.

 

Data & Context Non-Determinism

Agents are highly sensitive to unstable context: retrieval ordering, context-window truncation, and prompt assembly can all vary between runs.

 

Environment & Tool Non-Determinism

Introduced when agents interact with external systems: flaky APIs, rate limits, timeouts, and data that changes between calls.

 

Multi-Agent Non-Determinism

Emerges from agent-to-agent interaction: message ordering, timing races, and divergent views of shared state.

 

Workflow-Based Agent Architectures

Workflow-based agent architectures are designed to solve the core problem outlined so far: unbounded, opaque execution in agentic systems. Rather than allowing agents to dynamically decide their next action at every step, workflows move execution control into an explicit, deterministic structure.

 

What Is a Workflow-Based Agent?

A workflow-based agent is an agent whose behavior is governed by a predefined execution graph, rather than a free-form loop.

Key characteristics:

  • Explicit execution graph
    • The set of possible steps is known ahead of time
    • Dependencies between steps are defined explicitly

  • Finite, enumerable paths
    • All valid execution paths can be reasoned about
    • No surprise loops or infinite reasoning chains

  • Constrained transitions
    • Movement between steps is governed by rules, not improvisation
    • Branches are intentional, not emergent

In this model, the agent does not decide what to do next — it decides how to perform the current step.

 

Formal Properties

Workflow-based agents exhibit system-level properties that free-form agents cannot guarantee.

Core properties

  • Bounded execution
    • Upper limits on steps, retries, and tool usage
    • Predictable worst-case behavior

  • Partial order guarantees
    • Steps execute only after dependencies are satisfied
    • Independent steps can run in parallel without race conditions

  • Deterministic replay
    • Given the same inputs and state, execution can be replayed
    • Critical for debugging and incident analysis

  • Observable state transitions
    • Each step transition is logged and inspectable
    • Enables fine-grained monitoring and alerting

These properties shift agent behavior from emergent to engineered.

 

Why Workflows Scale to Production

The value of workflow-based agents becomes obvious at production scale.

Operational advantages

  • Debuggability
    • Failures map to specific steps
    • Root causes are traceable

  • Compliance
    • Execution paths are auditable
    • Decision points are explainable

  • Cost predictability
    • Execution bounds enable accurate cost modeling
    • No runaway loops or surprise retries

  • SLA alignment
    • Latency and throughput can be reasoned about upfront
    • Parallelism is intentional, not accidental

 

Workflow-based architectures do not eliminate intelligence — they discipline it. By constraining how agents act, they make it possible to trust what agents do.

In enterprise environments, the need for workflow-based agent architectures is not theoretical — it is operational. At Mactores, we consistently see agentic systems fail not at the reasoning layer, but at the execution layer: unclear control flow, irreproducible behavior, and the inability to explain why a system behaved a certain way.

Across regulated industries and large-scale internal platforms, free-form agent loops rarely survive first contact with requirements like auditability, SLA enforcement, or incident forensics. Workflow-based architectures — DAGs, state machines, and plan-and-execute systems — provide the structural guarantees these environments demand without eliminating the benefits of LLM-driven intelligence.

 

Pattern 1: DAG / Pipeline Architectures

DAG-based architectures are often the cleanest and most approachable way to introduce structure into agentic systems. They replace free-form loops with a mathematically well-understood execution model: the directed acyclic graph.

At scale, this pattern behaves less like an “agent” and more like a distributed workflow engine with intelligent nodes.

 

Formal Model

A DAG-based agent is defined as a directed acyclic graph:

G = (V, E)

Where:

  • V represents nodes (LLM calls, tools, functions, conditionals)
  • E represents directed dependencies between nodes

Key properties of the model:

  • Acyclic by construction
    • No cycles, no infinite loops
    • Execution always terminates

  • Topological ordering
    • There exists at least one valid execution order
    • Nodes execute only after all dependencies complete

  • Explicit dependency resolution
    • Data and control dependencies are declared, not inferred

This formalism alone eliminates an entire class of agent failures.

 

Execution Semantics

Execution in a DAG-based agent is governed by deterministic rules.

Core semantics

  • Parallel execution
    • Nodes with no unmet dependencies can execute concurrently
    • Latency is reduced without increasing complexity

  • Deterministic scheduling
    • Execution order is derived from graph structure
    • Not influenced by model outputs

  • Failure propagation
    • Failures propagate along outgoing edges
    • Downstream nodes can be skipped, retried, or short-circuited

This is fundamentally different from agent loops, where control flow is decided dynamically at runtime.

 

Code Example: DAG Execution Engine (Pseudo-Code)

At its core, a DAG agent executes like a workflow engine:

for node in topological_sort(graph):
    if dependencies_satisfied(node):
        execute(node)

Important implications of this structure:

  • The execution order is computed before execution begins
  • Nodes cannot introduce new execution paths
  • Control flow is data-driven, not reasoning-driven

LLMs influence what happens inside a node, not which node runs next.
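A runnable sketch of such an engine, using Python's standard-library `graphlib` for the topological sort. The node names mirror the earlier [Validate] → [Enrich] → [Score] → [Decide] pipeline, and the node bodies are stand-ins for real tool or LLM calls:

```python
from graphlib import TopologicalSorter

# Stand-in node functions; a real system would call tools or LLMs here.
def validate(state): return {**state, "valid": True}
def enrich(state):   return {**state, "enriched": True}
def score(state):    return {**state, "score": 0.92}
def decide(state):   return {**state, "decision": "APPROVE" if state["score"] > 0.5 else "REVIEW"}

NODES = {"validate": validate, "enrich": enrich, "score": score, "decide": decide}

# Dependencies are declared, not inferred: node -> set of prerequisites.
GRAPH = {"validate": set(), "enrich": {"validate"}, "score": {"enrich"}, "decide": {"score"}}

def run_dag(graph, nodes, state):
    trace = []
    # The execution order is computed before any node runs.
    for name in TopologicalSorter(graph).static_order():
        state = nodes[name](state)   # intelligence acts *inside* a node, never between them
        trace.append(name)
    return state, trace

final, trace = run_dag(GRAPH, NODES, {})
# trace == ["validate", "enrich", "score", "decide"]
```

Because the order comes from `static_order()`, no model output can introduce a new execution path at runtime.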

 

LLMs as Pure Functions Inside Nodes

To preserve determinism, LLMs inside DAG nodes must behave like pure, bounded functions.

Node design principles

  • Input schema
    • Explicit, validated inputs
    • No hidden context leakage

  • Output schema
    • Structured outputs
    • Machine-parseable results

  • Validation gates
    • Reject malformed or incomplete outputs
    • Prevent downstream contamination

This containment is critical. If an LLM is allowed to emit free-form instructions, it effectively breaks out of the DAG and reintroduces control-flow non-determinism.
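One way to sketch that containment boundary: `call_model` below is a stub standing in for a real LLM client, and the schema is illustrative, but the gate pattern itself is the point.

```python
import json

REQUIRED_KEYS = {"sentiment", "confidence"}

def call_model(prompt: str) -> str:
    # Stub: a real node would invoke an LLM here and receive raw text back.
    return json.dumps({"sentiment": "positive", "confidence": 0.87})

def sentiment_node(text: str) -> dict:
    """LLM-as-pure-function: explicit input in, validated structure out."""
    raw = call_model(f"Classify the sentiment of: {text!r}")
    try:
        out = json.loads(raw)                    # output must be machine-parseable
    except json.JSONDecodeError as e:
        raise ValueError("node emitted non-JSON output") from e
    if not REQUIRED_KEYS <= out.keys():          # validation gate: schema check
        raise ValueError(f"missing keys: {REQUIRED_KEYS - out.keys()}")
    if not 0.0 <= out["confidence"] <= 1.0:      # validation gate: range check
        raise ValueError("confidence out of range")
    return out
```

Malformed output fails loudly at the node boundary instead of contaminating downstream nodes.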

 

Failure Modes and Mitigations

DAG architectures dramatically reduce chaos, but they are not failure-proof.

Common failure modes

  • Partial failures
    • One node fails while others succeed
    • Leads to inconsistent intermediate state

  • Retry storms
    • Aggressive retries on shared dependencies
    • Cascading pressure on upstream systems

  • Fan-out amplification
    • A single upstream failure affects many downstream nodes
    • Cost and latency spikes

Standard mitigations apply at the workflow layer: per-node retry budgets with backoff, circuit breakers on shared dependencies, and idempotent node design so retries and partial re-runs are safe.

 

Pattern 2: State Machine Architectures

If DAG pipelines bring structure to what runs and when, state machine architectures bring structure to what the system is allowed to be. This pattern is the most restrictive of the workflow-based approaches — and for many production environments, that is precisely its strength.

State machines replace flexible execution graphs with explicit system states and sanctioned transitions. Nothing happens unless it is allowed.

 

Finite State Machines for Agents

In a state machine–driven agent, the system is always in exactly one well-defined state.

States as invariants

  • Each state represents a stable, well-understood condition of the system
  • Invariants define what must be true while the agent is in that state
  • Examples:
    • “Input validated”
    • “Awaiting approval”
    • “Execution authorized”
    • “Completed”

Transitions as contracts

  • Transitions encode allowed movement between states
  • Each transition has explicit preconditions and outcomes
  • No implicit jumps, skips, or loops

This forces agent behavior to conform to a predefined lifecycle, rather than inventing one at runtime.

 

Formal Definition

Formally, a finite state machine can be defined as:

S = {s1, s2, ..., sn}

T = {(s1 → s2), (s2 → s3), ...}

Where:

  • S is the finite set of valid states
  • T is the finite set of allowed transitions

Anything not represented in T is impossible by design.

This sharply limits execution entropy.

 

Execution Guarantees

State machine–based agents provide strong, system-level guarantees that free-form agents cannot.

Core guarantees

  • Single active state
    • The agent cannot be in two states at once
    • Eliminates ambiguity and race conditions

  • Fully traceable transitions
    • Every state change is explicit and logged
    • Transition history forms a complete execution trace

  • Deterministic replay
    • Given the same inputs and events, state transitions replay identically
    • Enables forensic-level debugging

The agent’s “reasoning” happens inside a state — never across states.

 

Code Example: State Transition Logic

At runtime, state transitions are enforced mechanically, not inferred.

if state == REVIEW and approved:
    state = EXECUTE

Important characteristics of this pattern:

  • Transitions are explicit
  • Conditions are inspectable
  • Illegal transitions are rejected
  • The agent cannot “reason its way” into a new state

This makes state machines hostile to creativity — and extremely friendly to reliability.
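A minimal sketch of that mechanical enforcement; the states and transition set are illustrative:

```python
# Anything not listed here is impossible by design.
ALLOWED_TRANSITIONS = {
    ("VALIDATED", "REVIEW"),
    ("REVIEW", "EXECUTE"),
    ("REVIEW", "REJECTED"),
    ("EXECUTE", "COMPLETED"),
}

class AgentStateMachine:
    def __init__(self, initial="VALIDATED"):
        self.state = initial
        self.history = [initial]   # transition history doubles as the execution trace

    def transition(self, target):
        if (self.state, target) not in ALLOWED_TRANSITIONS:
            raise ValueError(f"illegal transition: {self.state} -> {target}")
        self.state = target
        self.history.append(target)

sm = AgentStateMachine()
sm.transition("REVIEW")
sm.transition("EXECUTE")
# sm.transition("REVIEW") would raise: EXECUTE -> REVIEW is not sanctioned
```

An agent may do arbitrary reasoning inside a state, but only the controller calls `transition`, so the lifecycle cannot be improvised.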

 

Pattern 3: Plan-and-Execute Architectures

Plan-and-execute architectures address a core weakness of free-form agent loops: local decision-making without global awareness. Instead of deciding the next action step-by-step, the agent first constructs an explicit plan, then executes it under deterministic control. Intelligence is front-loaded; execution is disciplined.

This pattern is especially effective for long-horizon, multi-step tasks where coherence, ordering, and predictability matter more than improvisation.

 

Explicit Planning Phase

The defining feature of this architecture is a dedicated planning phase that runs before any action is taken.

What happens during planning

  • Global task decomposition
    • The task is broken into a complete sequence of steps
    • The full execution path is visible upfront

  • Dependency resolution
    • Ordering constraints between steps are identified
    • Parallelizable vs sequential steps are made explicit

  • Risk identification
    • External dependencies are surfaced early
    • High-risk steps can be isolated or guarded

Execution Phase

Once a plan exists, execution becomes a controlled process.

Execution characteristics

  • Step-by-step execution
    • Each step is executed in the order defined by the plan
    • No dynamic reordering at runtime

  • Deterministic ordering
    • The plan defines the execution sequence
    • Model outputs do not influence control flow

During execution, the agent is no longer deciding what to do next. It is simply carrying out a predefined contract.

This sharply reduces execution entropy while preserving intelligent reasoning where it matters most.

Replanning as a Controlled Exception

Real-world systems encounter surprises. Plan-and-execute architectures handle this through explicit replanning, not ad-hoc adaptation.

Replanning mechanics

  • Trigger conditions
    • A step fails
    • Preconditions are violated
    • New critical information appears

  • Controlled response
    • Execution pauses
    • The planner is invoked with updated context

  • State handling

    • Either:
      • Full plan regeneration (state reset)
      • Partial reuse of completed steps

Crucially, replanning is an exceptional path, not a continuous loop.

Pseudo-Code Example

At a high level, the control flow is straightforward:

plan = planner(task)

for step in plan:
    execute(step)
    if failure:
        plan = replan(context)

Key properties of this structure:

  • Planning and execution are explicitly separated
  • Control flow is predictable
  • The agent cannot invent new steps mid-execution
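The pseudo-code above can be made concrete. In this sketch the planner and step executor are stubs (a real planner would be a single LLM call returning a full step list), and replanning is capped by an explicit budget:

```python
MAX_REPLANS = 2   # replanning is a bounded exception, never an open loop

def planner(task):
    # Stub: a real planner would decompose the task via one LLM call.
    return ["fetch_records", "transform_records", "store_results"]

def execute_step(step):
    # Stub executor; a real one would call tools and may raise on failure.
    return f"{step}:ok"

def plan_and_execute(task):
    plan, replans, results, i = planner(task), 0, [], 0
    while i < len(plan):
        try:
            results.append(execute_step(plan[i]))    # steps run in planned order
            i += 1
        except RuntimeError:
            if replans >= MAX_REPLANS:
                raise                                # fail loudly, do not improvise
            replans += 1
            plan, results, i = planner(task), [], 0  # full plan regeneration
    return results

print(plan_and_execute("nightly-sync"))
# ['fetch_records:ok', 'transform_records:ok', 'store_results:ok']
```

The maximum work is known upfront: at most (MAX_REPLANS + 1) passes over a plan of fixed length.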

 

Why This Beats ReAct for Complex Tasks

ReAct-style agents decide one step at a time. This works well for short, exploratory tasks — but breaks down as complexity increases.

Advantages of plan-and-execute

  • Global coherence
    • The agent maintains a consistent end-to-end strategy

  • Bounded reasoning
    • Planning cost is paid once, not at every step

  • Predictable execution depth
    • The maximum number of steps is known upfront

Plan-and-execute is typically the stronger choice for tasks like:

  • End-to-end code migrations
  • Multi-system workflows
  • Large-scale refactoring
  • Enterprise process automation
 

Hybrid and Composable Architectures

Real-world agentic systems rarely fit cleanly into a single architectural pattern. Production requirements often demand both flexibility and control, which is where hybrid and composable architectures become essential. The key idea is simple: compose intelligence inside structure, never the other way around.

 

DAGs Containing ReAct Agents

One common pattern is to use a DAG as the outer control structure, while allowing individual nodes to host free-form agents internally.

def dag_node(input):
    return react_agent.run(input)

In this design:

  • The DAG defines when and whether a node runs
  • The ReAct agent defines how the task is solved locally
  • Any non-determinism is contained within the node boundary

 

State Machines with Agent-Based States

Another powerful composition is embedding agents inside state machine states.

if state == ANALYZE:
    result = agent.analyze(context)
    state = NEXT_STATE

Here:

  • The state machine enforces lifecycle and transitions
  • Each state may run an agent to perform complex reasoning
  • The agent cannot change states — only the controller can

This pattern is especially effective when regulatory or audit constraints require explicit state progression but still benefit from intelligent decision-making within each phase.

 

Nested Workflows

Complex systems often require workflows inside workflows.

Examples:

  • A DAG where one node triggers another DAG
  • A plan-and-execute step that itself runs a state machine
  • A state machine transition guarded by a planning subroutine

result = sub_workflow.execute(input)

Nested workflows allow teams to scale complexity hierarchically without flattening everything into a single, unmanageable graph.

Hybrid architectures let you selectively loosen constraints where it’s safe, while preserving deterministic guarantees where it matters most. In the next section, we’ll look at how these structured systems can be tested, replayed, and debugged in production.

 

Testing, Replay, and Debugging Deterministic Agents

 

Deterministic Replay

Deterministic replay is the cornerstone of reliable agent operations.

What must be captured

  • Input capture
    • User inputs, external events, tool responses

  • State snapshots
    • State at each transition or workflow step

  • Execution logs
    • Node execution order
    • Transition decisions
    • Failure reasons

With these artifacts, teams can replay an agent run step-by-step, answering not just what happened, but why it happened.
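A sketch of the capture side, hashing each state snapshot so two runs can be compared cheaply; the event shape and step names are illustrative:

```python
import hashlib
import json

class RunRecorder:
    """Captures inputs, step order, and state snapshots for deterministic replay."""
    def __init__(self):
        self.events = []

    def record(self, step: str, inputs: dict, state: dict):
        self.events.append({
            "step": step,
            "inputs": inputs,
            # Canonical JSON (sorted keys) keeps the hash stable across runs.
            "state_hash": hashlib.sha256(
                json.dumps(state, sort_keys=True).encode()
            ).hexdigest(),
        })

    def matches(self, other: "RunRecorder") -> bool:
        # Replay-equivalent: same steps, same inputs, same state at each step.
        return self.events == other.events

a, b = RunRecorder(), RunRecorder()
for r in (a, b):
    r.record("validate", {"amount": 5000}, {"valid": True})
    r.record("decide", {}, {"valid": True, "decision": "AUTO_APPROVED"})
# a.matches(b) is True; any divergence in step order or state breaks it
```

Comparing hashes at each step pinpoints the exact transition where two runs diverge.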

 

Behavioral Testing

Traditional unit tests are insufficient for agentic systems. Instead, teams adopt behavioral testing at the workflow level.

Effective strategies

  • Scenario testing
    • Known workflows with fixed expectations

  • Monte Carlo testing within bounded graphs
    • Run the same scenario multiple times
    • Observe variance within allowed paths

  • Golden-path verification
    • Define canonical execution traces
    • Alert when deviations occur

This approach acknowledges probabilistic behavior while enforcing structural guarantees.
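Golden-path verification can be as simple as diffing observed traces against a canonical one; the trace values here are illustrative:

```python
GOLDEN_TRACE = ["validate", "enrich", "score", "decide"]

def golden_path_deviations(observed):
    """Return (index, expected, got) tuples wherever a run leaves the golden path."""
    deviations = [
        (i, exp, got)
        for i, (exp, got) in enumerate(zip(GOLDEN_TRACE, observed))
        if exp != got
    ]
    if len(observed) != len(GOLDEN_TRACE):
        # Extra or missing steps are deviations too, not just substitutions.
        deviations.append(("length", len(GOLDEN_TRACE), len(observed)))
    return deviations

assert golden_path_deviations(["validate", "enrich", "score", "decide"]) == []
# A run that swaps in an unplanned reflection step is flagged, not silently accepted:
bad = golden_path_deviations(["validate", "reflect", "score", "decide"])
# bad == [(1, "enrich", "reflect")]
```

In production, the same check runs as an alerting rule over emitted traces rather than an assertion.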

 

Production Observability

Observability must be designed into the agent architecture, not added afterward.

Critical signals

  • Step-level metrics
    • Latency per node or state
    • Retry counts

  • Execution timelines
    • Visual traces of workflow progression

  • Failure heatmaps
    • Identify high-risk steps across runs

When agents are observable at the step level, failures become diagnosable events — not mysteries.

 

Failure Modes You Still Need to Design For

Workflow-based architectures dramatically reduce non-determinism, but they do not eliminate all risk. Some failure modes persist — and ignoring them leads to fragile systems.

 

Common pitfalls

  • Silent degradation: Models may comply structurally while producing lower-quality reasoning, leading to “successful” but incorrect executions.
  • Partial determinism illusions: A deterministic outer workflow can mask uncontrolled behavior inside nodes, creating false confidence.
  • Over-constraining intelligence: Excessive rigidity can prevent agents from handling edge cases, forcing constant human intervention.
  • Workflow sprawl: As systems grow, unmanaged workflows become hard to reason about and harder to evolve.

 

Applying These Patterns in Production at Mactores

In production systems, agentic AI rarely fails because a model “wasn’t smart enough.” It fails because the system surrounding the model could not bound, observe, or explain its behavior.

At Mactores, we encounter this pattern repeatedly when organizations attempt to operationalize agents across enterprise workflows. Early prototypes often rely on free-form agent loops, which work well in isolation but collapse under real-world constraints: compliance requirements, cost controls, latency budgets, and multi-team ownership.

Workflow-based agent architectures emerge as the practical solution. DAG pipelines provide predictable parallelism. State machines enforce explicit lifecycle guarantees. Plan-and-execute systems enable long-horizon reasoning without unbounded execution. In practice, these patterns are rarely used in isolation — they are composed, nested, and selectively relaxed where risk permits.

What matters most is not the specific pattern, but the principle: structure must exist outside the model. When control flow is explicit, and intelligence is localized, agentic systems become debuggable systems — not probabilistic experiments.

This shift is what enables agentic AI to function as infrastructure, not novelty.

 

Conclusion

Agentic AI systems fail in production not because they lack intelligence, but because they lack structure. Non-determinism is inevitable in probabilistic models — unbounded execution is not. By separating intelligence from control and adopting workflow-based architectures, teams can build agents that are observable, replayable, and production-safe without sacrificing capability.

DAGs, state machines, and plan-and-execute patterns all enforce the same principle: deterministic control flow with localized intelligence. This is how agentic systems evolve from experiments into infrastructure.

The open question is no longer whether agents can reason, but how much freedom should an intelligent system have to decide its own execution path in production? In the next article, we’ll extend this discussion by examining AI Agent Safety: The Missing Layer in Most Enterprise Deployments, and why structural control is a prerequisite, but not a substitute, for building agents that are truly safe, governable, and enterprise-ready.