This content originally appeared on DEV Community and was authored by yuer
RAG and Agent frameworks promise a lot:
“retrieval-augmented reasoning”, “tool execution”, “autonomous planning”.
But if you’ve actually tried deploying them into finance, legal, compliance, operations, or automation, you’ve probably noticed the same thing I did:
They’re structurally unstable.
Same input → different output.
Same data → different execution path.
This is not a hallucination issue.
It’s an architecture issue.
Let’s break it down.
🧩 1. Retrieval is inherently non-deterministic
ANN indexes (HNSW, IVF, ScaNN) are approximate by design.
In practice:
- index rebuilds change the top-k
- embedding drift changes neighbors
- adding documents shifts the similarity space
- internal randomness changes ranking
If the retrieval set changes,
the entire RAG chain changes.
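One way to take this off the table is to make retrieval exact and tie-break-stable over a versioned document snapshot. A minimal sketch of the idea (illustrative only, not how any particular vector store works):

```python
def exact_top_k(query_vec, docs, k=3):
    """Exact cosine similarity with a deterministic tie-break.

    docs: list of (doc_id, vector) pairs from a *versioned* snapshot,
    so the candidate set itself cannot drift between runs.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def norm(a):
        return sum(x * x for x in a) ** 0.5 or 1.0

    scored = [
        (dot(query_vec, vec) / (norm(query_vec) * norm(vec)), doc_id)
        for doc_id, vec in docs
    ]
    # Sort by score desc, then doc_id asc: equal scores always rank identically,
    # regardless of insertion order or index internals.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in scored[:k]]
```

Exact search doesn't scale like ANN, but for small or mid-size regulated corpora the trade is often worth it: the top-k becomes a pure function of the snapshot and the query.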
🧩 2. Context construction is unstable
LLMs don’t treat all chunks equally.
They’re sensitive to:
- order of chunks
- length differences
- truncation behavior
- position in the prompt
- subtle formatting shifts
Same chunks ≠ same output.
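The fix is to make context assembly canonical: a fixed ordering key and whole-chunk truncation at a fixed budget, so the same chunk set always produces byte-identical context. A sketch (function and field names are illustrative):

```python
def build_context(chunks, max_chars=2000):
    """Assemble prompt context deterministically.

    chunks: list of (chunk_id, text). Ordering is canonical (by chunk_id),
    not by retrieval arrival order, and truncation drops whole chunks at a
    fixed budget instead of cutting mid-chunk.
    """
    ordered = sorted(chunks, key=lambda c: c[0])
    parts, used = [], 0
    for chunk_id, text in ordered:
        block = f"[{chunk_id}]\n{text}\n"
        if used + len(block) > max_chars:
            break  # never emit a partially truncated chunk
        parts.append(block)
        used += len(block)
    return "".join(parts)
```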
🧩 3. LLM planners amplify randomness
Most Agent frameworks do:
LLM → plan → execute → re-plan → execute → ...
This creates a butterfly effect:
tiny differences in intermediate results
→ a different plan
→ different execution
→ a completely different final output
Agents “improvise”, not “execute”.
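The difference is easiest to see side by side. A toy sketch (illustrative names, not any framework's actual loop): in the improvised loop the next step is re-derived from the last result, so noise compounds; in the compiled loop the full plan is fixed up front from the input alone.

```python
def improvised_run(llm_plan, execute, input_data, max_steps=10):
    """Typical Agent loop: each step depends on the *previous result*,
    so any nondeterminism in a result changes every later step."""
    result, steps = input_data, []
    for _ in range(max_steps):
        step = llm_plan(result)      # plan re-derived from noisy output
        if step is None:
            break
        result = execute(step, result)
        steps.append(step)
    return steps

def compiled_run(compile_plan, execute, input_data):
    """Deterministic alternative: the plan is a pure function of the
    input; intermediate results cannot alter the execution path."""
    plan = compile_plan(input_data)  # same input -> same plan
    result = input_data
    for step in plan:
        result = execute(step, result)
    return plan
```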
🧩 4. No explicit state machine
Most Agent frameworks store “state” inside the prompt.
This means:
- not reproducible
- not auditable
- cannot be replayed
- impossible to certify for enterprise use
For regulated environments, this is a showstopper.
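For contrast, here is what pulling state *out* of the prompt could look like: an explicit, serializable state machine whose every transition is logged, so a run can be persisted, audited, and replayed. This is a hedged sketch, not the POC's implementation:

```python
import json

class TaskStateMachine:
    """Explicit agent state, outside the prompt.

    Transitions are whitelisted and logged, so a run can be
    snapshotted to JSON and replayed from its event list.
    """
    TRANSITIONS = {
        "pending": {"start": "running"},
        "running": {"finish": "done", "fail": "failed"},
    }

    def __init__(self):
        self.state = "pending"
        self.log = []

    def apply(self, event):
        allowed = self.TRANSITIONS.get(self.state, {})
        if event not in allowed:
            raise ValueError(f"illegal event {event!r} in state {self.state!r}")
        self.log.append({"from": self.state, "event": event})
        self.state = allowed[event]

    def snapshot(self):
        return json.dumps({"state": self.state, "log": self.log})

    @classmethod
    def replay(cls, events):
        machine = cls()
        for event in events:
            machine.apply(event)
        return machine
```

Because illegal transitions raise instead of being silently absorbed into a prompt, divergence is detected at the step where it happens.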
✅ A Minimal Deterministic Planner POC
To illustrate a different approach,
I built a small deterministic planner on top of AWS Bedrock.
Repo:
👉 https://github.com/yuer-dsl/bedrock-deterministic-planner-poc
It’s intentionally tiny, but demonstrates the core idea:
✔ 1. Parse input → stable task nodes
No free-form reasoning to decide the steps.
The task graph is structural, not probabilistic.
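One way such structural parsing could look (a toy rule table, not the POC's actual grammar; all names are hypothetical):

```python
import hashlib

# Task nodes come from structural keyword matching, not free-form LLM
# reasoning, so the node set is a pure function of the input text.
RULES = [
    ("fetch", ["retrieve", "fetch", "lookup"]),
    ("summarize", ["summarize", "summary"]),
    ("notify", ["email", "notify"]),
]

def parse_tasks(request: str):
    text = request.lower()
    nodes = [name for name, keywords in RULES
             if any(k in text for k in keywords)]
    # Stable node id: hash of input + node name, identical on every run.
    return [{"id": hashlib.sha256(f"{request}:{n}".encode()).hexdigest()[:12],
             "task": n} for n in nodes]
```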
✔ 2. Compile → deterministic execution graph
Same input → same path
Every. Single. Time.
This alone eliminates a huge class of RAG/Agent instability.
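A sketch of how a compiled graph plus a content hash makes "same input → same path" mechanically checkable (canonicalization scheme is illustrative):

```python
import hashlib
import json

def compile_graph(nodes):
    """Compile task nodes into a fixed execution order plus a graph hash.

    Nodes are put in a canonical order, then serialized with sorted keys;
    the hash is a cheap reproducibility check: if two runs disagree on
    the hash, the *inputs* differed, not the planner.
    """
    ordered = sorted(nodes, key=lambda n: n["task"])
    edges = [(a["task"], b["task"]) for a, b in zip(ordered, ordered[1:])]
    canon = json.dumps({"nodes": [n["task"] for n in ordered],
                        "edges": edges}, sort_keys=True)
    return ordered, hashlib.sha256(canon.encode()).hexdigest()
```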
✔ 3. Output → auditable artifact
Instead of a raw LLM answer, the POC emits:
- node sequence
- decisions
- trace_id
- execution log
- intermediate artifacts
It acts more like a program, less like improvisation.
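A sketch of what such an artifact could contain (field names are illustrative, not the POC's exact schema):

```python
import hashlib
import json

def emit_artifact(input_text, node_sequence, decisions, results):
    """Bundle a run into an auditable artifact instead of a bare answer.

    trace_id is derived from the input and the executed path, so two
    identical runs share an id and any divergence is immediately visible.
    """
    path = [n["task"] for n in node_sequence]
    trace_id = hashlib.sha256(
        json.dumps([input_text, path], sort_keys=True).encode()
    ).hexdigest()[:16]
    return {
        "trace_id": trace_id,
        "input": input_text,
        "node_sequence": path,
        "decisions": decisions,
        "intermediate_artifacts": results,
    }
```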
🔥 Why Determinism Matters
As LLMs move deeper into:
- finance
- legal
- compliance
- operations
- automation
- enterprise tooling
three capabilities become essential:
1. Reproducibility
2. Auditability
3. Deterministic execution
Dynamic planning alone cannot achieve this.
Future Agent architectures must incorporate:
- stable execution graphs
- structural planning
- versioned data snapshots
- explicit state machines
- deterministic control layers
Think of it as:
Agents must evolve from improvisers into compilers.
💬 Final Thoughts
RAG and Agents are powerful — but unstable by design.
This POC is a small step toward exploring deterministic alternatives:
👉 https://github.com/yuer-dsl/bedrock-deterministic-planner-poc
If you’re building RAG pipelines, Agent systems, or enterprise AI infrastructure, I’d love to hear your thoughts. Let’s discuss!
yuer | Sciencx (2025-11-20T13:11:15+00:00) Why RAG and Agent Systems Are Unstable — A Minimal Deterministic Planner POC. Retrieved from https://www.scien.cx/2025/11/20/why-rag-and-agent-systems-are-unstable-a-minimal-deterministic-planner-poc/