# Why State Machines Are the Future of Voice AI
LLMs are great at language but terrible at following scripts. Here's why we built Prepatu around a deterministic state machine architecture.
## The Problem with LLM-Driven Flows
If you’ve ever tried to build a multi-step voice agent using just an LLM, you know the pain. The model will:
- Skip steps when it thinks it knows the answer
- Invent transitions that don’t exist in your flow
- Forget where it is mid-conversation
- Hallucinate confirmations the user never gave
This isn’t a model quality issue — it’s an architectural one. LLMs are probabilistic by nature. They’re optimized for plausible next tokens, not correct control flow.
## Separation of Concerns
The solution is surprisingly simple: don’t let the LLM control the flow.
At BusyTaal, we built Prepatu around a clean separation:
- YAML state machine — defines the flow declaratively
- LLM — operates inside each state for NLU, generation, and tool calls
- Engine — enforces transitions, blocks invalid moves, and manages state
The LLM becomes a guest in each state. It can understand language, extract entities, and call tools — but it cannot decide where to go next. That’s the engine’s job.
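In engine terms, that guest relationship fits in a few lines. This is a minimal sketch of the idea, not Prepatu's actual API — names like `FlowEngine` and `propose` are illustrative:

```python
# Minimal sketch of the engine/LLM split. The LLM may *propose* a
# transition, but the engine only accepts moves declared in the flow.

class FlowEngine:
    def __init__(self, flow: dict, start: str):
        self.flow = flow    # state -> set of allowed next states
        self.state = start

    def propose(self, target: str) -> bool:
        """Accept the move only if the flow declares it."""
        if target in self.flow.get(self.state, set()):
            self.state = target
            return True
        return False  # invalid move is blocked; state is unchanged

engine = FlowEngine(
    flow={
        "greet": {"collect_date"},
        "collect_date": {"collect_time"},
        "collect_time": {"confirm"},
        "confirm": set(),
    },
    start="greet",
)

assert engine.propose("collect_date")   # declared transition: accepted
assert not engine.propose("confirm")    # skipping ahead: blocked
assert engine.state == "collect_date"   # the bad move changed nothing
```

However eagerly the model "thinks it knows the answer," the only thing it can do with that confidence is propose — and a proposal that isn't in the flow goes nowhere.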
## Why YAML?
We chose YAML for flow definitions because it’s:
- Human-readable — product managers can review flows
- Version-controllable — diffs are meaningful
- Declarative — you describe what, not how
- Validated at startup — the engine rejects invalid flows before they run
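To make that concrete, here's what a booking flow might look like. The syntax is illustrative only — not Prepatu's actual schema:

```yaml
# Hypothetical booking flow -- illustrative, not Prepatu's real schema
states:
  greet:
    prompt: "Hi! Which day would you like to book?"
    transitions: [collect_date]
  collect_date:
    slots: [date]
    transitions: [collect_time]
  collect_time:
    slots: [time]
    transitions: [confirm]
  confirm:
    requires: [date, time]
    transitions: []
start: greet
```

A diff on a file like this tells a reviewer exactly which transitions changed — something you can't say for a prompt tweak.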
## What This Means in Practice
A booking agent defined in Prepatu will always collect the required information before confirming. A support agent will always verify the customer’s identity before accessing account data. The engine guarantees it.
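Under the hood, a guarantee like "collect before confirming" is just a precondition the engine checks before entering a state. A minimal sketch, with hypothetical slot names (not Prepatu's actual API):

```python
# Sketch of a state-entry guard: the engine refuses to enter `confirm`
# until every required slot is filled, no matter what the LLM proposes.

REQUIRED_SLOTS = {"confirm": {"date", "time"}}  # per-state requirements

def can_enter(state: str, slots: dict) -> bool:
    """A state is enterable only when all of its required slots are present."""
    missing = REQUIRED_SLOTS.get(state, set()) - set(slots)
    return not missing

slots = {"date": "2024-06-01"}          # time not collected yet
assert not can_enter("confirm", slots)  # engine blocks the confirmation

slots["time"] = "14:30"
assert can_enter("confirm", slots)      # now the move is allowed
```

The same pattern covers the identity check: make account-access states require a `verified` slot, and no conversational detour can route around it.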
This is the difference between a demo that works 80% of the time and a production system you can trust.
Interested in building voice agents with state machines? Check out Prepatu on GitHub or read the documentation.