DATAJOURNEYHQ ACADEMY

Your AI system is still a software system

Non-determinism changes the game. The fundamentals don't. What week 5 reminded us about building things that last.

At DataJourneyHQ Academy, we run a six-week bootcamp for building production AI systems. Every week surfaces something worth writing down.

There is a version of the AI hype cycle that treats LLMs as something entirely alien — a different category of creature that demands its own special rulebook.

It doesn’t. Not entirely.


Two worlds, one system

Traditional software is deterministic. Same input, same output. That’s what makes it testable, observable, and debuggable.

LLMs break that contract — by design. The non-determinism is the point.

But the system is made of both. The orchestration layer, the API calls, the retry logic, the database writes: all deterministic. The LLM is one node in that graph. Non-deterministic, yes. Still a node.

The mistake we keep seeing is engineers treating the whole system as if one set of rules applies everywhere. You need both lenses, simultaneously.
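One way to picture the two lenses is in how each node gets tested. The sketch below is illustrative only: `build_prompt`, `fake_llm`, and the label set are made-up names, and `fake_llm` is a random stand-in rather than a real model call. The deterministic node gets an exact-match assertion; the non-deterministic node gets an invariant checked over many samples.

```python
import random

ALLOWED = {"billing", "bug", "other"}  # hypothetical label set


def build_prompt(ticket: str) -> str:
    """Deterministic node: same input, same output."""
    return (
        "Classify this support ticket as 'billing', 'bug', or 'other':\n"
        + ticket.strip()
    )


def fake_llm(prompt: str, rng: random.Random) -> str:
    """Stand-in for a model call: the answer varies from run to run."""
    return rng.choice(sorted(ALLOWED))


def test_build_prompt_is_exact() -> None:
    # Deterministic lens: assert the exact output.
    expected = (
        "Classify this support ticket as 'billing', 'bug', or 'other':\n"
        "refund please"
    )
    assert build_prompt("  refund please ") == expected


def test_llm_output_stays_in_bounds() -> None:
    # Non-deterministic lens: assert an invariant across many samples,
    # not one specific answer.
    rng = random.Random()
    prompt = build_prompt("refund please")
    for _ in range(50):
        assert fake_llm(prompt, rng) in ALLOWED


test_build_prompt_is_exact()
test_llm_output_stays_in_bounds()
```

Both tests live in the same suite. Only the kind of assertion changes.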


Non-determinism is not an excuse

When a model returns something unexpected, “models are unpredictable, what can you do” is not an answer. That’s an engineering failure.

The model is doing what models do. Our job is to build the system around it so that non-determinism is bounded and recoverable:

  • Structured outputs — constrain what the model can return; don’t parse freeform text in production
  • Output validation — treat LLM responses like external API responses: verify before you trust
  • Retry with fallback — if the output is invalid, have a path that doesn’t crater the request

None of these are AI ideas. They are software engineering ideas applied to a non-deterministic component.
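Here is a minimal sketch of how the three practices can fit together, using only the standard library. Everything named here is an assumption made for illustration: the `Classification` schema, the label set, the `FALLBACK` value, and the `call_model` callable that stands in for whatever client you actually use. Asking the provider itself for structured output is API-specific and left out.

```python
import json
from dataclasses import dataclass
from typing import Callable

ALLOWED_LABELS = {"billing", "bug", "other"}  # hypothetical schema


@dataclass(frozen=True)
class Classification:
    label: str
    confidence: float


# Safe default returned when the model never produces valid output.
FALLBACK = Classification(label="other", confidence=0.0)


def parse_classification(raw: str) -> Classification:
    """Validate a raw model response; raise ValueError if it breaks the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    label = data.get("label")
    confidence = data.get("confidence")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence out of range")
    return Classification(label, float(confidence))


def classify(
    ticket: str,
    call_model: Callable[[str], str],
    max_attempts: int = 3,
) -> Classification:
    """Retry on invalid output, then fall back instead of failing the request."""
    for _ in range(max_attempts):
        try:
            return parse_classification(call_model(ticket))
        except ValueError:
            continue  # invalid output: try again
    return FALLBACK


# Usage: a stub model that answers in freeform text first, then in valid JSON.
responses = iter([
    "Sure! The label is billing.",
    '{"label": "billing", "confidence": 0.9}',
])
result = classify("refund please", lambda _ticket: next(responses))
print(result)  # Classification(label='billing', confidence=0.9)
```

The fallback is a product decision as much as a technical one. Here it is a low-confidence "other"; in another system it might be a cached answer or a handoff to a human.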


AIOps: you can’t fix what you can’t see

This is where the field is genuinely underdeveloped.

Traditional observability tells you what happened. In an AI system you need more: what did the prompt look like at runtime, why did the model take that path, where did latency actually live, which node quietly drifted.

The kinds of things that only surface when you instrument properly:

  • A retrieval step silently returning empty context. The model hallucinates with full confidence. No error thrown.
  • Token usage spiking 4× on one class of input, invisible until you pull per-run cost traces.
  • An agent loop averaging 800 ms but sitting at 14 seconds at p95. The happy path looked fine. (p95 is the latency that 95% of requests stay under; averages hide tail latency entirely.)
  • Prompt output drifting after a model update. No crash, just quietly wrong answers for three days.
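To see how an average can hide a tail like that, here is a small worked example. The latencies are invented to roughly mirror the numbers in the list above, and `percentile` is a hand-rolled nearest-rank helper written for this sketch, not something from a tracing library.

```python
import math
import statistics


def percentile(values: list[int], pct: int) -> int:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Invented data: 949 fast runs at 90 ms and 51 slow runs at 14 s.
latencies_ms = [90] * 949 + [14_000] * 51

print(f"mean: {statistics.mean(latencies_ms):.0f} ms")  # mean: 799 ms
print(f"p50: {percentile(latencies_ms, 50)} ms")        # p50: 90 ms
print(f"p95: {percentile(latencies_ms, 95)} ms")        # p95: 14000 ms
```

The mean lands near 800 ms, which describes no request that actually happened: most runs took 90 ms, and about one in twenty took 14 seconds.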

This week our engineers instrumented their pipelines and traced their agent runs step by step. The reaction was consistent: “I thought I understood what my system was doing. I did not!”
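A step-by-step trace does not need much machinery to be useful. The sketch below is not what the cohort built; it is a minimal standard-library illustration, and `Trace`, `Span`, `retrieve`, and `generate` are all hypothetical names. The retriever is hard-coded to return nothing, to show how the trace surfaces an empty context that would otherwise pass silently. Prompt length in characters stands in for token counts, which in practice would come from your provider's response.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    step: str
    duration_ms: float = 0.0
    attrs: dict = field(default_factory=dict)


@dataclass
class Trace:
    spans: list = field(default_factory=list)

    @contextmanager
    def step(self, name: str):
        """Time one step and keep whatever attributes the caller attaches."""
        span = Span(step=name)
        start = time.perf_counter()
        try:
            yield span
        finally:
            span.duration_ms = (time.perf_counter() - start) * 1000
            self.spans.append(span)


def retrieve(query: str) -> list[str]:
    return []  # hypothetical retriever that silently finds nothing


def generate(prompt: str) -> str:
    return "A confident answer."  # stand-in for a model call


def answer(query: str, trace: Trace) -> str:
    with trace.step("retrieve") as span:
        docs = retrieve(query)
        span.attrs["num_docs"] = len(docs)
        span.attrs["empty_context"] = not docs
    with trace.step("generate") as span:
        prompt = f"Context: {' '.join(docs)}\nQuestion: {query}"
        span.attrs["prompt"] = prompt  # the prompt as it looked at runtime
        span.attrs["prompt_chars"] = len(prompt)  # rough proxy for token usage
        output = generate(prompt)
    return output


trace = Trace()
answer("What is our refund policy?", trace)
for span in trace.spans:
    print(span.step, f"{span.duration_ms:.2f} ms", span.attrs)
```

No exception is raised anywhere in this run. The only sign of trouble is `empty_context: True` on the retrieve span, which is exactly the kind of signal you can alert on once it is recorded.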

AIOps is not optional. It’s the line between a demo and a product.


The real lesson

The fundamentals don’t go away because you added an LLM. They become more important.

Separate concerns. Fail gracefully. Instrument everything. Make it reproducible. These matter more, not less, when one component can’t be fully controlled.


The engineers who build the best AI systems aren’t the ones who understand AI the best. They’re the ones who understand systems — and know how to make room for the parts they can’t fully predict.

Capstone demos are this week. We’re excited 🎉