May 2026

How Agentic Systems Stay Aligned

Misalignment and obsolescence are the default outcome, not the exception. Decay and relevance, designed correctly for each use case, are the only mechanism that prevents them.

Agentic systems in production do not stay aligned to the businesses they serve by default. Left alone, they drift. The data layer accumulates everything that was ever true without distinguishing it from what is true now. The frameworks that encode expert judgment stay fixed while the business keeps moving. The intelligence the system reasons against becomes progressively less accurate, not through any single failure but through the slow, invisible accumulation of distance between what the system knows and what is currently real.

I would argue this is not a risk to be managed. It is the default trajectory. A system that is well-calibrated at month three will be noticeably degraded at month nine and functionally misaligned at month eighteen, not because anything broke but because nothing was built to keep pace with a business that never stopped changing. The question is not whether an agentic system will fall out of alignment. It is how quickly, and whether the architecture was designed to prevent it.

The real solution is designing misalignment out of the architecture itself. Not through periodic maintenance or scheduled cleanup, but through mechanisms that are continuously, automatically, and accurately tracking what is currently relevant and decaying what is not. Decay and relevance are two sides of the same mechanism, and together they are what keeps a system calibrated. Getting them right requires designing them separately for every use case, because what "currently relevant" means is completely different depending on what kind of intelligence you are talking about.

Misalignment has two sources.

The first source is data. The system is reasoning against intelligence that is no longer accurate. Signals that were informative when they were captured have since been superseded. Hypotheses that were plausible when they were formed have gone unreinforced. Records that reflected a relationship in one state have drifted while the relationship moved to another. This is the decay problem. The data layer ages out of alignment with reality and the system reasons against a picture of the world that is increasingly historical rather than current.

The second source is frameworks. The system is reasoning correctly against current data, but through an interpretive architecture that no longer fits the business. The frameworks were built when the business operated in one context. The business has since shifted, pursuing new markets, encountering new patterns, learning things about its own operations that the original frameworks could not have anticipated. The frameworks have not been updated. The system's understanding of what matters, what a healthy trajectory looks like, what the important signals are, all of it is calibrated to a version of the business that has been superseded. Clean data reasoned through an outdated framework still produces wrong outputs.

Both problems are permanent features of any production system deployed against a living business. Neither solves itself. And solving one without the other achieves less than it appears. A current data layer reasoned through outdated frameworks produces confident outputs from a miscalibrated interpretive lens. Updated frameworks applied to a data layer thick with stale intelligence apply the right reasoning to the wrong picture. I think of alignment as requiring both, simultaneously, as a continuous process rather than a periodic correction.

Different types of signals decay at different rates.

A production agentic system does not have a single data decay problem. It has several, running simultaneously, each with different properties and different stakes. The distinctions matter for how you design a decay architecture, because different kinds of intelligence age at fundamentally different rates and for fundamentally different reasons.

Some signals are ephemeral by nature. A moment of contact, a conversation, a logged interaction. These tend to lose their interpretive value relatively quickly unless they have been absorbed into a broader pattern. Others are structural, facts about state and change that remain load-bearing long after they were captured, because understanding how something moved and when it moved continues to inform the system's reasoning. And then there are the interpretations the system builds from signals, hypotheses and patterns that are not data at all but claims about what the data means. These age differently again. A hypothesis that keeps being reinforced by new evidence should compound in confidence. One that sits unreinforced should weaken, not because it was wrong but because the system's basis for asserting it has eroded.

The point is not that every system will have exactly these categories, or that the boundaries between them are always clean. It is that different kinds of intelligence serve different functions in the system's reasoning, and a decay architecture that treats them uniformly will inevitably over-retain some and under-retain others. The design work is in understanding what kinds of intelligence your specific system holds and how each one ages in the context it operates in.
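To make that concrete, here is a minimal sketch of what type-specific decay might look like. Everything in it is an illustrative assumption rather than a prescription: the signal categories, the half-lives, and the idea that interpretive signals decay from their last reinforcement rather than from capture are all choices a real system would make differently depending on its context.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class SignalKind(Enum):
    EPHEMERAL = "ephemeral"        # contacts, conversations, logged interactions
    STRUCTURAL = "structural"      # facts about state and change
    INTERPRETIVE = "interpretive"  # hypotheses and patterns built from signals


# Illustrative half-lives only; the real values are a per-use-case design decision.
HALF_LIFE = {
    SignalKind.EPHEMERAL: timedelta(days=14),
    SignalKind.STRUCTURAL: timedelta(days=365),
    SignalKind.INTERPRETIVE: timedelta(days=90),
}


@dataclass
class Signal:
    kind: SignalKind
    captured_at: datetime
    reinforced_at: datetime | None = None  # last time new evidence touched it


def freshness(signal: Signal, now: datetime) -> float:
    """Exponential decay from the most recent moment the signal did work.

    Interpretive signals decay from their last reinforcement rather than
    from capture: a hypothesis that keeps being confirmed stays fresh,
    one that sits unreinforced fades.
    """
    anchor = signal.reinforced_at or signal.captured_at
    age_in_half_lives = (now - anchor) / HALF_LIFE[signal.kind]
    return 0.5 ** age_in_half_lives
```

The specific numbers matter far less than the fact that they differ by kind; a single shared half-life is exactly the uniform treatment that over-retains some intelligence and under-retains the rest.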

How you design decay defines what the system currently knows.

This is where decay becomes an alignment tool rather than just a cleanup policy, and I think it is the most important idea in the article.

When a signal is referenced in a live hypothesis, when it is actively part of a claim the system is still making, that signal's freshness extends. It does not matter how old the signal is in calendar time. It is currently doing work. The hypothesis it supports is still forming, still testable, still being assessed against incoming evidence. The signal that underpins it should stay active for exactly as long as the hypothesis that needs it stays active.

When a pattern match is running, when the system has identified a set of signals that match a known pattern and is waiting for the outcome, the evidence those signals constitute should stay active as well. The pattern is still live. The signals are still load-bearing. They decay when the pattern concludes, because at that point they have been absorbed into a confirmed or disconfirmed pattern rather than floating as raw active intelligence.
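A minimal sketch of that extension logic, with invented statuses and field names: calendar decay is suspended for any signal that a live hypothesis or a running pattern match still references, and it resumes the moment the last reference concludes.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    status: str                   # "live", "confirmed", "disconfirmed"
    supporting_signals: set[str]  # ids of signals the claim still rests on


@dataclass
class PatternMatch:
    status: str                   # "running" until the outcome is known
    evidence_signals: set[str]


def is_doing_work(signal_id: str,
                  hypotheses: list[Hypothesis],
                  patterns: list[PatternMatch]) -> bool:
    """A signal stays active while anything live still draws on it."""
    referenced_by_live_hypothesis = any(
        signal_id in h.supporting_signals
        for h in hypotheses if h.status == "live"
    )
    referenced_by_running_pattern = any(
        signal_id in p.evidence_signals
        for p in patterns if p.status == "running"
    )
    return referenced_by_live_hypothesis or referenced_by_running_pattern


def effective_freshness(base_freshness: float, signal_id: str,
                        hypotheses: list[Hypothesis],
                        patterns: list[PatternMatch]) -> float:
    # Calendar decay is suspended while the signal is load-bearing;
    # it resumes the moment the last hypothesis or pattern releases it.
    if is_doing_work(signal_id, hypotheses, patterns):
        return 1.0
    return base_freshness
```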

The result of this logic is something precise: what survives decay at any point in time is exactly the intelligence the system is currently using. Not the intelligence that was useful six months ago. Not the intelligence that might be useful again someday. The intelligence that is actively informing current judgment. The system's active memory layer is not a rolling window of recent data. It is a dynamic map of what is currently relevant, shaped by the system's own reasoning rather than by a calendar.

This is what makes it an alignment mechanism. The shape of what survives reflects what the system is currently reasoning about, which reflects what the business currently needs the system to understand. When the business changes, when priorities shift, when the operating context moves, what is relevant changes with it. Old signals that were informing old hypotheses decay out of active memory because no new hypotheses are drawing on them. New signals accumulate around the questions the business is now asking. The system's intelligence layer tilts naturally toward the current. Not because someone ran a cleanup job, but because the architecture was built to let relevance define what persists.

Data decay handles the past. Framework evolution handles the future.

Decay keeps the data layer current. But there is a second alignment problem that decay alone cannot solve: the frameworks themselves can become misaligned.

A framework built on last year's understanding of the business is still a valid framework. It encodes real expertise from real experience. But if the business has been changing in ways that the framework does not describe, and the system has been accumulating evidence of this change across multiple instances, the framework is increasingly out of step with the business it serves. Every judgment the system makes is being run against a baseline that no longer reflects reality.

I believe the solution is not to rebuild frameworks from scratch on a schedule. That is too slow, too expensive, and disconnected from the actual evidence. The solution is a loop that reads what the system has learned from live data and proposes specific, evidence-backed updates to the framework when the evidence warrants it.

In practice, this means a dedicated process, operating on a regular cadence or triggered when pattern volume reaches a threshold, that reads confirmed patterns and asks: does this framework still accurately describe what we are seeing? Are there consistent patterns across multiple instances that the framework does not account for? When the process finds evidence of a gap, it produces a specific, evidence-cited proposal: here is the pattern observed, here are the instances, here is the proposed change, here is the confidence level. The proposal enters a human review queue. The human evaluates the evidence, approves or rejects the proposal, and the framework updates only with explicit authorisation.

The system never updates its own foundational frameworks autonomously. The evidence gathering is automated. The pattern recognition is automated. The proposal generation is automated. The decision is human. This boundary is not a limitation of the system's capability. It is the mechanism by which the system's evolution remains deliberate.
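To make the shape of that loop concrete, here is a sketch of its automated half. The names, thresholds, and statuses are hypothetical; what matters is the structure: pattern evidence is screened, proposals carry their citations with them, and nothing changes a framework until a reviewer changes the status.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable


@dataclass
class ConfirmedPattern:
    description: str
    instance_ids: list[str]
    confidence: float            # 0..1, assigned when the pattern was confirmed


@dataclass
class FrameworkProposal:
    framework_id: str
    observed_pattern: str
    evidence: list[str]          # the instances the proposal cites
    proposed_change: str
    confidence: float
    status: str = "pending_review"   # never applied without a human decision
    created_at: datetime = field(default_factory=datetime.now)


# Illustrative thresholds; the real values are a design decision per framework.
MIN_INSTANCES = 5
MIN_CONFIDENCE = 0.8


def propose_framework_updates(
    framework_id: str,
    patterns: list[ConfirmedPattern],
    covered_by_framework: Callable[[ConfirmedPattern], bool],
) -> list[FrameworkProposal]:
    """Automated half of the loop: find confirmed patterns the framework does
    not account for and turn them into evidence-cited proposals. The decision
    itself stays with a human reviewer."""
    proposals = []
    for pattern in patterns:
        if covered_by_framework(pattern):
            continue  # the framework already describes this
        if len(pattern.instance_ids) < MIN_INSTANCES:
            continue  # not enough repetition to claim a shift
        if pattern.confidence < MIN_CONFIDENCE:
            continue
        proposals.append(FrameworkProposal(
            framework_id=framework_id,
            observed_pattern=pattern.description,
            evidence=list(pattern.instance_ids),
            proposed_change=f"Extend framework to account for: {pattern.description}",
            confidence=pattern.confidence,
        ))
    return proposals  # routed to a human review queue, never auto-applied
```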

The alignment mechanisms are not all the same kind.

I think the instinct when building these loops is to ask: why does a human need to approve a framework update that is backed by strong evidence and a high confidence score? If the evidence is that strong, should the system not just apply it?

Part of the answer is that the system cannot always evaluate whether the evidence is representative. Multiple instances showing a new pattern might reflect a genuine shift. Or they might reflect a temporary period in which the business pursued an adjacent direction and found limited success, a strategic experiment that has already been decided against. The system does not know about that decision. It sees instances and a pattern. A human who reviews the framework update knows the strategic context. They can evaluate whether the pattern the system observed is the one the business wants to optimise for going forward, or whether it is an artifact of a temporary direction that has already been reversed.

But the human approval gate is not the only alignment mechanism, and I think it is important not to frame it as though it were. The decay architecture itself is an alignment mechanism. When it is well-designed, the data layer self-corrects toward what is currently relevant without any human intervention at all. The signals that are still doing work persist. The ones that have been superseded fade. That continuous, automatic recalibration is alignment too, just a different kind, alignment to the current state of the world as the system encounters it rather than alignment to strategic direction.

These are complementary. The decay cycle is where the system stays aligned to what is currently true. The framework evolution cycle, with the human in the loop, is where the system stays aligned to where the business is going. Both are necessary. Neither works without the other. A system with sophisticated decay logic but static frameworks will have a clean data layer and an increasingly outdated interpretive architecture. A system with evolving frameworks but no decay logic will be reasoning with current frameworks against a data layer thick with stale intelligence. The alignment problem requires both, and the human sits at the point where new intelligence becomes incorporated into the system's foundational judgment, not as the sole alignment mechanism but as the one that supplies the context the data alone cannot provide.

What misalignment actually looks like in production.

I want to be specific about the failure mode, because misaligned systems rarely announce themselves. The outputs do not stop. The confidence scores do not disappear. The system keeps producing well-structured, plausible-looking intelligence. That is precisely what makes the misalignment hard to detect until it has done real damage.

A scoring system reasoning against months-old engagement signals will produce assessments based on contact patterns that no longer reflect the actual relationship. An identification agent running a framework built for one context will misread every situation from a different context it encounters, not with obvious errors but with subtly wrong conclusions that feel right because they are internally coherent. A risk agent applying a baseline that has not been updated will flag the wrong things and miss genuine risks in areas that match patterns the business has started pursuing but the system has not been taught to recognise.

None of these failures are visible from the output alone. The outputs look like outputs. The errors are in what they mean, and detecting them requires comparing the system's intelligence against reality. That comparison requires a human who is close enough to the business to know what reality currently looks like.

This is the reason the architecture I have been describing is not optional for systems deployed against live businesses. It is not an enhancement to a baseline that works without it. The baseline, a system with no decay architecture, no relevance extension, no framework evolution loop, does not work. It works at the beginning, when the data is fresh and the frameworks are current, and then it degrades on a schedule determined entirely by how fast the business changes. For most businesses in most markets, that schedule is faster than anyone expects.

The compounding effect.

When all of this is functioning together, signals decaying at rates appropriate to their type, active references extending what is currently useful, hypotheses accumulating confidence when reinforced and losing it when not, framework proposals generated from pattern evidence and approved by humans who have the strategic context, the system does something that neither component achieves alone. It compounds.

The intelligence layer does not just stay current. It gets more precise over time. The frameworks get sharper as more instances are confirmed and incorporated. The hypothesis confidence thresholds get more accurate as the system builds a record of which confidence levels predicted confirmed patterns and which did not. The system develops a progressively more accurate model of this specific business in this specific context, one that could not have been built from scratch at day one because the business itself was not yet what it has become.
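One concrete form that sharpening can take, sketched here with invented structures: the system keeps a simple record of how often hypotheses at each confidence level went on to be confirmed, and that record, rather than intuition, is what the thresholds get tuned against.

```python
from collections import defaultdict


class ConfidenceCalibration:
    """Keeps a record of how often hypotheses at each confidence level were
    eventually confirmed. The band width and structure are illustrative."""

    def __init__(self):
        self.outcomes = defaultdict(lambda: {"confirmed": 0, "disconfirmed": 0})

    def record(self, confidence: float, confirmed: bool) -> None:
        band = round(confidence, 1)  # bucket into 0.1-wide confidence bands
        key = "confirmed" if confirmed else "disconfirmed"
        self.outcomes[band][key] += 1

    def hit_rate(self, confidence: float) -> float | None:
        """How often this confidence band has actually preceded a confirmed
        pattern; None until there is a track record to read."""
        counts = self.outcomes[round(confidence, 1)]
        total = counts["confirmed"] + counts["disconfirmed"]
        return counts["confirmed"] / total if total else None
```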

I believe this compounding only runs in one direction if the human approval gate on framework updates is maintained. Remove it and the compounding becomes amplification. The system gets faster at reinforcing whatever patterns it has observed, including patterns that reflect strategic experiments the business has already decided to abandon, conditions that have reversed, or directions the business pursued opportunistically and does not want to optimise for. The system without a human gate does not compound toward truth. It compounds toward confidence. Those are not the same thing.

Not rotting is the baseline. Compounding toward an increasingly accurate model of the business is the goal. The distance between them is the architecture I have described in this article, and the human who sits at the centre of it, not as a bottleneck but as the mechanism by which the system's growth remains pointed in the right direction.


This is the final article in the series. The full collection is here.
