Data Decay as Architecture
Why agentic systems need domain-specific decay strategies built into the data model, not bolted on as retention policies after the fact.
I want to explore a problem that I think is underappreciated in the design of agentic systems. It is not the problem of too little memory. It is the problem of memory that no longer reflects reality, and of the wrong parts of an operation being captured as memory in the first place.
A competitive signal captured eight months ago that no longer reflects the market. A preference recorded during an evaluation that ended two quarters back. A learned pattern about behaviour that was accurate for a specific cycle and has not applied since. This kind of data does not announce itself as stale. It sits quietly in the knowledge store, looking exactly like current intelligence. And when an agent pulls it into a context window, it reasons against it with the same confidence it applies to something captured yesterday. There is no mechanism in the model to distinguish a fresh signal from one that has gone rotten, unless you build that mechanism yourself.
That is what I mean by data decay as architecture. Not a retention policy applied after the fact. Not a cleanup script that runs quarterly. A structural property of the data model that tracks freshness, defines expiration conditions, and gives the agent the information it needs to know how much to trust what it knows.
Blanket retention rules are the wrong tool for this problem.
The instinct, when people recognise the staleness problem, is to apply uniform retention rules. Everything expires after six months, or after a year, or after some arbitrary calendar interval. These are better than nothing, but I think they misunderstand the nature of the problem.
A blanket rule treats all data as if it has the same shelf life. It does not. A macro-economic trend might remain relevant for two years. A competitive positioning signal might be stale in six weeks. A set of stated priorities might hold for the duration of a project or shift after a single leadership change. The shelf life of a piece of intelligence is a property of the domain it belongs to, not a property of time itself.
When you apply a uniform decay policy, data that should still be active gets archived, and data that should have been pruned months ago survives. You get false negatives and false positives at the same time, which is worse than either one in isolation because it erodes trust in the system's memory without making that erosion visible.
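To make that failure concrete, here is a minimal sketch. The record shapes and shelf lives are invented for illustration, but the mechanics are general: one blanket 180-day rule, applied to two kinds of intelligence with different true shelf lives, misfires in both directions at once.

```python
from datetime import date, timedelta

UNIFORM_TTL = timedelta(days=180)  # one blanket rule for everything

# Illustrative records: (description, captured_on, true shelf life of the domain)
records = [
    ("macro-economic trend", date.today() - timedelta(days=220), timedelta(days=730)),
    ("competitive positioning signal", date.today() - timedelta(days=90), timedelta(days=42)),
]

for name, captured_on, shelf_life in records:
    age = date.today() - captured_on
    policy_says_stale = age > UNIFORM_TTL  # what the blanket rule decides
    actually_stale = age > shelf_life      # what the domain says
    if policy_says_stale and not actually_stale:
        print(f"false negative: {name} archived while still relevant")
    elif actually_stale and not policy_says_stale:
        print(f"false positive: {name} still treated as current, weeks after going stale")
```

Both messages print. The same rule is simultaneously too aggressive and too lenient, which is exactly the point.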
Decay is a function of the domain, not the calendar.
I think the clearest way to see this is to look at how different categories of intelligence age differently.
Event-driven decay applies to intelligence whose relevance is tied to a state of affairs rather than a date. Priorities, relationships, and strategic direction do not expire on a schedule. They expire when something changes: a new leader, a reorganisation, a pivot in strategy. The decay trigger is not time passing. It is a change event that invalidates the conditions under which the intelligence was captured.
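In code, this looks less like a timer and more like a subscription. Here is a minimal sketch, with invented event names and a deliberately simple record shape: each record declares at capture time which change events invalidate it, and staleness is flagged when one of those events arrives.

```python
from dataclasses import dataclass

@dataclass
class Intel:
    claim: str
    invalidated_by: set[str]  # change events that invalidate this record
    stale: bool = False

store = [
    Intel("CFO prioritises cost reduction", {"leadership_change", "strategy_pivot"}),
    Intel("Procurement reports into operations", {"reorganisation"}),
]

def on_change_event(event: str) -> None:
    # The decay trigger is the event itself, not elapsed time.
    for record in store:
        if event in record.invalidated_by:
            record.stale = True

on_change_event("leadership_change")
assert store[0].stale and not store[1].stale
```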
Velocity-based decay applies to intelligence in domains that move at different speeds. In a fast-moving competitive landscape, a positioning signal might be relevant for weeks. In a stable regulatory environment, the same category of signal might hold for a year or more. The decay rate is a property of how quickly the domain itself changes, which means it needs to be calibrated to the domain rather than set universally.
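One way to express this, and it is only one way, is an exponential half-life calibrated per domain. The half-lives below are invented for illustration; the point is that the steepness of the curve is a domain property, not a global constant.

```python
# Illustrative half-lives, calibrated to how fast each domain moves.
DOMAIN_HALF_LIFE_DAYS = {
    "competitive_landscape": 21,    # fast-moving: stale within weeks
    "regulatory_environment": 365,  # stable: holds for a year or more
}

def freshness(domain: str, age_days: float) -> float:
    """Return a 0-to-1 freshness weight that halves every half-life."""
    return 0.5 ** (age_days / DOMAIN_HALF_LIFE_DAYS[domain])

# The same 60-day-old signal is nearly dead in one domain, barely aged in the other.
print(round(freshness("competitive_landscape", 60), 2))   # ~0.14
print(round(freshness("regulatory_environment", 60), 2))  # ~0.89
```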
Reinforcement-based decay applies to learned patterns. A pattern that keeps getting reinforced by new observations stays active and maintains its confidence. A pattern that stops being reinforced loses confidence gradually and eventually archives. This is the natural complement to the learning cycles described in the companion article. What the learning cycle builds up, reinforcement-based decay allows to fade when the evidence stops arriving.
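A sketch of that mechanism, with the fade rate, reinforcement boost, and archive threshold all invented for illustration: confidence fades geometrically while no new evidence arrives, and a fresh observation restores it and resets the clock.

```python
from datetime import date, timedelta

DAILY_FADE = 0.99     # confidence retained per day without reinforcement
ARCHIVE_BELOW = 0.2   # below this, the pattern leaves active reasoning

class Pattern:
    def __init__(self, description: str, confidence: float):
        self.description = description
        self.confidence = confidence
        self.last_reinforced = date.today()

    def reinforce(self, boost: float = 0.1) -> None:
        # A new supporting observation restores confidence and resets the clock.
        self.confidence = min(1.0, self.confidence + boost)
        self.last_reinforced = date.today()

    def current_confidence(self, today: date) -> float:
        idle_days = (today - self.last_reinforced).days
        return self.confidence * (DAILY_FADE ** idle_days)

    def should_archive(self, today: date) -> bool:
        return self.current_confidence(today) < ARCHIVE_BELOW

p = Pattern("renewals spike at quarter end", confidence=0.8)
print(p.should_archive(date.today() + timedelta(days=30)))   # False: ~0.59
print(p.should_archive(date.today() + timedelta(days=200)))  # True: ~0.11
```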
Threshold-based decay applies to intelligence that is binary in nature. Regulatory requirements, compliance standards, and certain contractual conditions are either current or they are not. The trigger for decay is a discrete external event (a law changing, a ruling being issued, a contract expiring), and there is no gradual fade. The intelligence is valid until the threshold event, and invalid after it.
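Because there is no fade, the implementation is almost trivially simple, which is itself the point. A sketch with an invented record shape: validity is a boolean flipped by the threshold event, never a weight.

```python
from dataclasses import dataclass

@dataclass
class ThresholdIntel:
    claim: str
    superseded: bool = False  # flipped by a discrete external event

    def is_valid(self) -> bool:
        # No gradual fade: fully valid until the threshold event, invalid after.
        return not self.superseded

rule = ThresholdIntel("data residency must be in-region under the current ruling")
assert rule.is_valid()
rule.superseded = True  # e.g. a new ruling is issued
assert not rule.is_valid()
```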
Four categories of intelligence, four fundamentally different decay strategies. I think this is why blanket rules fail. They assume decay is one thing, when in practice it is at least four different things depending on what kind of intelligence you are managing. I should note that these categories are high-level. On the ground, the boundaries between them are not always clean, and a given system will likely encounter decay patterns that do not map neatly to any single category. The point is not to treat these as rigid rules but as a way of recognising that decay is structurally varied and needs to be designed with that variety in mind.
Design decay into the data model at the point of capture.
I would argue that the right place to define decay behaviour is at the moment the intelligence is captured, not in a separate cleanup process that runs later. Every piece of intelligence the system captures should carry a decay profile as part of its metadata.
This means the data model includes fields like decay type (event-driven, velocity-based, reinforcement-based, or threshold-based), decay triggers (what conditions would make this stale), confidence at capture, and last reinforced. When agents query the knowledge store, they do not just get the intelligence. They get the intelligence and its freshness profile. A reasoning framework can then teach the agent how to weight data based on its decay status, treating high-freshness intelligence differently from intelligence that is approaching its expiration conditions.
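Here is a sketch of what that metadata might look like. The field names, the enum, and the weighting rule at query time are my own illustrative choices, not a prescribed schema; what matters is that the decay profile travels with the record and reaches the agent alongside it.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class DecayType(Enum):
    EVENT_DRIVEN = "event_driven"
    VELOCITY_BASED = "velocity_based"
    REINFORCEMENT_BASED = "reinforcement_based"
    THRESHOLD_BASED = "threshold_based"

@dataclass
class DecayProfile:
    decay_type: DecayType
    triggers: list[str]            # conditions that would make this stale
    confidence_at_capture: float
    last_reinforced: date

@dataclass
class IntelRecord:
    claim: str
    captured_on: date
    profile: DecayProfile          # decay behaviour travels with the record

def query(store: list[IntelRecord]) -> list[tuple[IntelRecord, float]]:
    """Return each record with a freshness weight, so the reasoning layer
    can discount aging intelligence instead of trusting everything equally."""
    today = date.today()
    weighted = []
    for rec in store:
        idle_days = (today - rec.profile.last_reinforced).days
        weight = rec.profile.confidence_at_capture * (0.99 ** idle_days)  # illustrative
        weighted.append((rec, weight))
    return weighted
```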
The point of doing this at capture rather than after the fact is that the person or process capturing the intelligence is in the best position to know what kind of intelligence it is and what would make it stale. A cleanup script running months later does not have that context. It can only apply calendar-based rules, which, as I have argued, are the wrong tool.
What happens to intelligence that has decayed.
I think it is worth being explicit about what happens to intelligence that has decayed past its usefulness, because the answer is not always the same.
In many cases, stale intelligence still has value as historical context, as a baseline for detecting change, and as training data for understanding how a domain evolves over time. Archiving rather than deleting makes sense in those situations. But this is not a universal rule. Some systems benefit from aggressive archiving. Others need a lighter touch, or a different approach entirely, depending on the kind of intelligence and the cost of retaining it. The important principle is that decayed intelligence should not remain in the active reasoning layer as though it were current. How it is handled beyond that, whether archived, summarised, or removed, depends on the system and the use case.
When the architecture does include an archive, the critical point is that archived data should not re-enter active reasoning without explicit re-validation. The boundary between what is current and what is historical needs to be maintained, however the system chooses to implement it.
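A sketch of that boundary, with invented names: the active reasoning layer only ever sees records whose status is active, and the only path from archived back to active is an explicit re-validation step.

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    ACTIVE = "active"
    ARCHIVED = "archived"

@dataclass
class Record:
    claim: str
    status: Status = Status.ACTIVE

def active_context(store: list[Record]) -> list[Record]:
    # Only active records are eligible for the agent's context window.
    return [r for r in store if r.status is Status.ACTIVE]

def revalidate(record: Record, confirmed_current: bool) -> None:
    # Archived intelligence re-enters active reasoning only through an
    # explicit re-validation step, never by default.
    if record.status is Status.ARCHIVED and confirmed_current:
        record.status = Status.ACTIVE
```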
The silent failure mode of systems without decay architecture.
I want to close with the failure mode that I think makes this problem so insidious. It is not a dramatic failure. It is a quiet one.
I have seen systems where the database was technically working, queries returned results, and agents produced output. But the output quality had been declining for months. Not because the agents were worse. Not because the frameworks had degraded. Because the intelligence they were reasoning against was increasingly stale, and nothing in the system was flagging that staleness.
The symptoms are subtle. Recommendations start feeling slightly off. Analysis is technically defensible but does not match the current reality. Confidence scores are high, because the data the agent has does support its conclusions, but the data itself no longer reflects what is happening. The agents do not break. They do not hallucinate in the dramatic sense. They produce increasingly irrelevant output wrapped in high confidence, because nothing in the architecture told the system that its knowledge was aging.
This is what I mean by decay as architecture rather than policy. When decay is built into the data model, the system knows what it knows is current, what it suspects might be stale, and what it used to know but no longer trusts. That awareness is the difference between a system that is still trustworthy in month twelve and one that quietly stopped being trustworthy months ago without anyone noticing.