Building a Persistent, Personalized Agentic AI with Memory Decay and Self-Evaluation
A step-by-step guide showing how persistent memory, decay, and simple retrieval turn a chatbot into a personalized agent, with a full Python demo and evaluation.
Why persistent memory matters
A simple chatbot can answer a prompt, but a companion that remembers your preferences, past topics, and projects becomes far more useful. Persistent memory enables an agent to recall context, adapt its tone and recommendations, and improve over time. This tutorial walks through a compact, rule-based implementation that demonstrates the key ideas: storing memories, applying decay to forget stale items, retrieving relevant context, and measuring personalization gains.
Memory model and decay
The core data structures are MemoryItem and MemoryStore. MemoryItem holds a kind, content, a base score and a timestamp. MemoryStore keeps items and applies an exponential decay factor so memories fade over time unless reinforced.
import time
from typing import List

class MemoryItem:
    def __init__(self, kind: str, content: str, score: float = 1.0):
        self.kind = kind        # e.g. "preference", "topic", "project", "dialog"
        self.content = content  # raw text of the memory
        self.score = score      # base strength, before decay
        self.t = time.time()    # creation timestamp

class MemoryStore:
    def __init__(self, decay_half_life=1800):
        self.items: List[MemoryItem] = []
        self.decay_half_life = decay_half_life  # seconds until strength halves

    def _decay_factor(self, item: MemoryItem):
        # Exponential decay: an item's effective strength halves every half-life.
        dt = time.time() - item.t
        return 0.5 ** (dt / self.decay_half_life)

This establishes the foundation for long-term memory: items with timestamps and an exponential decay function that reduces an item's effective strength over time.
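To see the decay in action, you can backdate an item's timestamp. A quick sanity check (a sketch, assuming the classes above; the printed values are approximate because time.time() keeps moving):

store = MemoryStore(decay_half_life=1800)   # 30-minute half-life
item = MemoryItem("topic", "agentic RAG")
item.t -= 1800                              # pretend the memory is one half-life old
print(round(store._decay_factor(item), 3))  # ~0.5
item.t -= 1800                              # now two half-lives old
print(round(store._decay_factor(item), 3))  # ~0.25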
Adding, searching and cleaning memories
A memory store needs simple methods to add entries, find the most relevant items for a query, and remove weak memories. The demo uses a lightweight similarity function (shared token count) combined with decay-modified scores.
    # Methods of MemoryStore (continuation of the class above).

    def add(self, kind: str, content: str, score: float = 1.0):
        self.items.append(MemoryItem(kind, content, score))

    def search(self, query: str, topk=3):
        # Rank items by decayed base score plus a crude token-overlap similarity.
        scored = []
        for it in self.items:
            decay = self._decay_factor(it)
            sim = len(set(query.lower().split()) & set(it.content.lower().split()))
            final = (it.score * decay) + sim
            scored.append((final, it))
        scored.sort(key=lambda x: x[0], reverse=True)
        return [it for s, it in scored[:topk] if s > 0]

    def cleanup(self, min_score=0.1):
        # Drop items whose decayed strength has fallen below the threshold.
        self.items = [it for it in self.items
                      if it.score * self._decay_factor(it) > min_score]

This design keeps retrieval cheap and ensures that weak or old memories are dropped automatically, preventing memory overload and stale context.
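A brief usage sketch (assuming the classes above; the example strings are illustrative):

store = MemoryStore(decay_half_life=1800)
store.add("preference", "User prefers short answers", 1.5)
store.add("topic", "Topic: cybersecurity and phishing", 1.2)

hits = store.search("recommend a cybersecurity post", topk=2)
for it in hits:
    print(it.kind, "|", it.content)
# The topic ranks first: its token overlap with the query ("cybersecurity")
# outweighs the preference's higher base score.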
Agent design and a mock LLM
The Agent class wraps the memory store and exposes two methods: perceive extracts preferences, topics, and projects from user input and stores them; act retrieves context for a query, passes it to a simple LLM simulator, logs the dialog, and triggers cleanup.
class Agent:
    def __init__(self, memory: MemoryStore, name="PersonalAgent"):
        self.memory = memory
        self.name = name

    def _llm_sim(self, prompt: str, context: List[str]):
        # Rule-based stand-in for an LLM: tone and content depend on retrieved memory.
        base = "OK. "
        if any("prefer short" in c.lower() for c in context):
            base = ""  # honor the stored preference for short answers
        reply = base + f"I considered {len(context)} past notes. "
        if "summarize" in prompt.lower():
            return reply + "Summary: " + " | ".join(context[:2])
        if "recommend" in prompt.lower():
            if any("cybersecurity" in c.lower() for c in context):
                return reply + "Recommended: write more cybersecurity articles."
            if any("rag" in c.lower() for c in context):
                return reply + "Recommended: build an agentic RAG demo next."
            return reply + "Recommended: continue with your last topic."
        return reply + "Here's my response to: " + prompt

    def perceive(self, user_input: str):
        # Extract preferences, topics, and projects from raw input and store them.
        ui = user_input.lower()
        if "i like" in ui or "i prefer" in ui:
            self.memory.add("preference", user_input, 1.5)
        if "topic:" in ui:
            self.memory.add("topic", user_input, 1.2)
        if "project" in ui:
            self.memory.add("project", user_input, 1.0)

    def act(self, user_input: str):
        # Retrieve context, generate a reply, log the exchange, prune weak memories.
        mems = self.memory.search(user_input, topk=4)
        ctx = [m.content for m in mems]
        answer = self._llm_sim(user_input, ctx)
        self.memory.add("dialog", f"user said: {user_input}", 0.6)
        self.memory.cleanup()
        return answer, ctx

The mock LLM adapts its replies based on detected preferences (for example, dropping the leading "OK." when a stored preference mentions short answers) and recommends topics when asked, using the retrieved memory as signal.
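In a production system, _llm_sim would be swapped for a real model call, with the retrieved memories injected as context. A minimal sketch, assuming the openai Python package (v1+) and an OPENAI_API_KEY in the environment; the model name is illustrative:

from typing import List
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def llm_call(prompt: str, context: List[str]) -> str:
    # Inject retrieved memories as system context so the model can personalize.
    system = "You are a personal assistant. Known user context:\n" + "\n".join(context)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any chat-capable model works
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
    )
    return resp.choices[0].message.content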
Measuring personalization benefit
To quantify how memory helps, the demo compares an agent with a populated memory against a cold-start agent, using the difference in response length as a rough proxy for personalization impact.
def evaluate_personalisation(agent: Agent):
    # Seed a strong preference, then compare against a cold-start agent.
    agent.memory.add("preference", "User likes cybersecurity articles", 1.6)
    q = "Recommend what to write next"
    ans_personal, _ = agent.act(q)
    empty_mem = MemoryStore()
    cold_agent = Agent(empty_mem)  # same logic, no accumulated memory
    ans_cold, _ = cold_agent.act(q)
    gain = len(ans_personal) - len(ans_cold)  # crude proxy, not a quality metric
    return ans_personal, ans_cold, gain

Demo run and observations
The final script initializes a MemoryStore with a short half-life for demo purposes, teaches the agent a few preferences and topics, then queries it and prints the reply, the memory that was used, and a final snapshot of the store. You can observe how the agent recommends different topics when it has memory versus when it is cold-started.
mem = MemoryStore(decay_half_life=60)  # short half-life so decay is visible in a demo
agent = Agent(mem)

print("=== Demo: teaching the agent about yourself ===")
inputs = [
    "I prefer short answers.",
    "I like writing about RAG and agentic AI.",
    "Topic: cybersecurity, phishing, APTs.",
    "My current project is to build an agentic RAG Q&A system."
]
for inp in inputs:
    agent.perceive(inp)

print("\n=== Now ask the agent something ===")
user_q = "Recommend what to write next in my blog"
ans, ctx = agent.act(user_q)
print("USER:", user_q)
print("AGENT:", ans)
print("USED MEMORY:", ctx)

print("\n=== Evaluate personalisation benefit ===")
p, c, g = evaluate_personalisation(agent)
print("With memory :", p)
print("Cold start :", c)
print("Personalisation gain (chars):", g)

print("\n=== Current memory snapshot ===")
for it in agent.memory.items:
    print(f"- {it.kind} | {it.content[:60]}... | score~{round(it.score, 2)}")

Practical takeaways
- Even simple rule-based memory plus decay produces meaningful personalization: storing preferences and topics helps the agent give more relevant recommendations.
- Decay prevents uncontrolled growth and forces reinforcement of important memories.
- A small evaluation loop (personalized vs. cold start) quantifies the benefit and helps tune which kinds of memories matter most.
This compact system is easy to extend: swap the token-based similarity for an embedding search, persist the MemoryStore to disk, or add reinforcement when a suggestion is accepted. The full demo shows how persistent memory transforms a static script into an evolving, context-aware companion.
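As a concrete example of those extensions, here is a minimal sketch of disk persistence and reinforcement, assuming the MemoryStore and MemoryItem classes above; the file format and the reinforce helper are illustrative choices, not part of the demo:

import json, time

def save_store(store: MemoryStore, path: str = "memory.json"):
    # Keep the original timestamps so decay state survives restarts.
    data = [{"kind": it.kind, "content": it.content, "score": it.score, "t": it.t}
            for it in store.items]
    with open(path, "w") as f:
        json.dump(data, f)

def load_store(path: str = "memory.json", decay_half_life=1800) -> MemoryStore:
    store = MemoryStore(decay_half_life=decay_half_life)
    with open(path) as f:
        for d in json.load(f):
            item = MemoryItem(d["kind"], d["content"], d["score"])
            item.t = d["t"]  # restore creation time, not load time
            store.items.append(item)
    return store

def reinforce(store: MemoryStore, substring: str, boost: float = 0.5):
    # Strengthen and refresh memories the user implicitly confirmed,
    # e.g. after they accept a recommendation.
    for it in store.items:
        if substring.lower() in it.content.lower():
            it.score += boost
            it.t = time.time()  # resetting the timestamp restarts the decay clock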