## Asset Header

- **Asset ID:** PBO-BVH-AIAgentMemoryArchitecture-v01
- **Version:** v01
- **Status:** Draft
- **Owner:** Victor Heredia
- **IntellBank:** IB-BVH-Publications
- **Type:** PBO — PlayBook Operativo
- **Purpose:** PBO BVH AIAgentMemoryArchitecture v01
- **Last updated:** 2026-04-11

---

**Executive Summary**

Out of the box, AI models are stateless: each new conversation starts as a blank slate because nothing persists between calls. To build robust AI applications, we must implement a memory system that passes conversation history into the context window and persists key information. While many assume this requires complex retrieval pipelines or vector databases, highly effective systems can be built from simple markdown files combined with precise read/write mechanisms. This playbook outlines a framework for engineering agent memory, handling context limits, and building the necessary infrastructure based on industry-leading patterns.

--------------------------------------------------------------------------------

### 1. The Core Memory Architecture

A robust memory system is divided into two main categories: the **Session** (the history of a single active conversation) and **Long-term Memory** (the categorized information that survives after the session ends).

According to Google's 2025 white paper on context engineering, long-term agent memory should be structured into three distinct types:

- **Episodic Memory:** Records of events or past interactions with the user (e.g., "what happened in our last conversation").
- **Semantic Memory:** Pure facts, stable identity information, and user preferences (e.g., "what the LLM knows about the user").
- **Procedural Memory:** Workflows, learned routines, and knowledge of how to accomplish specific tasks.

### 2. Infrastructure & Storage (The Markdown Approach)

You do not strictly need specialized vector databases to start; systems like OpenClaw and Claude Code successfully use plain markdown files.
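The three long-term memory types above can be mapped directly onto a file layout. A minimal sketch in Python — all paths and names here are illustrative assumptions, not a standard prescribed by the playbook:

```python
from pathlib import Path
from datetime import date

# Illustrative mapping of the three memory types onto markdown files.
# Every path below is a hypothetical example layout.
MEMORY_ROOT = Path("agent_memory")

MEMORY_LAYOUT = {
    # Semantic: stable facts and preferences, injected into every prompt.
    "semantic": MEMORY_ROOT / "memory.md",
    # Episodic: append-only daily logs plus per-session snapshots.
    "episodic_daily": MEMORY_ROOT / "daily" / f"{date.today().isoformat()}.md",
    "episodic_snapshots": MEMORY_ROOT / "snapshots",
    # Procedural: learned workflows and task routines.
    "procedural": MEMORY_ROOT / "procedures.md",
}

def ensure_layout() -> None:
    """Create the directories and empty files the agent expects."""
    for path in MEMORY_LAYOUT.values():
        if path.suffix == ".md":
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch(exist_ok=True)
        else:
            path.mkdir(parents=True, exist_ok=True)
```

Keeping each memory type in its own plain-text location is what makes the later read/write mechanisms (bootstrap loading, flushes, snapshots) simple to implement.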
You can model your storage using three core components:

- **The Semantic Store (`memory.md`):** Contains stable facts and user preferences. This file should have a structural cap (e.g., 200 lines) and is injected into every single prompt.
- **Daily Logs (Episodic):** Append-only files containing recent context, organized by day.
- **Session Snapshots (Episodic):** Raw markdown files capturing the last ~15 meaningful user and assistant messages (excluding tool calls and system messages) before a session is wiped.

### 3. Read/Write Mechanisms (The Agentic Triggers)

Files are useless unless they are read and written at the right moments. Implement these four core mechanisms to manage the data lifecycle:

1. **Bootstrap Loading:** At the start of every session, automatically inject the Semantic Store (`memory.md`) into the system prompt. Instruct the agent to read the Daily Logs from today and yesterday for immediate recent context.
2. **Pre-Compaction Flush:** Before a session hits the LLM's context window limit, inject a silent, invisible agentic turn instructing the LLM to save all vital information to the Daily Log. This acts as a write-ahead log, turning potential context loss into a safe checkpoint.
3. **Session Snapshot Hook:** When a user resets or starts a new session, use a system hook to grab the meaningful messages from the previous session, generate a descriptive file name, and save the result as a snapshot before the context is wiped.
4. **User-Directed Routing:** When a user explicitly says "remember this," rely on the agent's file-writing capabilities and system instructions to route the information to either the Semantic Store or the Daily Log.

### 4. Context Window Management (Compaction)

Because LLMs have finite context windows, you must implement **compaction**—the process of shrinking conversation history down to its most relevant parts so the session can continue.
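Before picking a trigger, you need a way to measure session size. A minimal count-based check, assuming a rough 4-characters-per-token heuristic (a common approximation — not from this playbook; a real tokenizer should replace it in production):

```python
# Rough count-based compaction check. The 4-chars-per-token ratio is a
# crude heuristic, not an exact tokenizer; swap in a real tokenizer
# for production use.
CHARS_PER_TOKEN = 4

def estimate_tokens(messages: list[dict]) -> int:
    """Approximate the token footprint of a message history."""
    return sum(len(m.get("content", "")) for m in messages) // CHARS_PER_TOKEN

def should_compact(messages: list[dict],
                   token_budget: int = 100_000,
                   max_turns: int = 200,
                   safety_margin: float = 0.8) -> bool:
    """Fire compaction before the hard limit, leaving headroom for the
    pre-compaction flush to write its checkpoint to the Daily Log."""
    over_tokens = estimate_tokens(messages) >= token_budget * safety_margin
    over_turns = len(messages) >= max_turns
    return over_tokens or over_turns
```

The `safety_margin` matters: if compaction fires exactly at the limit, there is no room left for the flush turn that saves vital context.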
Choose one of these three triggers:

- **Count-based:** Fires when the session exceeds a specific token size or turn count.
- **Time-based:** Fires in the background when the user stops interacting for a set period.
- **Event-based (Semantic):** Fires when the agent detects that a task or topic has concluded. This is the most intelligent trigger but the most difficult to implement.

### 5. Data Maintenance & Quality Control

To prevent your agent's memory from becoming a noisy, contradictory mess, the system must process data intelligently, typically using a secondary, background LLM instance:

- **Targeted Filtering:** Do not save everything. The system must extract only key concepts and facts.
- **Consolidation & Overwriting:** If a user says they prefer dark mode, then says they hate it, and later switches back, the memory system must collapse these entries into a single updated entry ("User prefers dark mode") rather than storing three conflicting statements.

**Summary Directive for the Team:** Every memory implementation you build should answer three simple questions: _What is worth remembering, where does it go, and when does it get written?_
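As a closing sketch, the write path answers the three questions above in code: what gets written (the fact), where it goes (Semantic Store vs. Daily Log), and when (on flush or explicit user request). Function names and the file layout are illustrative assumptions, not a prescribed API:

```python
from datetime import datetime, date
from pathlib import Path

DAILY_DIR = Path("agent_memory/daily")          # append-only episodic logs
SEMANTIC_FILE = Path("agent_memory/memory.md")  # capped semantic store
SEMANTIC_CAP_LINES = 200                        # structural cap from the playbook

def append_daily_log(note: str) -> Path:
    """Pre-compaction flushes and routine observations land in today's log."""
    DAILY_DIR.mkdir(parents=True, exist_ok=True)
    log = DAILY_DIR / f"{date.today().isoformat()}.md"
    with log.open("a", encoding="utf-8") as f:
        f.write(f"- {datetime.now():%H:%M} {note}\n")
    return log

def remember(fact: str, stable: bool = False) -> None:
    """User-directed routing: stable facts go to the semantic store
    (respecting its line cap); everything else goes to the daily log."""
    if not stable:
        append_daily_log(fact)
        return
    SEMANTIC_FILE.parent.mkdir(parents=True, exist_ok=True)
    lines = (SEMANTIC_FILE.read_text(encoding="utf-8").splitlines()
             if SEMANTIC_FILE.exists() else [])
    if len(lines) >= SEMANTIC_CAP_LINES:
        # Cap reached: the background consolidation pass must run first.
        raise RuntimeError("memory.md is at its cap; consolidate before adding")
    lines.append(f"- {fact}")
    SEMANTIC_FILE.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Note how the cap on `memory.md` forces the consolidation step described in section 5: when the store fills up, conflicting or stale entries must be collapsed before new facts can be written.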