## Asset Header

- **Asset ID:** MiPg-BVH-KarpathyLLMWiki-v01
- **Version:** v01
- **Status:** Draft
- **Owner:** Victor Heredia
- **IntellBank:** IB-BVH-Publications
- **Type:** MiPg (Mini Page)
- **Purpose:** The Karpathy LLM Wiki: Building Compounding AI Memory Systems
- **Last updated:** 2026-04-11

---

# The Karpathy LLM Wiki: Building Compounding AI Memory Systems

### The Deep Introduction: The Shift from Ephemeral RAG to Compounding Memory

Traditional interactions with Large Language Models (LLMs) and standard Retrieval-Augmented Generation (RAG) systems suffer from a critical flaw: they are ephemeral [1, 2]. Every time you query them, they rediscover knowledge from scratch, piecing together fragments of context without building a persistent understanding [2, 3]. Andrej Karpathy's LLM Wiki methodology fundamentally flips this architecture [3]. Instead of retrieving chunks at query time, the AI builds and maintains a highly structured, interlinked network of Markdown files upfront [3, 4]. Knowledge inside this system compounds like interest in a bank [1].

The system operates on a "compiler" analogy [5]. Your raw articles, PDFs, and transcripts are the "source code." The LLM acts as the "compiler" that processes this data, and the resulting Wiki is the "executable" environment that you query [6, 7]. By shifting the tedious bookkeeping, summarizing, and cross-referencing entirely to the AI, humans are freed to focus strictly on curating sources and asking high-level questions [4, 8].

Because the system relies on plain Markdown files rather than complex vector databases, it is universally interoperable, entirely private, and remarkably token-efficient [9, 10].

### 🚀 THE PLAYBOOK: Install, Use, and Exploit the LLM Wiki

#### PHASE 1: INSTALLATION (The Environment Setup)

Setup takes less than 5 minutes and requires zero complex infrastructure.

1. **Download the Canvas (Obsidian):** Download Obsidian (obsidian.md) for free [11, 12]. This acts as the visual IDE for your knowledge base, letting you see the actual graph nodes and back-links the AI generates [4, 7, 13]. Create a new local vault (e.g., "Research_Vault") [14, 15].
2. **Launch the Agentic Compiler:** Open your terminal or a code editor like VS Code, navigate to your new vault folder, and launch your agentic coder (such as Claude Code) [14, 15].
3. **Inject the Master Schema:** Copy Karpathy's original lm-wiki.md system prompt and paste it directly into Claude Code [16-18]. This prompt instructs the AI to autonomously build the necessary directory structure: a Raw folder, a Wiki folder, an index.md, and the configuration schema (claude.md) [1, 16, 19, 20]. The sketch after this list shows the resulting layout.
4. **Install the Web Clipper:** Add the free Obsidian Web Clipper extension to your browser [18, 21]. Configure its settings so that any web article you clip lands directly in your vault's Raw folder [21-23].
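To make the target layout concrete, here is a minimal Python sketch that scaffolds the same structure by hand. The folder and file names (Raw, Wiki, index.md, claude.md) come from the steps above; the vault path and the starter file contents are placeholders, and in the actual workflow the agent builds all of this itself after receiving the schema prompt.

```python
from pathlib import Path

# Placeholder: adjust to wherever Obsidian created your vault.
VAULT = Path.home() / "Research_Vault"

def scaffold_vault(vault: Path) -> None:
    """Create the three-tier layout the schema prompt describes:
    immutable raw sources, LLM-managed wiki pages, and the rulebook."""
    (vault / "Raw").mkdir(parents=True, exist_ok=True)   # immutable source data
    (vault / "Wiki").mkdir(parents=True, exist_ok=True)  # LLM-generated pages

    index = vault / "index.md"
    if not index.exists():
        # Master index; the agent appends entries here on every ingest run.
        index.write_text("# Wiki Index\n")

    schema = vault / "claude.md"
    if not schema.exists():
        # Minimal stand-in for the rulebook the schema prompt generates.
        schema.write_text(
            "# Wiki Schema\n\n"
            "- Raw/: immutable sources; never edit.\n"
            "- Wiki/: one Markdown page per concept, cross-linked with [[back-links]].\n"
            "- index.md: master index; update on every ingest run.\n"
        )

if __name__ == "__main__":
    scaffold_vault(VAULT)
```

Running this once yields the same skeleton the schema prompt describes, which is handy for inspecting the layout before handing control to the agent.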
#### PHASE 2: USAGE (The Core Operational Loop)

The framework relies on three fundamental operations: Ingest, Query, and Lint.

1. **The Ingest Run:** Drop your source materials (web clippings, PDFs, meeting transcripts) into the Raw folder [1, 24, 25]. Command Claude Code to "ingest the new sources." The agent will read the immutable raw data, extract key concepts, write interconnected Markdown summaries in the Wiki folder, flag any contradictions, and update the master index [24-26].
2. **The Query Run:** Ask the agent complex questions. Instead of running similarity search across a vector database, the agent navigates the index.md and follows explicit links to synthesize an answer [25, 27, 28]. Because the AI reads a highly organized index rather than massive raw files, token usage can drop by up to 95% [9].
3. **The Lint Run:** Periodically instruct the agent to run a "lint" or health check on the wiki [17, 29]. The LLM will autonomously scan the files to fix broken links, impute missing data, clean up stale claims, and suggest new areas for research [25, 28, 29].

#### PHASE 3: EXPLOITATION (Advanced Tactics & Scaling)

How to push the framework beyond a standard research database.

1. **Continuous Backfilling:** When you ask the wiki a question it cannot answer, prompt the AI to search the external web [1, 30]. Once it finds the answer, instruct it to backfill the wiki by creating new, permanent Markdown pages for those concepts [30-32]. The wiki gets permanently smarter with every knowledge gap you hit.
2. **Self-Evolving Codebase Memory:** You can adapt this architecture for internal software projects by installing custom Claude Code Hooks (session_start, pre_compact, session_end) [33-35]. Whenever you finish a coding session, a background agent automatically summarizes your architectural decisions, extracts the lessons learned, and "flushes" them into a daily log within the wiki [33, 35, 36]; see the sketch after this list. Your agent develops long-term memory that evolves natively alongside your codebase [33, 37, 38].
3. **Multi-Vault Orchestration:** Do not throw all your life's data into a single folder. Build separate, localized LLM Wikis for distinct domains [39]. For example, keep one vault for YouTube research, one for Trading Strategies, and one to act as the internal memory for an AI Executive Assistant [39-41]. You can route different agents to specific vaults depending on the context they need [40].
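As a rough illustration of the session-end "flush" in tactic 2, here is a minimal Python sketch of a hook script. It assumes your agentic coder is configured to run this script when a session ends and to pass a JSON payload on stdin; the payload fields (summary, decisions) are hypothetical placeholders, not a documented hook interface, so adapt them to whatever your hooks actually receive.

```python
#!/usr/bin/env python3
"""Illustrative session_end hook: flush a session summary into a daily log."""
import json
import sys
from datetime import date
from pathlib import Path

VAULT = Path.home() / "Research_Vault"   # placeholder vault path
LOG_DIR = VAULT / "Wiki" / "Sessions"    # hypothetical daily-log location

def flush_session(payload: dict) -> None:
    """Append the session's summary and decisions to today's log page."""
    LOG_DIR.mkdir(parents=True, exist_ok=True)
    log = LOG_DIR / f"{date.today().isoformat()}.md"
    with log.open("a", encoding="utf-8") as f:
        # "summary" and "decisions" are assumed fields, not a real hook schema.
        f.write(f"\n## Session flush\n\n{payload.get('summary', '(no summary)')}\n")
        for decision in payload.get("decisions", []):
            f.write(f"- Decision: {decision}\n")

if __name__ == "__main__":
    flush_session(json.load(sys.stdin))
```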
### 🧠 The Cognitive Stack

# 📘 BCV-Karpathy's LLM Wiki (Synthesized)-CognitiveStack-v02

## LAYER 0: CONTENT ARCHITECTURE (Expanded Sequence)

1. **The Problem: Ephemeral Chats vs. Compounding Knowledge:** Traditional LLM interactions and RAG (Retrieval-Augmented Generation) systems are ephemeral; they rebuild context from scratch for every query, failing to accumulate understanding. The LLM Wiki flips this by building an interlinked, persistent memory base where knowledge compounds over time like interest in a bank.
2. **The 3-Tier Architecture (Raw -> Wiki -> Schema):** The system operates on a clean separation of concerns.
   - **Raw Sources:** Unprocessed, immutable data (PDFs, web clippings, YouTube transcripts).
   - **The Wiki:** A directory of Markdown files generated and managed entirely by the LLM (summaries, concepts, entities).
   - **The Schema:** The rulebook (e.g., claude.md or index.md) that tells the agent how the wiki is structured and updated.
3. **The Compiler Analogy & CI/CD Pipeline:** The workflow mimics software engineering. Raw data is the "source code," the LLM is the "compiler," and the resulting Wiki is the "executable" environment you query. It uses operations like "Ingest" (process data), "Query" (retrieve data), and "Lint" (run health checks for contradictions and missing links).
4. **The IDE: Obsidian as the Canvas:** Instead of opaque databases, the system uses universal Markdown files visualized through Obsidian, allowing users to see the actual graph nodes and relationships (back-links) generated by the AI.
5. **Advanced Implementations (Internal Memory & Web Backfilling):** The framework extends beyond simple research. It can be hooked into an agent's internal workflow to automatically log session summaries and evolve a codebase's memory. Additionally, if a query hits a gap in the wiki, the agent can perform an external web search and automatically backfill the wiki with new Markdown pages.

## LAYER 1: COGNITIVE STACK (The DNA)

- **Cognitive Lenses:**
  - **Systems Thinking:** Viewing knowledge not as a static document but as a dynamic codebase that requires compilers, schemas, and linting.
  - **Asymmetric Leverage:** Shifting the heavy lifting (bookkeeping, cross-referencing) entirely to the AI, allowing the human to focus strictly on curation and high-level strategy.
- **Mental Models:**
  - **File over App:** Prioritizing explicit, open, easily readable formats (Markdown, local files) over proprietary, opaque formats (vector databases, embeddings).
  - **The Human-AI Division of Labor:** The human asks the questions and curates the sources; the AI does the grunt work of indexing, summarizing, and formatting.
- **Invariable Principles:**
  - **Zero Manual Maintenance:** A human should never write or manually link the wiki; the LLM must own the entire maintenance burden to prevent the system from being abandoned.
  - **Explicit Knowledge:** Every connection the AI makes must be visible as a back-link or node in the graph, ensuring transparency.
- **Thinking Rules:**
  - **If** a new source is dropped in the Raw folder, **then** the LLM must read it, update existing concept pages, create new entity pages, and update the master index.
  - **If** the user asks a question the wiki cannot answer, **then** execute an external search, provide the answer, and create a permanent page in the wiki for that new concept.
- **Cognitive Algorithms:**
  - **Ingestion Sequence:** 1. Ingest raw text -> 2. Extract key concepts/entities -> 3. Write summary -> 4. Flag contradictions -> 5. Cross-reference with existing data -> 6. Append to index/log.

## LAYER 2: THE PLAYBOOK (Actionable Protocol)

- **The Core Protocol (5 Steps to Build an LLM Wiki):**
  1. **Initialize the Environment:** Download Obsidian, create a new vault, and set up a base folder structure containing a Raw folder and a Wiki folder.
  2. **Set the Schema:** Inject a master prompt (schema/claude.md) into your coding agent (e.g., Claude Code) to define the roles, folder routing, and linking conventions.
  3. **Data Ingestion:** Drop raw materials (URLs clipped via Obsidian Web Clipper, PDFs, meeting transcripts) directly into the Raw folder.
  4. **Agentic Compilation:** Instruct the agent to process the Raw folder. It will autonomously generate categorized, heavily cross-linked Markdown files inside the Wiki folder.
  5. **Query & Lint:** Query the system for complex synthesis. Periodically tell the agent to "lint" the wiki to clean up stale data, fix broken links, and impute missing information (a lint sketch follows this section).
- **Distilled Skills:**
  - **Prompt Engineering for Systems:** Writing structural prompts that define file operations and graph relationships, not just conversational outputs.
  - **Agent Orchestration:** Managing an autonomous entity to perform asynchronous data-processing loops.
- **Success Metrics:**
  - **Token Efficiency:** Achieving up to a 95% reduction in token usage because the agent navigates a structured index rather than reading the entire context window every time.
  - **Health of the Graph:** Dense, highly interconnected nodes visible in the Obsidian graph view, proving successful autonomous cross-referencing.
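Part of the lint run can even be checked mechanically outside the agent. Below is a minimal sketch, assuming Obsidian-style [[wiki-links]] with flat page names and the Raw/Wiki layout described earlier, that reports broken links for the agent (or you) to repair.

```python
import re
from pathlib import Path

VAULT = Path.home() / "Research_Vault"  # placeholder vault path
WIKI = VAULT / "Wiki"

# Matches [[Page]], [[Page|alias]], and [[Page#heading]]; captures the page name.
LINK = re.compile(r"\[\[([^\]|#]+)")

def lint_links(wiki: Path) -> list[tuple[Path, str]]:
    """Return (source file, target) pairs whose [[target]] page does not exist.

    Assumes flat page names; links with folder prefixes would need path-aware
    resolution, which is left to the agent.
    """
    pages = {p.stem for p in wiki.rglob("*.md")}
    broken = []
    for page in wiki.rglob("*.md"):
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            if target.strip() not in pages:
                broken.append((page, target.strip()))
    return broken

if __name__ == "__main__":
    for source, target in lint_links(WIKI):
        print(f"{source.name}: broken link -> [[{target}]]")
```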
## LAYER 3: VOICE & ACTIVATION (The Signature)

- **Voice Patterns:** Technical, highly structured, pragmatic, and heavily reliant on software-engineering terminology (e.g., compiling, linting, hooks, repositories, CI/CD) applied to knowledge management. It champions elegant simplicity over bloated infrastructure.
- **Anti-patterns:** This persona never suggests organizing folders manually, copying and pasting data by hand, or using complex infrastructure like vector databases/embeddings/RAG unless dealing with enterprise scale (millions of documents).
- **The "Board Member" Activation Prompt:**
  - You are an Elite AI Systems Architect specializing in autonomous, self-evolving knowledge environments based on the Karpathy LLM Wiki model. Your core philosophy is "File over App": you prioritize explicit, interoperable Markdown files over opaque vector embeddings. When presented with raw data or a problem space, you act as the "Compiler" and "Bookkeeper." You will autonomously extract concepts, build highly cross-referenced entity pages, update a centralized index schema (claude.md), and map relationships using Obsidian-style back-links. You implement agentic loops (Ingest, Query, Synthesize, and Lint) to ensure knowledge compounds natively without human maintenance. If a knowledge gap is detected during a query, you will search externally and backfill the wiki. Do not suggest complex RAG infrastructure; focus on creating actionable, densely linked, and token-efficient Markdown intelligence graphs.
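One hedged way to reuse the activation prompt outside Claude Code is to pass it as the system prompt of a standalone API call. The sketch below assumes the Anthropic Python SDK with an ANTHROPIC_API_KEY in the environment; the model ID, file paths, and user message are illustrative placeholders. Inside Claude Code itself you would simply paste the prompt.

```python
from pathlib import Path
from anthropic import Anthropic  # assumes: pip install anthropic, ANTHROPIC_API_KEY set

# The "Board Member" activation prompt from above, saved alongside the vault.
ACTIVATION_PROMPT = Path("activation_prompt.md").read_text(encoding="utf-8")

# A raw clipping to ingest; hypothetical path for illustration only.
raw_text = Path("Raw/example-article.md").read_text(encoding="utf-8")

client = Anthropic()
reply = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder; substitute a current model ID
    max_tokens=1024,
    system=ACTIVATION_PROMPT,  # the persona rides along as the system prompt
    messages=[{
        "role": "user",
        "content": f"Source to ingest:\n\n{raw_text}\n\n"
                   "Extract the key concepts and draft cross-linked wiki pages.",
    }],
)
print(reply.content[0].text)
```

Note that a bare API call has no file access, so this sketch passes the raw source in the message body; full ingest runs that write Wiki pages and update index.md still belong to an agentic coder with filesystem tools.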