Agent Memory Design from Scratch: Building an AI Assistant That Actually Keeps Learning

Series: Agent Architecture Exploration → Part 6 of 8

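Have you ever had this experience:

Yesterday you spent two hours talking with an AI, explaining the project background and coaxing out a response style you liked. Today you open it again, and everything is back to zero.

The AI has not gotten dumber. It has a memory problem.

The good news: this problem can be solved entirely with engineering. Starting from zero, this article walks you through building an agent memory system that genuinely accumulates context over time.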

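Why AI Has No Memory

AI models are fundamentally stateless: each inference sees only the current input and retains nothing from outside the session.

This is a design decision, not a defect. Keeping state brings privacy problems, consistency problems, and storage costs.

But for the requirement “I want the AI to really understand my project,” statelessness is an obstacle. The solution is straightforward:

Put what needs to be remembered somewhere the AI reads every time.

A memory system is essentially a mechanism for persisting information and loading it into context at the right moment.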

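Minimal Viable Version (Ready in 5 Minutes)

Skip the architecture for now and get something working first.

Step 1: create CLAUDE.md in the project root:

# Project: My Blog

## Basics
- Tech stack: Astro 5 + Tailwind + Vercel
- Deploy command: git push (Vercel deploys automatically)
- Local development: npm run dev

## Important conventions
- Do not rename the files of published posts (it changes their URLs)
- Always run npm run build to verify before every push

Step 2: create a notes.md in the project to record what you did today:

# Work Log

## 2026-03-16
- Added the TagFilter component, which filters posts by tag
- Found a table styling problem in dark mode; fix pending
- Tomorrow: add series filtering

That is all. The next time you open Claude Code, it reads CLAUDE.md automatically. Then you say “continue yesterday’s work,” it reads notes.md, and it knows where things stand.

This minimal version solves about 80% of the problem.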
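The daily note-taking step can also be scripted. The sketch below is one possible helper, not part of the original setup: the name `append_note` and its parameters are hypothetical, and it assumes notes.md uses the `## YYYY-MM-DD` headings shown above, with the newest date always last in the file.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_note(notes_path: Path, items: list[str], day: Optional[date] = None) -> None:
    """Append bullet items under a '## YYYY-MM-DD' heading in notes.md.

    Hypothetical helper: creates the file with a '# Work Log' title if it is
    missing, and adds the date heading only if it is not in the file yet.
    Assumes the newest date heading is the last section of the file.
    """
    day = day or date.today()
    heading = f"## {day.isoformat()}"
    if notes_path.exists():
        text = notes_path.read_text(encoding="utf-8")
    else:
        text = "# Work Log\n"
    if not text.endswith("\n"):
        text += "\n"
    if heading not in text.splitlines():
        text += f"\n{heading}\n"
    text += "".join(f"- {item}\n" for item in items)
    notes_path.write_text(text, encoding="utf-8")
```

Calling `append_note(Path("notes.md"), ["Added TagFilter component"])` at the end of a work session would add one bullet under today's date.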

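Why Layer the Memory

As the project grows, CLAUDE.md accumulates more and more information, and the pieces start to interfere with each other:

Where do architecture decisions go?
Where do operating steps go?
Where does the bug you hit today go?
Where does the reasoning behind this function’s design go?

Stuff everything into one file and Claude’s attention gets diluted, so important information is hard to find.

The core idea of layering: store information according to how stable it is.

The more stable the information, the lower its layer; the more volatile, the higher its layer.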

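Five-Layer Architecture in Detail

memory/
├── character/     # most stable: role definition
├── abilities/     # stable: capability boundaries
├── skills/        # moderately stable: operating procedures
├── knowledge/     # moderately volatile: domain knowledge
└── logs/          # most volatile: log information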
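If you would rather create this directory tree from a script than by hand, a minimal sketch could look like the following. The function name `scaffold_memory` and the `layers` parameter are illustrative assumptions; the directory names come from the tree above.

```python
from pathlib import Path
from typing import Optional

# Layer directories, ordered from most stable to most volatile.
LAYERS = ["character", "abilities", "skills", "knowledge", "logs"]


def scaffold_memory(project_root: Path, layers: Optional[list[str]] = None) -> list[Path]:
    """Ensure memory/<layer>/ exists for each requested layer.

    Existing directories and their contents are left untouched.
    Returns the layer directories in the order they were requested.
    """
    layer_dirs = []
    for layer in layers or LAYERS:
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        layer_dir = project_root / "memory" / layer
        layer_dir.mkdir(parents=True, exist_ok=True)
        layer_dirs.append(layer_dir)
    return layer_dirs
```

Passing `layers=["logs"]` creates only the log directory, which fits an incremental rollout better than creating all five folders on day one.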

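Layer 0: Character

Defines “who” the agent is. It almost never changes, yet it sets the baseline for every judgment.

# character.md

## Role
I am ai-navigator-coordinator, the project management agent for AI时代漫游指南.

## Core principles
- Always verify with a build before every push; never skip it
- Do not change the slug of a published post
- Record architecture decisions in knowledge; make no implicit decisions

## Style
- Direct, not long-winded
- When a decision is uncertain, list the options first instead of deciding unilaterally

When to write it: once at project start, then only when a core constraint changes.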

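Layer 1: Abilities (Capability Declarations)

Declares what the agent can and cannot do, along with its current running state.

# heartbeat.yaml
last_seen: 2026-03-16T10:00:00
status: active

can_do:
  - manage blog content publishing
  - coordinate front-end feature iterations
  - maintain requirement status in tasks.yaml

cannot_do:
  - change the slug of a published post
  - push without build verification
  - delete any file under the memory/ directory

The key part is the cannot_do list: every entry is a pitfall someone already fell into.

When to write it: at initialization, then add a constraint here each time you find the agent did something it should not have.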
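The habit of adding one cannot_do entry per incident is easy to express in code. The sketch below assumes heartbeat.yaml has already been parsed into a plain dictionary by whatever YAML library you use; reading and writing the file is left out, and the function name `add_constraint` is hypothetical.

```python
def add_constraint(heartbeat: dict, constraint: str) -> bool:
    """Add a rule to heartbeat['cannot_do'] unless it is already listed.

    Returns True if the rule was added, False if it was empty or a duplicate.
    The 'cannot_do' key is created when the heartbeat does not have one yet.
    """
    rules = heartbeat.setdefault("cannot_do", [])
    normalized = constraint.strip()
    if not normalized or normalized in rules:
        return False
    rules.append(normalized)
    return True
```

Returning a boolean makes it easy to skip rewriting the file when nothing changed.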

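Layer 2: Skills (Operating Procedures)

Reusable SOPs that lock in workflows you have already validated.

# deploy-blog.md

## Deploy the blog

### Pre-checks
1. git pull to fetch the latest code
2. npm run build (must pass, exit 0)
3. Check that dist/ contains the expected pages

### Execute
git add -A && git commit -m "description" && git push

### Verify
Wait for the Vercel deployment (about 2 minutes), then visit the new page to confirm it is live

When to write it: the second time you perform an operation, write it up as a skill.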

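Layer 3: Knowledge (Structured Knowledge)

The project’s “brain”: architecture decisions, configuration notes, known issues.

Each file covers a single topic and stays under 100 lines.

knowledge/
├── site-architecture.md      # site structure and tech stack
├── seo-decisions.md          # record of SEO-related decisions
├── deployment-config.md      # deployment configuration notes
└── content-guidelines.md     # post publishing guidelines

When to write it: immediately after making an important decision, recording both the decision and the reason. Three months later, you (and the agent) can still understand why.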

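Layer 4: Information (Logs)

The most time-sensitive information: daily work logs and discussion records. These files live in memory/logs/.

# 2026-03-16_daily.md

### Morning
- Done: added series filtering to the TagFilter component
- Found: prose table styling broken in dark mode
  - Cause: Tailwind prose dark-mode variable override problem
  - Fix: added a .dark .prose thead th override in global.css

### To do
- [ ] Fix the mobile TOC collapse animation
- [ ] Add a featured-posts filter button

When to write it: before you finish work each day, taking no more than 15 minutes.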

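CLAUDE.md: The Entry Point to the Memory System

Memory files alone are not enough. You also need to tell Claude where to look.

# AI时代漫游指南: Project Overview

## Quick facts
- Site: my-project.example.com
- Tech stack: Astro 5 + Tailwind + Vercel
- Deployment: git push triggers an automatic Vercel build

## Conventions (the most important ones live here)
- Always verify with npm run build before deploying
- Do not change the slug of a published post

## What to look for, and where
- Progress from the last session → the most recent dated file in memory/logs/
- Deploy/publish procedures → memory/skills/
- Architecture and configuration → memory/knowledge/
- Role constraints → memory/character/character.md

Principle: CLAUDE.md is the index, not the content. The details always live in the corresponding memory layer.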

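Suggested Build Order

Do not build all five layers at once; that is hard to sustain. Go in stages:

Week 1: set up only CLAUDE.md and the logs

Spend 5 minutes a day writing the log
Make sure “continue yesterday’s work” functions properly

Week 2: when you hit a repeated operation, write it into skills

Each time you write a skill, delete the corresponding duplicated content from CLAUDE.md

Week 3: when you make an important decision, write it into knowledge

No need to force yourself to write; let events drive it

After that: add character constraints and abilities limits based on the problems you actually run into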

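A Decision Criterion

If you are unsure whether a piece of information belongs in the memory system, ask yourself:

If every conversation were wiped tomorrow, would the agent still need to know this?

Yes → put it in a memory file. No → leaving it in the current conversation is fine.

Memory is a cost, and more is not always better. Curate only the most critical context so the agent can work efficiently.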

Every AI conversation starts from zero. No memory of last week’s decisions, no accumulated context about your project, no knowledge of what you’ve already tried. Stateless by design.

This is the problem agent memory systems solve. Here’s how to build one — starting from a version you can set up in five minutes, scaling to a full architecture as the need grows.

Why AI Has No Memory

It’s not an oversight. Stateless design is intentional: it makes systems predictable, auditable, and easier to reason about. Each request is independent.

The cost is that every session starts cold. The AI doesn’t know your codebase conventions, your project history, your preferences, or what you decided last Tuesday.

The workaround: inject context into the session. The question is how to structure that context so it’s maintainable, accurate, and doesn’t bloat every prompt with irrelevant information.

Minimal Viable Version (5 Minutes)

Two files:

project/
├── CLAUDE.md        # project context
└── notes/
    └── daily.md     # running log

CLAUDE.md contains what the agent needs to know to work on this project: tech stack, conventions, architecture decisions, constraints. Not a comprehensive spec — just what it would take a competent developer 30 minutes to figure out by reading the code, condensed into a readable file.

daily.md is a running log of what’s happened: decisions made, problems solved, things tried. Append-only. Dated entries.

That’s it. Load both into context at the start of each session. You now have an agent that knows your project and remembers recent history.
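For a tool that does not pick these files up on its own, the loading step can be a few lines of code. This is a minimal sketch under stated assumptions: `build_context` is a hypothetical name, and the `=== path ===` separator is an arbitrary choice, not a format any particular tool requires.

```python
from pathlib import Path


def build_context(paths: list[Path]) -> str:
    """Concatenate memory files into one context string.

    Each file is introduced by a '=== <path> ===' line so the model can tell
    the sources apart. Missing files are skipped rather than raising.
    """
    sections = []
    for path in paths:
        if not path.is_file():
            continue
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"=== {path.as_posix()} ===\n{body}")
    return "\n\n".join(sections)
```

A session would then start with something like `build_context([Path("CLAUDE.md"), Path("notes/daily.md")])` prepended to the first prompt.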

Most projects never need more than this. Build the rest only when you have a specific problem the minimal version doesn’t solve.

Why Layer the Memory

The minimal version has a scaling problem: everything goes in one or two files, and eventually they get large, stale, and contradictory.

The core insight behind layered memory: information has different stability levels.

Your identity and principles: change rarely, if ever
Your capabilities and tools: change with major infrastructure updates
Your operating procedures: change when you find better approaches
Your domain knowledge: changes as projects evolve
Your daily activity: changes every day

Mixing these stability levels in the same file means you’re rewriting stable information every time you update volatile information, and vice versa. The layers separate them.

Five-Layer Architecture

memory/
├── character/    # who the agent is
├── abilities/    # what it can do
├── skills/       # how it does things
├── knowledge/    # what it knows
└── logs/         # what's happening

Layer 0: Character

Identity, core principles, communication style, decision-making defaults.

This is the agent’s personality, not its task list. It changes almost never. If you’re building a coordinator agent, this is where you define how it handles ambiguity, what it prioritizes when there’s a conflict, how it communicates.

Example content: “When requirements conflict, ask for clarification rather than guessing. Prefer reversible decisions. Write in direct, specific language.”

Layer 1: Abilities

Capability declarations and the heartbeat mechanism.

The heartbeat file (heartbeat.yaml) records the agent’s own state, including what it can and cannot do:

can_do:
  - manage project requirements
  - write to memory files
  - execute bash commands

cannot_do:
  - access external services directly
  - modify files outside the project directory

This layer also tracks what tools and integrations the agent has access to. It changes when infrastructure changes.

Layer 2: Skills

Operating procedures — SOPs for recurring tasks.

Write a skill the second time you do something. The first time, you’re figuring it out. The second time, you know enough to write the procedure. After that, the agent follows the procedure instead of rediscovering it.

Each skill file is one procedure: deploy a site, update the changelog, triage incoming requests. Short, step-by-step, immediately actionable.

Layer 3: Knowledge

Structured domain knowledge. One topic per file. Target: under 100 lines.

This is where you put things that are true for months at a time — architecture decisions, API documentation, system design rationale, known constraints. Not what’s happening (that’s Layer 4) — what’s true.

The under-100-lines rule forces specificity. If a file keeps growing, it’s probably covering multiple topics that should be split.
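The line limit is simple to check automatically. The following sketch lists knowledge files that have reached the limit; the function name `oversized_files` is hypothetical, and it assumes knowledge files are Markdown files sitting directly in one directory.

```python
from pathlib import Path

MAX_LINES = 100  # target from the text: files should stay under this


def oversized_files(knowledge_dir: Path, max_lines: int = MAX_LINES) -> list[tuple[str, int]]:
    """Return (file name, line count) for every .md file with max_lines or more lines."""
    flagged = []
    for path in sorted(knowledge_dir.glob("*.md")):
        count = len(path.read_text(encoding="utf-8").splitlines())
        if count >= max_lines:
            flagged.append((path.name, count))
    return flagged
```

Any file this returns is a candidate for splitting into narrower topics.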

Layer 4: Information

Daily logs, event-driven notes, session summaries.

This is the highest-velocity layer. Daily logs go here. So do notes from specific events: a debugging session, a design discussion, a deployment incident.

File naming: date-based (2026-03-15_daily.md) or event-based (2026-03-15_discuss_auth-design.md). The date makes retrieval easy; the event suffix makes it human-readable.
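Because ISO dates sort lexicographically, this naming scheme makes "find the latest log" a plain string sort. The sketch below shows one way to build and look up such names; `log_name` and `latest_log` are hypothetical helpers, and files whose names do not start with a date are ignored.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Glob pattern for names that start with YYYY-MM-DD_
DATED = "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_*.md"


def log_name(day: date, kind: str = "daily", topic: str = "") -> str:
    """Build a name such as 2026-03-15_daily.md or 2026-03-15_discuss_auth-design.md."""
    suffix = f"{kind}_{topic}" if topic else kind
    return f"{day.isoformat()}_{suffix}.md"


def latest_log(logs_dir: Path) -> Optional[Path]:
    """Return the date-prefixed file that sorts last, or None if there is none.

    Files sharing the same date are ordered by the rest of their name.
    """
    files = sorted(logs_dir.glob(DATED))
    return files[-1] if files else None
```

Note that when several files share a date, the winner is decided alphabetically by suffix, so add a time component to the name if that order matters.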

CLAUDE.md as the Index

With a five-layer structure, CLAUDE.md isn’t the content anymore — it’s the index.

It tells the agent how the memory system is organized, what each layer contains, and how to navigate it. The actual content lives in the layer files.

This keeps CLAUDE.md short and stable. It doesn’t need to change when you add a new skill or update a knowledge file. The index points to the content; the content lives where it belongs.

Phased Build Order

Don’t build all five layers at once. Build the layer you need, when you need it.

Week 1: Minimal viable (Layers 0 and 4 only)

CLAUDE.md with project context
Layer 0: write down the agent’s core principles
Layer 4: start a daily log

You have an agent with identity and memory. That’s enough to start.

Week 2: Add operating procedures (Layer 2)

When you do something for the second time, write the skill
Start with the most frequent task

Week 3: Formalize knowledge (Layer 3)

When you explain the same thing to the agent twice, that’s a knowledge file
Architecture decisions, system constraints, recurring context

Ongoing: Maintain Layer 1

Update abilities when tools change
Review heartbeat when the agent gets something wrong about its own capabilities

The Decision Criterion

When you’re unsure whether something belongs in memory: If all conversations were cleared tomorrow, would the agent still need this information to do its job?

If yes, it belongs in memory. If no, it can stay in the conversation.

This criterion also helps with placement. Information the agent needs every day goes in the layers that load every session (0, 1, 2). Information specific to a project phase goes in Layer 3. Recent events go in Layer 4.
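One way to turn this placement rule into a loading policy is sketched below. Only the "Layers 0, 1, 2 load every session" part comes from the text; selecting knowledge files by topic name and loading the N most recent logs are assumptions made for illustration, as are the names `session_files`, `topics`, and `recent_logs`.

```python
from pathlib import Path

ALWAYS_LOAD = ("character", "abilities", "skills")  # Layers 0, 1, 2


def _files(directory: Path, pattern: str) -> list[Path]:
    """Sorted regular files matching pattern; empty if the directory is missing."""
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def session_files(memory_dir: Path, topics: tuple[str, ...] = (), recent_logs: int = 1) -> list[Path]:
    """Pick the memory files to load for one session."""
    selected: list[Path] = []
    # Layers 0-2: everything, every session.
    for layer in ALWAYS_LOAD:
        selected += _files(memory_dir / layer, "*")
    # Layer 3 (assumed policy): only the topics named for the current phase.
    for topic in topics:
        candidate = memory_dir / "knowledge" / f"{topic}.md"
        if candidate.is_file():
            selected.append(candidate)
    # Layer 4 (assumed policy): the newest logs; dated names sort chronologically.
    if recent_logs > 0:
        selected += _files(memory_dir / "logs", "*.md")[-recent_logs:]
    return selected
```

The result can be passed to whatever routine assembles the session context, keeping the always-on layers small and the volatile layers bounded.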

What This Buys You

An agent with layered memory doesn’t forget your conventions. It doesn’t ask you to re-explain your architecture. It doesn’t repeat mistakes that are already documented. It gets more useful the longer you work with it, because the memory accumulates.

The investment is writing things down — which you should probably be doing anyway. The return is an AI collaborator that actually knows your project.

Start with the minimal version. Add layers when you hit the ceiling. The architecture scales with the work.