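titor c4a0e3ef53 feat: v2.3.0 streaming output + logging system + full upgrade of the meeting-room architecture

- Streaming output: SSE token-by-token reception; text is buffered per `\n\n` paragraph, then rendered in color by mdprint
- Logging: charmbracelet/log v2 with dual writes (stderr + log.yml), plus the `yunshu log` command
- Meeting-room architecture: dialog (main) + weather/profile/note (sub) multi-agent orchestration
- Generic tool registration: NewTool[T] derives the JSON Schema via reflection, type-safe
- Security hardening: safeMemoryPath three-stage validation (EvalSymlinks + Rel), maxToolCalls=2
- Performance: sync.Once lazy loading, note completes in one step, obs/summary merged
- Prompt adaptation: streaming-output rule (call tools first, say nothing beforehand); single-agent queries skip obs + summary
- Docs: AGENTS.md + architecture.md + changelog.md all synced to v2.3.0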

2026-05-16 17:21:29 +08:00

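云枢 (Yunshu) Agent Meeting-Room Architecture Plan

Generated: 2026-05-11
Last updated: 2026-05-16
Purpose: upgrade from a single-agent architecture to "meeting-room mode" (1 host + N domain experts + a shared blackboard)
End goal: once validated on Yunshu, port the design to HxClaw (河虾 Claw)

Implementation status:

✅ Core engine (registry + runtime + cache + session)

✅ 7 tools (task, memory.read/write, http-get, skill, read-file, geocode)

✅ Generic + reflection-based tool registration (NewTool[T])

✅ Multi-step orchestration (the main agent calls several sub-agents in a row)

✅ weather-sub.md weather sub-agent

❌ memory-sub.md memory-keeper sub-agent

❌ Extension sub-agents such as earthquake / train / hotel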

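1. Architecture Overview

        User
          │
    ┌──────▼───────────────────────────────────────────────────────┐
    │  Host (dialog-agent), type: main                             │
    │  Persona + dispatch rules + task + memory tools              │
    │  Sole entry point; the user only talks to it                 │
    │  ✨ May call several sub-agents in a row, then               │
    │     combine their data into one answer                       │
    └──────┬───────────────────────────────────────────────────────┘
           │ task("weather", {city: "Beijing"})                  ← step 1
           │    ← Beijing tomorrow: 5°C, sunny
           │ task("train", {city: "Beijing", date: "tomorrow"})  ← step 2 (decided after seeing the weather)
           │    ← G102 08:00 ¥680
           │ task("hotel", {city: "Beijing", nights: 3})         ← step 3
           │    ← Jianguo Hotel ¥500/night
           │ → combined: "Beijing 5°C tomorrow… G102 at 8 am… Jianguo Hotel…"  ← final answer
           ▼
    ┌──────────────────────────────────────────────────────────────┐
    │  Speakers (domain sub-agents), type: sub                     │
    │  weather / earthquake / memory / narrator                    │
    │  Speak only when called; return text + optional cache data   │
    │  Each has its own isolated cache / skills / tools            │
    │  Unaware of other sub-agents; the main agent merges results  │
    └──────────────────────────────────────────────────────────────┘
          │ read/write
          ▼
    ┌──────────────────────────────────────────────────────────────┐
    │  Recorder (memory system)                                    │
    │  Shared blackboard: user profile, preferences, error records │
    │  The memory agent extracts valuable facts from the dialog    │
    │  All agents may read; only the memory agent writes           │
    └──────────────────────────────────────────────────────────────┘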

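2. Role Definitions

2.1 Host: dialog-agent (type: main)

Responsibilities:

The user's only entry point
A personal assistant with a real personality, able to make small talk
Identifies user intent and may dispatch different sub-agents several times in a row
After each task() returns, decides whether to call the next sub-agent or compose the final answer (multi-step orchestration)
Passes the previous sub-agent's result as context in the arguments of the next task() call
Reads and writes memory (user profile, context summary)

Tools:

task: dispatch a sub-agent
memory.read: read long-term memory
memory.write: write long-term memory

The System Prompt contains:

Persona (loaded from memory.personality)
Dispatch rules (when to call which sub-agent)
No domain knowledge of any kind

The System Prompt does not contain:

Weather knowledge, earthquake knowledge, and the like
Concrete tools such as http-get and geocode

Session:

session.json stores only the user ↔ dialog turns
tool_calls made inside a sub-agent are not written to it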

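2.2 Speaker: weather-sub (type: sub)

Responsibilities:

Answers weather queries
Runs only when invoked through task; never faces the user directly
Returns display text + optional cache data

Tools: http-get, geocode, skill

Frontmatter:

name: weather
type: sub
description: Weather query expert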
cache:
  ttl: 7200
  keys: ["city", "forecast_type"]
tools:
  - http-get
  - geocode
  - skill

2.3 Speaker: earthquake-sub (type: sub, reserved)

Responsibility: answers earthquake information queries

Frontmatter:

name: earthquake
type: sub
description: Earthquake information lookup
cache:
  ttl: 300
  keys: ["region", "time_range"]
tools:
  - http-get
  - skill

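2.4 Recorder: memory-sub (type: sub)

Responsibilities:

Reads the conversation history and extracts the user profile
Writes valuable information to the long-term memory database
Answers memory queries from other agents
Records sub-agent errors (for example, a failing API)

Tools: memory.read, memory.write, read-file, write-file

Frontmatter:

name: memory
type: sub
description: Memory keeper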
tools:
  - memory.read
  - memory.write
  - read-file
  - write-file

2.5 Reporter: narrator-sub (type: sub, maturity stage)

Responsibility: turns structured data into a personalized answer

name: narrator
type: sub
description: Personalized answer generator
tools:
  - memory.read

3. Core Tool: task

3.1 Responsibilities

task(agent_name, arguments)
  │
  ├── 1. Load the Frontmatter of {agent_name}-sub.md
  │       ├── name, type, tools, cache
  │       ├── cache.keys → ["city", "forecast_type"]
  │       └── cache.ttl  → 7200
  │
  ├── 2. Build the cache key
  │       ├── Walk cache.keys → pull each value from arguments
  │       ├── Join → "city=Beijing&forecast_type=today"
  │       └── Hash → "a1b2c3d4e5f6"
  │
  ├── 3. Read the cache file ~/.config/yunshu/cache/{agent_name}.json
  │       ├── HIT  → cache_data = {temp: 25, ...}
  │       └── MISS → cache_data = null
  │
  ├── 4. Call the sub-agent LLM
  │       ├── system = contents of {agent_name}-sub.md
  │       ├── user = {
  │       │     "args": arguments,
  │       │     "cache_data": cache_data  // cached data on a hit, null on a miss
  │       │   }
  │       └── The sub-agent returns an optional ---RESULT--- JSON block + a ---TEXT--- block (see 3.3)
  │
  ├── 5. Handle the sub-agent's reply
  │       ├── ---RESULT--- present → extract the JSON → write it to the cache
  │       └── ---RESULT--- absent  → pass on the text only
  │
  └── 6. Return the display text to the Host (the dialog agent's LLM)
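
Step 2 of the flow above (building the cache key) can be sketched in Go as follows. This is a minimal illustration, not the project's actual code: the function name `CacheKey`, the `k=v` formatting joined with `&`, and the choice to format a missing argument as an empty value are assumptions. The key order taken from cache.keys and the 12-character SHA-256 prefix follow the description in this document.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// CacheKey builds the lookup key for one task() call.
// keys comes from the sub-agent's cache.keys Frontmatter field,
// args is the argument object passed to task().
// (Hypothetical helper; the real runtime.go may differ.)
func CacheKey(keys []string, args map[string]any) string {
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		v, ok := args[k]
		if !ok {
			v = "" // assumption: a missing argument counts as an empty value
		}
		parts = append(parts, fmt.Sprintf("%s=%v", k, v))
	}
	// Arguments that are not listed in keys never reach this string.
	joined := strings.Join(parts, "&") // e.g. "city=Beijing&forecast_type=today"
	sum := sha256.Sum256([]byte(joined))
	// Keep the first 12 hex characters (6 bytes) of the digest.
	return hex.EncodeToString(sum[:])[:12]
}
```

Because only the names listed in cache.keys take part in the key, extra arguments such as `units` do not fragment the cache, while a change in `forecast_type` produces a different entry.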

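3.2 Cache File Format

// ~/.config/yunshu/cache/{agent_name}.json
{
  "<hash>": {
    "created_at": "2026-05-11T06:00:00+08:00",
    "ttl": 7200,
    "data": {
      "temp": 25,
      "condition": "sunny"
    },
    "raw": {
      "city": "Beijing",
      "forecast_type": "today"
    }
  }
}

hash: the values named by cache.keys are pulled from arguments, joined, and hashed with SHA-256; the first 12 hex characters are kept
raw: stores the original arguments, which helps with debugging and iteration
Expired entries are purged lazily on every cache read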
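
The entry layout and the lazy purge described above might look like this in Go. It is a sketch under assumptions: the type `CacheEntry` and the functions `ParseCacheFile` and `Lookup` are hypothetical names, and `ttl` is taken to be in seconds, which the document implies but does not state.

```go
package main

import (
	"encoding/json"
	"time"
)

// CacheEntry mirrors one value in ~/.config/yunshu/cache/{agent_name}.json.
type CacheEntry struct {
	CreatedAt time.Time       `json:"created_at"`
	TTL       int             `json:"ttl"` // assumed to be seconds
	Data      json.RawMessage `json:"data"`
	Raw       map[string]any  `json:"raw"`
}

// Expired reports whether the entry is older than its TTL at time now.
func (e CacheEntry) Expired(now time.Time) bool {
	return now.Sub(e.CreatedAt) > time.Duration(e.TTL)*time.Second
}

// ParseCacheFile decodes the whole cache file into a hash → entry map.
func ParseCacheFile(b []byte) (map[string]CacheEntry, error) {
	file := map[string]CacheEntry{}
	err := json.Unmarshal(b, &file)
	return file, err
}

// Lookup first deletes every expired entry (lazy cleanup on read),
// then returns the cached data for hash, if any is left.
func Lookup(file map[string]CacheEntry, hash string, now time.Time) (json.RawMessage, bool) {
	for k, e := range file {
		if e.Expired(now) {
			delete(file, k) // deleting during range is safe in Go
		}
	}
	e, ok := file[hash]
	if !ok {
		return nil, false
	}
	return e.Data, true
}
```

Passing `now` in as a parameter instead of calling `time.Now()` inside `Lookup` keeps the expiry logic easy to test; the caller would write the pruned map back to disk after a lookup that removed entries.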

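3.3 Sub-Agent Return Protocol

A sub-agent's reply has two sections, which the task tool parses:

---RESULT---
{structured JSON data (goes into the cache, not into the dialog context)}
---TEXT---
The statement the sub-agent wants to make to the user (goes into the dialog context)

---RESULT---: raw API data; task writes it to the cache file and does not pass it to dialog
---TEXT---: statement text the sub-agent has already composed; task returns it to the dialog LLM

Why two sections:

RESULT keeps the sub-agent's domain data clean and out of the main context
TEXT gives dialog "material" that it restates in its own voice, so the answer does not sound parroted
If the sub-agent has no new data this time (for example, it answered straight from a cache hit), it may send ---TEXT--- alone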
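
A parser for this two-section reply could be as small as the Go sketch below. The function name `SplitReply` is hypothetical, and two behaviors are assumptions rather than documented rules: a reply with no markers at all is treated as plain text, and a RESULT section that is not valid JSON is dropped.

```go
package main

import (
	"encoding/json"
	"strings"
)

const (
	resultMarker = "---RESULT---"
	textMarker   = "---TEXT---"
)

// SplitReply separates a sub-agent reply into the RESULT JSON (for the
// cache) and the TEXT (for the dialog context). result is nil when the
// reply has no RESULT section or the section is not valid JSON.
func SplitReply(reply string) (result json.RawMessage, text string) {
	body := reply
	if i := strings.Index(reply, textMarker); i >= 0 {
		// Everything before ---TEXT--- may hold RESULT; the rest is text.
		body, text = reply[:i], reply[i+len(textMarker):]
	} else if !strings.Contains(reply, resultMarker) {
		// Assumption: no markers at all means the whole reply is text.
		return nil, strings.TrimSpace(reply)
	}
	if j := strings.Index(body, resultMarker); j >= 0 {
		raw := strings.TrimSpace(body[j+len(resultMarker):])
		if json.Valid([]byte(raw)) {
			result = json.RawMessage(raw)
		}
	}
	return result, strings.TrimSpace(text)
}
```

With this split, task can write `result` to the cache file when it is non-nil and hand only `text` back to the dialog LLM, which matches the TEXT-only case after a cache hit.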

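3.4 Parameters Passed to the Sub-Agent

{
  "args": {
    "city": "Beijing",
    "forecast_type": "today",
    "units": "C"
  },
  "cache_data": {
    "temp": 25,
    "condition": "sunny"
  }
}

args holds the original arguments passed in by dialog
cache_data holds the cached data (on a hit); the sub-agent can answer from it directly and save an API call
Both are raw data, not processed text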

4. Memory System

4.1 Storage Location

~/.config/yunshu/memory.db  (or memory.json during the MVP stage)

4.2 Data Model

{
  "personality": "You are a witty, humorous Beijing gal with a cheeky streak",
  "user_profile": {
    "location": "Tongzhou, Beijing",
    "unit": "C",
    "allergies": ["pollen"],
    "interests": ["outdoors"],
    "mood_today": null
  },
  "agent_errors": {
    "weather": ["msn_api_500 at 2026-05-11T06:00:00"],
    "earthquake": []
  },
  "dialog_context": {
    "last_agent": "weather",
    "last_topic": "Beijing weather",
    "summary": "The user asked about the weather in Beijing"
  }
}

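4.3 Read/Write Rules

| Operation | Who | When |
|---|---|---|
| `memory.read` | dialog / sub-agents | When a profile is needed |
| `memory.write` | memory agent only | After extracting a profile from the dialog |
| `memory.write("dialog_context")` | dialog | After every answer |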
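
The write rules in this table can be enforced mechanically in Go code rather than left to the prompt. The sketch below is an assumption-laden illustration: `CanWrite` is a hypothetical function, agents are identified by their Frontmatter `name`, and the memory agent is assumed to be allowed to write every key.

```go
package main

import "strings"

// CanWrite reports whether agent may call memory.write on key.
// Rules taken from the table: only the memory agent writes in general,
// and dialog may write the dialog_context subtree.
func CanWrite(agent, key string) bool {
	switch agent {
	case "memory":
		return true // assumption: no key is off-limits to the memory agent
	case "dialog":
		return key == "dialog_context" || strings.HasPrefix(key, "dialog_context.")
	default:
		return false // every other sub-agent is read-only
	}
}
```

A memory.write handler could call this check before touching the store and return an error string to the LLM when it fails.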

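4.4 Memory Agent Workflow

User: "I live in Tongzhou, Beijing, and my pollen allergy has been bad lately"
  → dialog replies conversationally
  → dialog: task("memory", {action: "extract", text: "User says they live in Tongzhou and are allergic to pollen"})
  → memory: extract → memory.write("user_profile.location", "Tongzhou, Beijing")
                       memory.write("user_profile.allergies", ["pollen"])

User: "What's the weather like today?"
  → dialog: task("memory", {action: "read_context"}) → location is on file
  → dialog: task("weather", {city: "Tongzhou, Beijing"})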
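
The workflow writes with dotted keys such as "user_profile.location". A minimal Go sketch of such a write against the JSON model from 4.2 follows; `SetPath` is a hypothetical helper, and treating the dot as a path separator is an assumption based on the key names shown here.

```go
package main

import "strings"

// SetPath writes value at a dotted path such as "user_profile.location",
// creating intermediate objects when they do not exist yet.
func SetPath(root map[string]any, path string, value any) {
	keys := strings.Split(path, ".")
	cur := root
	for _, k := range keys[:len(keys)-1] {
		next, ok := cur[k].(map[string]any)
		if !ok {
			// Missing (or non-object) intermediate node: replace it.
			next = map[string]any{}
			cur[k] = next
		}
		cur = next
	}
	cur[keys[len(keys)-1]] = value
}
```

Two calls with paths under `user_profile` end up in the same nested object, so the stored document keeps the shape shown in the data model.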

5. File Structure

yunshu/
├── main.go                     # CLI entry point
├── types.go                    # Core types (AgentDef, Schema, ToolDef, Message…)
├── loader.go                   # .md parsing (Frontmatter + Body)
├── catalog.go                  # CatalogAgent generation + tools.yml output
├── registry.go                 # Agent registry (scan + group by type)
├── llm.go                      # LLM API wrapper
├── tool.go                     # Tool registration + 7 tool handlers
├── toolschema.go               # Generic + reflection-based Schema generation (NewTool[T], structToSchema)
├── runtime.go                  # RunAgent + RunSubAgent + cache + session
│
├── agents/
│   ├── dialog-agent.md         # type: main, the host
│   ├── weather-sub.md          # type: sub, weather ✅
│   ├── earthquake-sub.md       # type: sub, earthquake (reserved) ❌
│   ├── memory-sub.md           # type: sub, memory keeper ❌
│   └── narrator-sub.md         # type: sub, reporter (maturity stage) ❌
│
├── skills/
│   ├── msn-weather-api/SKILL.md
│   └── geocoding/SKILL.md
│
├── docs/
│   └── this directory
│
└── pkg/
    ├── mdprint/
    ├── style/
    └── termui/

User configuration directory

~/.config/yunshu/
├── config/
│   ├── config.yml                 # LLM configuration
│   ├── user.md                    # User profile (sections ## 画像 "Profile" / ## AI观察到 "AI observations")
│   └── soul.md                    # AI soul (user-editable)
├── session/
│   ├── session.json               # Conversation history (user ↔ dialog only)
│   └── dialog.yml                 # Conversation summary (overwritten every turn)
├── notes.md                       # Memo list
├── notes/                         # Standalone note files
├── log.yml                        # API error log
├── cache/
│   ├── weather.json
│   ├── earthquake.json
│   └── ...
├── data/
│   └── weather/                 # A sub-agent's own data directory
└── memory.db                    # Long-term memory database

6. Call Flow Examples

6.1 Single Sub-Agent Query

User: "What will the temperature be in Beijing tomorrow?"

  HOST (runtime.go):
    1. Load dialog-agent.md → system prompt
    2. Read session.json → restore context
    3. Call the LLM (session + system + tools)
    4. LLM returns tool_call: task("weather", {city: "Beijing", forecast_type: "tomorrow"})

  task tool (sub-agent call):
    1. Load the weather-sub.md Frontmatter
       → cache.keys: ["city", "forecast_type"], ttl: 7200
    2. Build the key → "city=Beijing&forecast_type=tomorrow" → SHA-256, first 12 hex characters
    3. Look up cache/weather.json → MISS
    4. Call the sub-agent LLM (RunSubAgent, an isolated loop)
       system = weather-sub.md
       user = {args: {city: "Beijing", forecast_type: "tomorrow"}, cache_data: null}
    5. Sub-agent tool chain:
       ├── skill("msn-weather-api") → API parameters
       ├── geocode("Beijing") → (39.9, 116.4)
       ├── http-get(URL)  → JSON
       └── returns:
           ---RESULT---
           {temp: {lo:18, hi:31}, condition: "sunny"}
           ---TEXT---
           ▪ Beijing weather tomorrow
           ...
    6. task extracts RESULT → writes cache/weather.json
    7. Return TEXT to HOST

  HOST (runtime.go):
    1. Tool result → appended to the conversation → LLM reasons again
    2. Following the prompt instruction "the sub-agent's output is the answer", the LLM outputs TEXT directly
    3. Append to session.json
    4. Display to the user

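6.2 Multi-Step Orchestration (new capability)

User: "Business trip to Beijing, leaving tomorrow, staying three days"

  HOST (runtime.go):
    1. Load dialog-agent.md → system prompt
    2. Read session → restore context
    3. Call the LLM (session + system + tools)

  ┌─ LLM round 1 ──────────────────────────────────────────────────┐
  │ LLM decides: check the weather first                           │
  │ tool_call: task("weather", {city: "Beijing",                   │
  │                             forecast_type: "tomorrow"})        │
  │ → sub-agent returns: Beijing tomorrow, 5°C, sunny              │
  │ → tool result appended to the conversation                     │
  └────────────────────────────────────────────────────────────────┘

  ┌─ LLM round 2 ──────────────────────────────────────────────────┐
  │ LLM sees the weather, decides to check trains                  │
  │ tool_call: task("train", {city: "Beijing", date: "tomorrow"})  │
  │ → sub-agent returns: G102 08:00 ¥680                           │
  │ → tool result appended to the conversation                     │
  └────────────────────────────────────────────────────────────────┘

  ┌─ LLM round 3 ──────────────────────────────────────────────────┐
  │ LLM sees weather + train, decides to check hotels              │
  │ tool_call: task("hotel", {city: "Beijing", nights: 3})         │
  │ → sub-agent returns: Jianguo Hotel ¥500/night                  │
  │ → tool result appended to the conversation                     │
  └────────────────────────────────────────────────────────────────┘

  ┌─ LLM round 4 ──────────────────────────────────────────────────┐
  │ LLM judges the information sufficient → returns text           │
  │ "Beijing is 5°C tomorrow, so bring a coat. G102 at 08:00,      │
  │  ¥680. Jianguo Hotel, 3 nights, ¥1500. Total about ¥2180."     │
  │ → append to session.json → show to the user                    │
  └────────────────────────────────────────────────────────────────┘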
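
The rounds above amount to a loop: ask the LLM, run the requested task, append the result, and stop once the LLM returns plain text. The Go sketch below shows that control flow only. All names (`Step`, `ToolCall`, `LLM`, `Task`, `RunAgent`) and the `maxCalls` limit are assumptions for illustration; the real RunAgent in runtime.go works on full chat messages and a real LLM client.

```go
package main

import "fmt"

// ToolCall is one task() request issued by the LLM.
type ToolCall struct {
	Agent string
	Args  map[string]any
}

// Step is one LLM turn: a tool call, or final text when Call is nil.
type Step struct {
	Call *ToolCall
	Text string
}

// LLM decides the next step from the conversation so far.
type LLM func(history []string) Step

// Task runs one sub-agent and returns its TEXT section.
type Task func(call ToolCall) (string, error)

// RunAgent loops until the LLM answers with text or the call budget is spent.
func RunAgent(llm LLM, task Task, userMsg string, maxCalls int) (string, error) {
	history := []string{"user: " + userMsg}
	for calls := 0; ; calls++ {
		step := llm(history)
		if step.Call == nil {
			return step.Text, nil // the LLM has enough information
		}
		if calls >= maxCalls {
			return "", fmt.Errorf("tool call limit %d reached", maxCalls)
		}
		out, err := task(*step.Call)
		if err != nil {
			// Feed the failure back so the LLM can react to it.
			out = "error: " + err.Error()
		}
		// The result stays in context as an ordinary tool response.
		history = append(history, fmt.Sprintf("tool(%s): %s", step.Call.Agent, out))
	}
}
```

Because each tool result is appended to `history` before the next LLM turn, the model sees the weather before choosing a train and sees both before choosing a hotel, which is the behavior the four rounds describe.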

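7. Implementation Phases: Current Status

Phase 1: Core architecture (complete, beyond the original plan)

| Step | File | Status | Notes |
|---|---|---|---|
| 1.1 | `types.go` | ✅ | `AgentDef` gains `Type string` and `Cache *CacheDef`; `Schema` replaces `ToolParameter` |
| 1.2 | `loader.go` | ✅ | Frontmatter parsing gains the `type` and `cache` fields |
| 1.3 | `registry.go` | ✅ | `ScanAgents()` scans and groups by type; agents with the same name override earlier ones |
| 1.4 | `tool.go` | ✅ | `task` / `memory.read` / `memory.write` + the 4 existing tools |
| 1.5 | `runtime.go` | ✅ | `RunSubAgent` + `RunAgent` + cache + session |
| 1.6 | `toolschema.go` | ✅ ✨ | New (unplanned): generic + reflection-based `NewTool[T]` replaces hand-written Schemas |
| 1.7 | `main.go` | ✅ | `ScanAgents().GetMain("dialog")` dynamically injects the sub-agent list |
| 1.8 | `agents/dialog-agent.md` | ✅ | Host, including multi-step orchestration instructions |
| 1.9 | `agents/weather-sub.md` | ✅ | Weather sub-agent, Markdown output + everyday advice |
| 1.10 | - | ✅ ✨ | Multi-step orchestration (unplanned): `capturedOutput` removed; the main agent calls several sub-agents in a row |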

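Unplanned additions

Generic + reflection-based tool registration (toolschema.go):
- NewTool[T any]() generic constructor that derives the JSON Schema automatically via reflection
- Input struct + struct tags → tool registration with zero boilerplate
- Inside the handler, arguments are type-safe struct fields; no args["x"].(string) needed
Multi-step orchestration (runtime.go rework):
- The capturedOutput overwrite mechanism has been removed
- Sub-agent results stay in the conversation context as ordinary tool responses
- The LLM can call task() repeatedly until it has gathered enough information to answer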
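
The generic registration idea can be illustrated with the Go sketch below. It is not the code in toolschema.go: the signatures of `NewTool` and `structToSchema`, the `Schema` and `Tool` types, and the `desc` / `required` struct tags are all assumptions, and the sketch assumes `T` is a flat struct whose `json` tags carry no options.

```go
package main

import (
	"encoding/json"
	"reflect"
)

// Schema is a small subset of JSON Schema, enough for flat argument structs.
type Schema struct {
	Type        string            `json:"type"`
	Description string            `json:"description,omitempty"`
	Properties  map[string]Schema `json:"properties,omitempty"`
	Required    []string          `json:"required,omitempty"`
}

// Tool pairs a derived schema with a handler that takes raw JSON arguments.
type Tool struct {
	Name   string
	Params Schema
	Call   func(rawArgs []byte) (string, error)
}

// structToSchema derives an object schema from a struct type's fields and tags.
func structToSchema(t reflect.Type) Schema {
	s := Schema{Type: "object", Properties: map[string]Schema{}}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := f.Tag.Get("json")
		if name == "" {
			name = f.Name
		}
		typ := "string" // fallback for kinds not handled below
		switch f.Type.Kind() {
		case reflect.Int, reflect.Int64:
			typ = "integer"
		case reflect.Float64:
			typ = "number"
		case reflect.Bool:
			typ = "boolean"
		}
		s.Properties[name] = Schema{Type: typ, Description: f.Tag.Get("desc")}
		if f.Tag.Get("required") == "true" {
			s.Required = append(s.Required, name)
		}
	}
	return s
}

// NewTool registers a typed handler; T must be a struct type.
func NewTool[T any](name string, handler func(T) (string, error)) Tool {
	var zero T
	return Tool{
		Name:   name,
		Params: structToSchema(reflect.TypeOf(zero)),
		Call: func(raw []byte) (string, error) {
			var in T
			if err := json.Unmarshal(raw, &in); err != nil {
				return "", err
			}
			return handler(in) // the handler sees typed fields, not map lookups
		},
	}
}

// GeocodeArgs is a hypothetical argument struct for the geocode tool.
type GeocodeArgs struct {
	City string `json:"city" desc:"city name" required:"true"`
}
```

A tool would then be declared as `NewTool("geocode", func(a GeocodeArgs) (string, error) { ... })`, with `T` inferred from the handler, so the schema and the decoding logic cannot drift apart.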

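Phase 2: Memory system (not started)

| Task | Description | Status |
|---|---|---|
| 2.1 memory-sub.md | Memory-keeper agent (extracts the profile from the dialog) | ❌ |
| 2.2 Memory database | Structured storage (profile, preferences, error records) | ❌ |
| 2.3 Automatic profile extraction | The memory agent periodically extracts useful information from the dialog | ❌ |

Phase 3: Extensions (not started)

| Task | Description | Status |
|---|---|---|
| 3.1 earthquake-sub | Earthquake information lookup | ❌ |
| 3.2 train-sub | Train ticket lookup | ❌ |
| 3.3 hotel-sub | Accommodation lookup | ❌ |
| 3.4 narrator-sub | Personalized answer generation (maturity stage) | ❌ |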

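8. Comparison with PicoClaw

| Dimension | PicoClaw | Yunshu meeting-room mode |
|---|---|---|
| Entry point | Single agent (the user talks to it directly) | Dialog agent (sole entry point) backed by a set of sub-agents |
| Context | All turns + system prompt mixed together | session stores only user↔dialog; sub-agent context is discarded after use |
| Knowledge | Preloaded or long prompts | Skills loaded on demand |
| Tools | All tools mixed together | Filtered by role; dialog has only task + memory |
| Memory | None | Shared blackboard; the memory agent handles writes |
| Extension | Change code or change the prompt | Add one .md file |
| Failure isolation | A bad tool_call can contaminate everything | Sub-agents are independent; a failure stays within one |
| User customization | Not possible | Drop a .md file into `~/.config/yunshu/agents/` |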

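9. Design Principles

Keep the host extremely thin: persona + dispatch rules only, no domain knowledge
Sub-agents stay unaware: they do not know the cache exists and do not manage their own session; they only answer the current question
Mechanical work is not given to the LLM: cache-key assembly and file I/O are Go code, and the LLM takes no part in them
Data isolation: each sub-agent's cache file and data directory are independent of the others
Shared memory: a blackboard mechanism; every agent can read, only the memory agent can write
One entry point: the user always talks to dialog-agent alone and never notices the sub-agents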
