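- **Core features**
  - Merged dual memory system: picoclaw MEMORY.md + hxclaw session summaries
  - Independent context system that does not depend on the picoclaw session
  - Vector retrieval: SiliconFlow (硅基流动) BGE-M3 API
  - Triple detection: keyword / vector similarity / command
- **Database**
  - libSQL (TursoDB) storage
  - sessions + chats table design
  - Vectors are stored with binary encoding
- **Query scenarios**
  - RecallHistory: list all session summaries
  - RecallTopic: vector retrieval by topic
  - RecallSession: details of a given session
  - RecallWithinSession: retrieval within a session
- **Export**
  - MongoDB style: `~/.config/hxclaw/export-data.json`
  - chats nested under sessions
  - Incremental export; chats of the same session accumulate
- **UI improvements**
  - Combined status display (elapsed time · status · message count)
  - Color scheme: gold icon + dark green / dark red status
- **Configuration options**
  - memory.recall: keywords, auto_recall, similarity_threshold
  - memory.vector: max_search_results
  - memory.auto_export
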
# hxclaw Memory System Architecture Diagrams

## 1. Overall Data Flow

```mermaid
flowchart TB
    subgraph sgInput ["User input"]
        A[User input]
        A1[Ordinary chat]
        A2[History query]
        A3["/recall command"]
    end

    subgraph sgContext ["Context injection"]
        B[GetContextPrompt]
        B0[Read picoclaw MEMORY.md]
        B1[Fetch current session summary]
        B2[Detect query intent]
        B3[Call Recall on demand]
    end

    subgraph sgAI ["AI processing"]
        C[ProcessDirect]
        C1[Tool calls]
        C2[Multi-turn dialogue]
    end

    subgraph sgSave ["Save flow"]
        D[SaveChat]
        D1[INSERT chat]
        D2[UPDATE chat summary + embedding]
        D3[UPDATE session summary + embedding]
    end

    subgraph sgDB ["Database"]
        E[sessions table]
        F[chats table]
    end

    subgraph sgVector ["Vector service"]
        G[SiliconFlow API]
    end

    subgraph sgExternal ["External memory"]
        H[picoclaw MEMORY.md]
    end

    A --> B
    B0 --> B
    H --> B0
    B --> C
    C --> D
    D --> F
    D --> E
    F --> G
    E --> G

    A1 -->|ordinary chat| B1
    A2 -->|keyword detected| B2
    A3 -->|forced trigger| B3
```

---

## 2. Conversation Flow (Default Mode)

```mermaid
sequenceDiagram
    participant U as User
    participant M as main.go
    participant CP as GetContextPrompt
    participant PicoMem as picoclaw MEMORY.md
    participant AI as ProcessDirect
    participant SC as SaveChat
    participant DB as libSQL
    participant VS as Vector service

    U->>M: user input
    M->>CP: GetContextPrompt(userInput)
    CP->>PicoMem: read long-term memory
    PicoMem-->>CP: long-term memory content
    CP->>DB: fetch currentSession
    DB-->>CP: session.Summary

    Note over CP: merge long-term memory + session summary

    CP-->>M: return context
    M->>AI: ProcessDirect(context + input)
    AI-->>M: AI reply
    M-->>U: show AI reply

    M->>SC: SaveChat(userInput, aiReply)
    SC->>DB: INSERT chat
    SC->>DB: UPDATE session

    Note over SC: update summary and embedding

    SC->>VS: generate embedding asynchronously
    VS-->>SC: embedding

    SC->>DB: save to chats.summary_embedding
    SC->>DB: save to sessions.summary_embedding
```
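
The order of calls in the sequence above can be sketched in Go. This is a minimal sketch under stated assumptions, not hxclaw's actual code: the `Turn` type and its `Handle` method are hypothetical, and the three function fields merely stand in for `GetContextPrompt`, `ProcessDirect`, and `SaveChat`, whose real signatures are not shown in this document.

```go
package main

import "fmt"

// Turn bundles the three steps of one conversation turn.
// The field names follow the diagram; their signatures are assumptions.
type Turn struct {
	GetContextPrompt func(userInput string) (string, error)
	ProcessDirect    func(prompt string) (string, error)
	SaveChat         func(userInput, aiReply string) error
}

// Handle builds the context, calls the model with context + input,
// and then persists the exchange. If saving fails, the reply is still
// returned together with the error so the user does not lose the answer.
func (t Turn) Handle(userInput string) (string, error) {
	context, err := t.GetContextPrompt(userInput)
	if err != nil {
		return "", fmt.Errorf("build context: %w", err)
	}
	reply, err := t.ProcessDirect(context + "\n\n" + userInput)
	if err != nil {
		return "", fmt.Errorf("process input: %w", err)
	}
	if err := t.SaveChat(userInput, reply); err != nil {
		return reply, fmt.Errorf("save chat: %w", err)
	}
	return reply, nil
}
```

Passing the steps as function values keeps the sketch testable without a database or a model; the real program presumably wires these to the functions in `internal/memory`.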

---

## 3. Four Query Scenarios

```mermaid
flowchart LR
    subgraph sgScenario ["Query scenarios"]
        Q1[Scenario 1: history summaries]
        Q2[Scenario 2: topic retrieval]
        Q3[Scenario 3: session details]
        Q4[Scenario 4: in-session retrieval]
    end

    subgraph sgTrigger ["Trigger phrases"]
        T1["What did we talk about before?"]
        T2["Did we discuss xxx?"]
        T3["What else was said that time?"]
        T4["What about xxx?"]
    end

    subgraph sgLogic ["Query logic"]
        L1[Query sessions table]
        L2["chats vector search<br/>GROUP BY session_id"]
        L3["Query sessions table<br/>WHERE id = ?"]
        L4["chats vector search<br/>WHERE session_id = ?"]
    end

    subgraph sgResult ["Result"]
        R1[All session summaries]
        R2["Grouped by session<br/>top-5 summaries concatenated"]
        R3[Summary of the given session]
        R4[Related summaries within the same session]
    end

    T1 --> Q1
    T2 --> Q2
    T3 --> Q3
    T4 --> Q4

    Q1 --> L1
    Q2 --> L2
    Q3 --> L3
    Q4 --> L4

    L1 --> R1
    L2 --> R2
    L3 --> R3
    L4 --> R4
```
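
Scenario 2 (vector search over chats, grouped by session, top-5 summaries concatenated) can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: the `ChatHit` type, the `RecallTopic` signature, the use of cosine similarity, and the linear scan that stands in for whatever query the project actually runs against libSQL.

```go
package main

import (
	"math"
	"sort"
	"strings"
)

// ChatHit is a hypothetical in-memory view of one chats row.
type ChatHit struct {
	SessionID int
	Summary   string
	Embedding []float32
}

// cosine returns the cosine similarity of a and b, or 0 when the
// lengths differ or either vector has zero norm.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// RecallTopic keeps chats whose similarity to query is at least threshold,
// groups them by session, and joins up to topK summaries per session,
// most similar first.
func RecallTopic(query []float32, chats []ChatHit, threshold float64, topK int) map[int]string {
	type scored struct {
		chat  ChatHit
		score float64
	}
	var hits []scored
	for _, c := range chats {
		if s := cosine(query, c.Embedding); s >= threshold {
			hits = append(hits, scored{c, s})
		}
	}
	// Stable sort so that equal scores keep their original order.
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].score > hits[j].score })

	grouped := make(map[int][]string)
	for _, h := range hits {
		id := h.chat.SessionID
		if len(grouped[id]) < topK {
			grouped[id] = append(grouped[id], h.chat.Summary)
		}
	}
	out := make(map[int]string, len(grouped))
	for id, summaries := range grouped {
		out[id] = strings.Join(summaries, "\n")
	}
	return out
}
```

Scenario 4 would be the same scan restricted to a single `SessionID`, which corresponds to the `WHERE session_id = ?` branch in the diagram.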

---

## 4. Triple Detection Mechanism

```mermaid
flowchart TB
    I[User input]

    subgraph sgDetect ["Detection layer"]
        D1["/recall command"]
        D2[Keyword match]
        D3[Vector similarity]
    end

    subgraph sgConfig ["Configuration"]
        C1[keywords]
        C2[auto_recall]
        C3[similarity_threshold]
    end

    I --> D1
    I --> D2
    I --> D3

    D2 --> C1
    D3 --> C3

    C1 -->|match found| R[Force Recall]
    D3 --> C2

    C2 -->|enabled| C3
    C3 -->|threshold check| R

    D1 -->|trigger| R
```
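
The three detectors can be sketched as one function that checks them in order. This is a hedged sketch rather than the code in `internal/memory/save.go`: the `RecallConfig` field names, the `Trigger` type, the `DetectRecall` function, and the choice of `>=` for the threshold comparison are all assumptions. The caller is assumed to have already computed the best similarity between the input and the stored summaries.

```go
package main

import "strings"

// RecallConfig mirrors the memory.recall options; the Go field names are assumptions.
type RecallConfig struct {
	Keywords            []string
	AutoRecall          bool
	SimilarityThreshold float64
}

// Trigger names the detector that fired.
type Trigger string

const (
	TriggerNone       Trigger = ""
	TriggerCommand    Trigger = "command"
	TriggerKeyword    Trigger = "keyword"
	TriggerSimilarity Trigger = "similarity"
)

// DetectRecall checks the /recall command first, then the keyword list,
// then the vector similarity (only when auto_recall is enabled).
func DetectRecall(input string, bestSimilarity float64, cfg RecallConfig) Trigger {
	trimmed := strings.TrimSpace(input)
	if trimmed == "/recall" || strings.HasPrefix(trimmed, "/recall ") {
		return TriggerCommand
	}
	for _, kw := range cfg.Keywords {
		if kw != "" && strings.Contains(trimmed, kw) {
			return TriggerKeyword
		}
	}
	if cfg.AutoRecall && bestSimilarity >= cfg.SimilarityThreshold {
		return TriggerSimilarity
	}
	return TriggerNone
}
```

Checking the cheap detectors first means an embedding request is only needed when neither the command nor a keyword matched.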

---

## 5. Database Schema

```mermaid
erDiagram
    sessions {
        int id PK
        string uuid
        text summary
        blob summary_embedding
        string chat_ids
        int created_at
        int updated_at
    }

    chats {
        int id PK
        int session_id FK
        text user_input
        text ai_replies
        text summary
        blob summary_embedding
        int created_at
        int updated_at
    }

    sessions ||--o{ chats : "has many"
```
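
The `summary_embedding` columns are blobs. One common way to store a float vector in a blob is to pack each value as 4 little-endian bytes; the sketch below assumes that layout, which this document does not confirm. The `Session` and `Chat` structs mirror the diagram, but their Go field names and types, as well as `EncodeEmbedding` and `DecodeEmbedding`, are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"math"
)

// Session mirrors the sessions table; field names and types are assumptions.
type Session struct {
	ID               int64
	UUID             string
	Summary          string
	SummaryEmbedding []byte // blob column
	ChatIDs          string
	CreatedAt        int64
	UpdatedAt        int64
}

// Chat mirrors the chats table; field names and types are assumptions.
type Chat struct {
	ID               int64
	SessionID        int64
	UserInput        string
	AIReplies        string
	Summary          string
	SummaryEmbedding []byte // blob column
	CreatedAt        int64
	UpdatedAt        int64
}

// EncodeEmbedding packs each float32 as 4 little-endian bytes.
func EncodeEmbedding(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return buf
}

// DecodeEmbedding reverses EncodeEmbedding and rejects truncated blobs.
func DecodeEmbedding(b []byte) ([]float32, error) {
	if len(b)%4 != 0 {
		return nil, errors.New("embedding blob length is not a multiple of 4")
	}
	v := make([]float32, len(b)/4)
	for i := range v {
		v[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return v, nil
}
```

With this layout a vector of n dimensions occupies exactly 4n bytes, and the round trip is lossless because the raw IEEE 754 bits are stored.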

---

## 6. Context Evolution

```mermaid
flowchart LR
    subgraph sgTimeline ["Timeline"]
        T1[Start]
        T2[Turn 1]
        T3[Turn 2]
        T4[Turn N]
        T5[Turn 1000]
    end

    subgraph sgSummary ["Context state"]
        S1[Empty]
        S2[Summary 1]
        S3[Summary 1 + summary 2]
        SN[Summary N]
        S1000[Summary 1000]
    end

    subgraph sgActual ["Actual state"]
        A1["context = empty + long-term memory"]
        A2["context = summary 1 + long-term memory"]
        A3["context = summary 2 + long-term memory"]
        AN["context = summary N + long-term memory"]
        A1000["context = summary 1000 + long-term memory"]
    end

    T1 --> S1 --> A1
    T2 --> S2 --> A2
    T3 --> S3 -->|overwrite| A3
    T4 --> SN -->|overwrite| AN
    T5 --> S1000 -->|overwrite| A1000
```

Because each new summary overwrites the previous one, the context always holds exactly one session summary plus the long-term memory, no matter how many turns have passed.

**Note**: long-term memory comes from `~/.picoclaw/workspace/memory/MEMORY.md`. It is not affected by sessions and persists across them.

---

## 7. Complete Flow

```mermaid
flowchart TB
    subgraph sgInput ["Input layer"]
        INPUT[User input]
        CMD["/recall"]
        KW[Keywords]
    end

    subgraph sgDetect ["Detection layer"]
        CHECK{Detect query intent}
        RECALL{Recall triggered?}
    end

    subgraph sgContext ["Context construction"]
        CONTEXT[Context]
        SESSION_SUM[Current session summary]
        RECALL_RES[Recall result]
    end

    subgraph sgAI ["AI layer"]
        AI[ProcessDirect]
        RESP[AI reply]
    end

    subgraph sgSave ["Save layer"]
        SAVE[SaveChat]
        INSERT_CHAT[INSERT chat]
        UPDATE_CHAT[UPDATE chat]
        UPDATE_SESSION[UPDATE session]
    end

    subgraph sgDB ["Database"]
        DB[(libSQL)]
        SESSIONS[sessions table]
        CHATS[chats table]
    end

    subgraph sgVector ["Vector service"]
        VS[Vector service]
        EMB[Generate Embedding]
    end

    INPUT --> CHECK
    CMD --> CHECK
    KW --> CHECK

    CHECK -->|ordinary chat| RECALL
    CHECK -->|is a query| RECALL_RES

    RECALL -->|no| SESSION_SUM
    RECALL_RES --> CONTEXT

    SESSION_SUM --> CONTEXT
    CONTEXT --> AI
    INPUT --> AI

    AI --> RESP
    RESP --> SAVE

    SAVE --> INSERT_CHAT
    INSERT_CHAT --> DB
    DB --> CHATS

    SAVE --> UPDATE_CHAT
    UPDATE_CHAT --> DB

    SAVE --> UPDATE_SESSION
    UPDATE_SESSION --> DB

    CHATS --> EMB
    SESSIONS --> EMB

    EMB --> VS
    VS --> CHATS
    VS --> SESSIONS
```

---

## 8. Key Files and Responsibilities

| Module | File | Responsibility |
|------|------|------|
| Main flow | `main.go` | Calls GetContextPrompt and SaveChat |
| Config | `internal/config.go` | RecallConfig, VectorConfig |
| Database | `internal/memory/db.go` | CRUD operations |
| Models | `internal/memory/model.go` | Session and Chat structs |
| Vector | `internal/memory/vector.go` | SiliconFlow API calls |
| Save | `internal/memory/save.go` | SaveChat, triple detection, long-term memory reading |
| Query | `internal/memory/skill.go` | The 4 Recall functions |
| Export | `internal/memory/export.go` | JSON export |
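
---

## 9. Merging the Two Memory Systems

### Memory Sources

| Type | Source | Persistence |
|------|------|--------|
| **Long-term memory** | `~/.picoclaw/workspace/memory/MEMORY.md` | Cross-session; updated automatically by the AI |
| **Session summary** | `~/.config/hxclaw/hxclaw.db`, sessions table | Current session; updated dynamically |
| **Chat details** | `~/.config/hxclaw/hxclaw.db`, chats table | All history; supports vector retrieval |

### Context Injection Format

The section markers are emitted in Chinese: 长期记忆 is "long-term memory", 当前会话摘要 is "current session summary", and 用户新问题 is "new user question".

```markdown
=== 长期记忆 ===
(contents of picoclaw MEMORY.md)

=== 当前会话摘要 ===
(summary from the hxclaw sessions table)

用户新问题: xxx
```

### How It Works

When building the context, `GetContextPrompt()`:

1. Reads picoclaw's `MEMORY.md` file
2. Parses it and extracts the meaningful content (skipping Markdown headings)
3. Places that content at the top of the context
4. Appends the session summary
5. Appends the recall results, if there are any
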
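
The five steps above can be sketched as a pure function. This is an illustration under stated assumptions, not the real `GetContextPrompt()`: the `BuildContextPrompt` and `stripHeadings` names are hypothetical, the file is assumed to have been read into a string already, "skipping Markdown headings" is interpreted as dropping lines that start with `#`, and the `=== Recall ===` marker is invented here because the document does not say how recall results are labeled.

```go
package main

import "strings"

// stripHeadings drops blank lines and Markdown heading lines (starting with '#').
func stripHeadings(md string) string {
	var kept []string
	for _, line := range strings.Split(md, "\n") {
		t := strings.TrimSpace(line)
		if t == "" || strings.HasPrefix(t, "#") {
			continue
		}
		kept = append(kept, t)
	}
	return strings.Join(kept, "\n")
}

// BuildContextPrompt assembles long-term memory, the session summary, and
// optional recall results, in that order, followed by the user's question.
// Empty sections are omitted entirely.
func BuildContextPrompt(memoryMD, sessionSummary, recallResult, userInput string) string {
	var b strings.Builder
	if mem := stripHeadings(memoryMD); mem != "" {
		b.WriteString("=== 长期记忆 ===\n" + mem + "\n\n")
	}
	if sessionSummary != "" {
		b.WriteString("=== 当前会话摘要 ===\n" + sessionSummary + "\n\n")
	}
	if recallResult != "" {
		// Hypothetical marker: the source does not specify one for recall results.
		b.WriteString("=== Recall ===\n" + recallResult + "\n\n")
	}
	b.WriteString("用户新问题: " + userInput)
	return b.String()
}
```

Keeping the builder free of file and database access makes the ordering rules easy to test; the caller would supply the contents of `MEMORY.md` and the current session summary.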

### Data Flow

```
picoclaw MEMORY.md ──┐
                     ├──→ GetContextPrompt ──→ AI context
hxclaw sessions ─────┘
```

---

## 10. Configuration Options

```yaml
memory:
  recall:
    # Chinese trigger words: before, chatted about, remember, look up, once, discussed, mentioned
    keywords: ["之前", "聊过", "记得", "找找", "曾经", "谈论过", "提过"]
    auto_recall: true            # automatic similarity detection
    similarity_threshold: 0.7    # similarity threshold
    max_results: 5               # maximum number of recall results

  vector:
    max_search_results: 10       # maximum number of vector search results
```
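
On the Go side, these options could map to nested structs. The sketch below is an assumption: the document only names `RecallConfig` and `VectorConfig`, so the `MemoryConfig` wrapper, all field names, the `ExampleMemoryConfig` helper, and the validation rules are hypothetical. YAML parsing is left out to keep the sketch within the standard library.

```go
package main

import "fmt"

// RecallConfig corresponds to memory.recall; field names are assumptions.
type RecallConfig struct {
	Keywords            []string
	AutoRecall          bool
	SimilarityThreshold float64
	MaxResults          int
}

// VectorConfig corresponds to memory.vector; field names are assumptions.
type VectorConfig struct {
	MaxSearchResults int
}

// MemoryConfig is a hypothetical wrapper for the memory section.
type MemoryConfig struct {
	Recall RecallConfig
	Vector VectorConfig
}

// ExampleMemoryConfig returns the values shown in the YAML sample.
func ExampleMemoryConfig() MemoryConfig {
	return MemoryConfig{
		Recall: RecallConfig{
			Keywords:            []string{"之前", "聊过", "记得", "找找", "曾经", "谈论过", "提过"},
			AutoRecall:          true,
			SimilarityThreshold: 0.7,
			MaxResults:          5,
		},
		Vector: VectorConfig{MaxSearchResults: 10},
	}
}

// Validate applies sanity checks; the accepted ranges are assumptions.
func (c MemoryConfig) Validate() error {
	if t := c.Recall.SimilarityThreshold; t < 0 || t > 1 {
		return fmt.Errorf("similarity_threshold must be in [0, 1], got %v", t)
	}
	if c.Recall.MaxResults <= 0 {
		return fmt.Errorf("max_results must be positive, got %d", c.Recall.MaxResults)
	}
	if c.Vector.MaxSearchResults <= 0 {
		return fmt.Errorf("max_search_results must be positive, got %d", c.Vector.MaxSearchResults)
	}
	return nil
}
```

Validating once at startup would surface a mistyped threshold before it silently disables or over-triggers automatic recall.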

---

## 11. MongoDB-Style Export

### Export File

- Path: `~/.config/hxclaw/export-data.json`
- Mode: exported automatically and incrementally on every exit

### JSON Structure

```json
{
  "version": 1,
  "exported_at": "2026-04-27T06:12:22+08:00",
  "sessions": [
    {
      "id": 1,
      "uuid": "session-uuid",
      "summary": "session summary...",
      "chat_ids": [1, 2, 3],
      "created_at": 1740000000,
      "updated_at": 1740000000,
      "chats": [
        {
          "id": 1,
          "session_id": 1,
          "user_input": "user input",
          "ai_replies": ["AI reply 1", "AI reply 2"],
          "summary": "summary",
          "created_at": 1740000000,
          "updated_at": 1740000000
        }
      ]
    }
  ]
}
```

### Design Characteristics

| Feature | Description |
|------|------|
| Nested structure | chats are nested under sessions, like a MongoDB document |
| Incremental export | chats of the same session accumulate; the session is not created twice |
| UUID matching | the UUID decides whether a session is created or updated |
| Versioning | the version field allows the format to evolve |

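
The incremental, UUID-matched export can be sketched as a merge over the JSON structure. The struct tags follow the JSON keys shown in this section; everything else is an assumption, including the type names, the `MergeSession` function, and the rule that chats are deduplicated by `id`.

```go
package main

import "encoding/json"

// ExportChat, ExportSession and ExportFile mirror the JSON keys of the
// export file; the Go type names are hypothetical.
type ExportChat struct {
	ID        int64    `json:"id"`
	SessionID int64    `json:"session_id"`
	UserInput string   `json:"user_input"`
	AIReplies []string `json:"ai_replies"`
	Summary   string   `json:"summary"`
	CreatedAt int64    `json:"created_at"`
	UpdatedAt int64    `json:"updated_at"`
}

type ExportSession struct {
	ID        int64        `json:"id"`
	UUID      string       `json:"uuid"`
	Summary   string       `json:"summary"`
	ChatIDs   []int64      `json:"chat_ids"`
	CreatedAt int64        `json:"created_at"`
	UpdatedAt int64        `json:"updated_at"`
	Chats     []ExportChat `json:"chats"`
}

type ExportFile struct {
	Version    int             `json:"version"`
	ExportedAt string          `json:"exported_at"`
	Sessions   []ExportSession `json:"sessions"`
}

// MergeSession updates the session with the same UUID, appending only chats
// whose ID is not present yet, or appends s as a new session.
func MergeSession(f *ExportFile, s ExportSession) {
	for i := range f.Sessions {
		cur := &f.Sessions[i]
		if cur.UUID != s.UUID {
			continue
		}
		seen := make(map[int64]bool, len(cur.Chats))
		for _, c := range cur.Chats {
			seen[c.ID] = true
		}
		for _, c := range s.Chats {
			if !seen[c.ID] {
				cur.Chats = append(cur.Chats, c)
				cur.ChatIDs = append(cur.ChatIDs, c.ID)
				seen[c.ID] = true
			}
		}
		cur.Summary, cur.UpdatedAt = s.Summary, s.UpdatedAt
		return
	}
	f.Sessions = append(f.Sessions, s)
}

// Marshal renders the export file as indented JSON.
func (f ExportFile) Marshal() ([]byte, error) {
	return json.MarshalIndent(f, "", "  ")
}
```

On exit, a program following this sketch would load the existing file, call `MergeSession` for the current session, and write the result back, so repeated exports of the same session never duplicate it.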