docs: update for ReAct tool use, web_search, and build-time config

Update READMEs with config file setup (Option A/B), tool section,
set_search_key command, and touch-before-build note. Update
ARCHITECTURE.md with ReAct data flow, tools module map, non-streaming
API protocol, and config priority. Mark tool use items done in TODO.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
crispyberry
2026-02-07 00:37:49 +08:00
parent 0e1da79b74
commit e04254fa94
4 changed files with 271 additions and 146 deletions


@@ -62,13 +62,45 @@ git clone https://github.com/memovai/mimiclaw.git
cd mimiclaw cd mimiclaw
idf.py set-target esp32s3 idf.py set-target esp32s3
```
### Configure
**Option A: Config file (recommended)** — fill in once, baked into firmware at build time:
```bash
cp main/mimi_secrets.h.example main/mimi_secrets.h
```
Edit `main/mimi_secrets.h`:
```c
#define MIMI_SECRET_WIFI_SSID "YourWiFiName"
#define MIMI_SECRET_WIFI_PASS "YourWiFiPassword"
#define MIMI_SECRET_TG_TOKEN "123456:ABC-DEF1234ghIkl-zyx57W2v1u123ew11"
#define MIMI_SECRET_API_KEY "sk-ant-api03-xxxxx"
#define MIMI_SECRET_SEARCH_KEY "" // optional: Brave Search API key
#define MIMI_SECRET_PROXY_HOST "" // optional: e.g. "10.0.0.1"
#define MIMI_SECRET_PROXY_PORT "" // optional: e.g. "7897"
```
Then build and flash:
```bash
idf.py build idf.py build
idf.py -p /dev/ttyACM0 flash monitor idf.py -p /dev/ttyACM0 flash monitor
``` ```
### Set Up Config file values have the **highest priority** — they override anything set via CLI.
After flashing, a serial console appears. Type these commands: > **Note:** After editing `mimi_secrets.h`, run `touch main/mimi_config.h` before `idf.py build` to force recompilation.
**Option B: Serial CLI** — configure at runtime after flashing:
```bash
idf.py build
idf.py -p /dev/ttyACM0 flash monitor
```
``` ```
mimi> wifi_set YourWiFiName YourWiFiPassword mimi> wifi_set YourWiFiName YourWiFiPassword
@@ -77,15 +109,19 @@ mimi> set_api_key sk-ant-api03-xxxxx
mimi> restart mimi> restart
``` ```
That's it. After restart, find your bot on Telegram and start chatting. CLI values are stored in NVS (persistent flash) and used when no config file value is set.
### More Commands ### CLI Commands
``` ```
mimi> wifi_set <ssid> <pass> # set WiFi credentials
mimi> wifi_status # am I connected? mimi> wifi_status # am I connected?
mimi> set_tg_token <token> # set Telegram bot token
mimi> set_api_key <key> # set Anthropic API key
mimi> set_model claude-opus-4-6 # use a different model mimi> set_model claude-opus-4-6 # use a different model
mimi> set_proxy 10.0.0.1 7897 # optional: route through HTTP proxy mimi> set_search_key <key> # set Brave Search API key (for web_search tool)
mimi> clear_proxy # optional: remove proxy, connect directly mimi> set_proxy 10.0.0.1 7897 # route through HTTP proxy
mimi> clear_proxy # remove proxy, connect directly
mimi> memory_read # see what the bot remembers mimi> memory_read # see what the bot remembers
mimi> heap_info # how much RAM is free? mimi> heap_info # how much RAM is free?
mimi> session_list # list all chat sessions mimi> session_list # list all chat sessions
@@ -105,12 +141,23 @@ MimiClaw stores everything as plain text files you can read and edit:
| `2026-02-05.md` | Daily notes — what happened today | | `2026-02-05.md` | Daily notes — what happened today |
| `tg_12345.jsonl` | Chat history — your conversation with the bot | | `tg_12345.jsonl` | Chat history — your conversation with the bot |
## Tools
MimiClaw uses Anthropic's tool use protocol — Claude can call tools during a conversation and loop until the task is done (ReAct pattern).
| Tool | Description |
|------|-------------|
| `web_search` | Search the web via Brave Search API for current information |
To enable web search, set a [Brave Search API key](https://brave.com/search/api/) in your config file or via CLI (`set_search_key`).
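The register-and-dispatch idea behind the tool table can be sketched in C as below. This is a minimal illustration, not the project's actual `tool_registry.c`: the names `tool_def_t`, `tool_register`, `tool_dispatch`, and `web_search_stub` are hypothetical, and the stub returns canned text instead of calling the Brave Search API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handler signature: receives the tool's JSON input and
 * writes a result string into out. Returns 0 on success. */
typedef int (*tool_fn_t)(const char *input_json, char *out, size_t out_size);

typedef struct {
    const char *name;         /* name Claude uses in a tool_use block */
    const char *description;  /* sent to the model in the tools array */
    tool_fn_t   execute;
} tool_def_t;

#define MAX_TOOLS 8           /* assumed limit for this sketch */
static tool_def_t s_tools[MAX_TOOLS];
static size_t     s_tool_count;

/* Adds a tool to the fixed-size table. Returns 0 on success, -1 if the
 * definition is incomplete or the table is full. */
int tool_register(const tool_def_t *def) {
    if (def == NULL || def->name == NULL || def->execute == NULL) return -1;
    if (s_tool_count >= MAX_TOOLS) return -1;
    s_tools[s_tool_count++] = *def;
    return 0;
}

/* Looks a tool up by name and runs it. Unknown names produce an error
 * string in out so the caller can still send a tool_result back. */
int tool_dispatch(const char *name, const char *input_json,
                  char *out, size_t out_size) {
    assert(out != NULL && out_size > 0);
    for (size_t i = 0; name != NULL && i < s_tool_count; i++) {
        if (strcmp(s_tools[i].name, name) == 0) {
            return s_tools[i].execute(input_json, out, out_size);
        }
    }
    snprintf(out, out_size, "unknown tool: %s", name ? name : "(null)");
    return -1;
}

/* Stand-in for a real web_search handler: echoes the input instead of
 * performing an HTTPS request. */
int web_search_stub(const char *input_json, char *out, size_t out_size) {
    snprintf(out, out_size, "stub results for %s", input_json);
    return 0;
}
```

A fixed-size table avoids heap allocation, which tends to suit a microcontroller, but the real registry may be organized differently.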
## Also Included ## Also Included
- **WebSocket gateway** on port 18789 — connect from your LAN with any WebSocket client - **WebSocket gateway** on port 18789 — connect from your LAN with any WebSocket client
- **OTA updates** — flash new firmware over WiFi, no USB needed - **OTA updates** — flash new firmware over WiFi, no USB needed
- **Dual-core** — network I/O and AI processing run on separate CPU cores - **Dual-core** — network I/O and AI processing run on separate CPU cores
- **HTTP proxy** — CONNECT tunnel support for restricted networks - **HTTP proxy** — CONNECT tunnel support for restricted networks
- **Tool use** — ReAct agent loop with Anthropic tool use protocol
## For Developers ## For Developers


@@ -62,22 +62,54 @@ git clone https://github.com/memovai/mimiclaw.git
cd mimiclaw cd mimiclaw
idf.py set-target esp32s3 idf.py set-target esp32s3
```
### Configure
**Option A: Config file (recommended)**: fill in once, baked into firmware at build time:
```bash
cp main/mimi_secrets.h.example main/mimi_secrets.h
```
Edit `main/mimi_secrets.h`:
```c
#define MIMI_SECRET_WIFI_SSID "YourWiFiName"
#define MIMI_SECRET_WIFI_PASS "YourWiFiPassword"
#define MIMI_SECRET_TG_TOKEN "123456:ABC-DEF1234ghIkl-zyx57W2v1u123ew11"
#define MIMI_SECRET_API_KEY "sk-ant-api03-xxxxx"
#define MIMI_SECRET_SEARCH_KEY "" // optional: Brave Search API key
#define MIMI_SECRET_PROXY_HOST "10.0.0.1" // optional: proxy host
#define MIMI_SECRET_PROXY_PORT "7897" // optional: proxy port
```
Then build and flash:
```bash
idf.py build idf.py build
idf.py -p /dev/ttyACM0 flash monitor idf.py -p /dev/ttyACM0 flash monitor
``` ```
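### Set Up Config file values have the **highest priority**: they override anything set via CLI.
After flashing, a serial console appears. Type these commands: > **Note:** After editing `mimi_secrets.h`, run `touch main/mimi_config.h` before `idf.py build`; otherwise the change is not recompiled.
**Option B: Serial CLI**: configure at runtime after flashing: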
```bash
idf.py build
idf.py -p /dev/ttyACM0 flash monitor
```
``` ```
mimi> wifi_set YourWiFiName YourWiFiPassword mimi> wifi_set YourWiFiName YourWiFiPassword
mimi> set_tg_token 123456:ABC-DEF1234ghIkl-zyx57W2v1u123ew11 mimi> set_tg_token 123456:ABC-DEF1234ghIkl-zyx57W2v1u123ew11
mimi> set_api_key sk-ant-api03-xxxxx mimi> set_api_key sk-ant-api03-xxxxx
mimi> restart mimi> restart
``` ```
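That's it. After restart, find your bot on Telegram and start chatting. CLI values are stored in NVS (persistent flash) and take effect only when the config file does not set the corresponding value.
### Proxy Configuration (users in mainland China) ### Proxy Configuration (users in mainland China)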
@@ -85,15 +117,14 @@ mimi> restart
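**Prerequisite**: a proxy on your LAN that supports HTTP CONNECT (Clash Verge, V2Ray, etc.) with "Allow LAN connections" enabled. **Prerequisite**: a proxy on your LAN that supports HTTP CONNECT (Clash Verge, V2Ray, etc.) with "Allow LAN connections" enabled.
Configuring the proxy directly in `mimi_secrets.h` is recommended (see Option A above); you can also use the CLI: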
``` ```
mimi> set_proxy 10.0.0.1 7897 mimi> set_proxy 10.0.0.1 7897
mimi> restart mimi> restart
``` ```
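- `10.0.0.1`: LAN IP of the proxy machine To clear the proxy and restore a direct connection:
- `7897`: the proxy's HTTP port (not the SOCKS port)
Once set, all HTTPS requests go through the CONNECT tunnel, and TLS certificates are still verified normally. To clear the proxy and restore a direct connection: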
``` ```
mimi> clear_proxy mimi> clear_proxy
@@ -102,13 +133,17 @@ mimi> restart
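> **Tip**: Make sure the ESP32-S3 and the proxy machine are on the same LAN. In Clash Verge, enable this under "Settings → Allow LAN". > **Tip**: Make sure the ESP32-S3 and the proxy machine are on the same LAN. In Clash Verge, enable this under "Settings → Allow LAN".
### More Commands ### All Commands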
``` ```
mimi> wifi_set <ssid> <pass> # set WiFi credentials
mimi> wifi_status # am I connected? mimi> wifi_status # am I connected?
mimi> set_tg_token <token> # set Telegram bot token
mimi> set_api_key <key> # set Anthropic API key
mimi> set_model claude-opus-4-6 # use a different model mimi> set_model claude-opus-4-6 # use a different model
mimi> set_proxy 10.0.0.1 7897 # optional: route through HTTP proxy mimi> set_search_key <key> # set Brave Search API key (for web_search tool)
mimi> clear_proxy # optional: remove proxy, connect directly mimi> set_proxy 10.0.0.1 7897 # route through HTTP proxy
mimi> clear_proxy # remove proxy, connect directly
mimi> memory_read # see what the bot remembers mimi> memory_read # see what the bot remembers
mimi> heap_info # how much RAM is free? mimi> heap_info # how much RAM is free?
mimi> session_list # list all chat sessions mimi> session_list # list all chat sessions
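@@ -128,12 +163,23 @@ MimiClaw stores everything as plain text files you can read and edit:
| `2026-02-05.md` | Daily notes: what happened today | | `2026-02-05.md` | Daily notes: what happened today |
| `tg_12345.jsonl` | Chat history: your conversation with the bot | | `tg_12345.jsonl` | Chat history: your conversation with the bot |
## Tools
MimiClaw uses Anthropic's tool use protocol: Claude can call tools during a conversation and loop until the task is done (ReAct pattern).
| Tool | Description |
|------|-------------|
| `web_search` | Search the web via Brave Search API for current information |
To enable web search, set a [Brave Search API key](https://brave.com/search/api/) in your config file or via CLI (`set_search_key`).
## Also Included ## Also Included
- **WebSocket gateway** on port 18789: connect from your LAN with any WebSocket client - **WebSocket gateway** on port 18789: connect from your LAN with any WebSocket client
- **OTA updates**: flash new firmware over WiFi, no USB needed - **OTA updates**: flash new firmware over WiFi, no USB needed
- **Dual-core**: network I/O and AI processing run on separate CPU cores - **Dual-core**: network I/O and AI processing run on separate CPU cores
- **HTTP proxy**: CONNECT tunnel support for restricted networks - **HTTP proxy**: CONNECT tunnel support for restricted networks
- **Tool use**: ReAct agent loop with Anthropic tool use protocol
## For Developers ## For Developers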


@@ -20,17 +20,21 @@ Telegram App (User)
│ │ Poller │ └────────┬─────────┘ │ │ │ Poller │ └────────┬─────────┘ │
│ │ (Core 0) │ │ │ │ │ (Core 0) │ │ │
│ └─────────────┘ ▼ │ │ └─────────────┘ ▼ │
┌──────────────┐ │ ┌────────────────────────┐ │
│ ┌─────────────┐ │ Agent Loop │ │ │ ┌─────────────┐ │ Agent Loop │ │
│ │ WebSocket │──────▶│ (Core 1) │ │ WebSocket │─▶│ (Core 1)
│ │ Server │ │ │ │ │ │ Server │ │ │ │
│ │ (:18789) │ │ Context ──▶ LLM Proxy │ │ │ (:18789) │ │ Context ──▶ LLM Proxy │
│ └─────────────┘ │ Builder (HTTPS) │ │ └─────────────┘ │ Builder (HTTPS) │
└──────┬───────┘ │ ▲ │
│ ┌─────────────┐ │ ┌─────────────┐ tool_use?
│ │ Serial CLI │ ▼ │ │ Serial CLI │
│ │ (Core 0) │ ┌──────────────┐ │ │ (Core 0) │ Tool Results ◀─ Tools │
│ └─────────────┘ │ Outbound Queue│ │ └─────────────┘ (web_search)│
│ └──────────┬─────────────┘ │
│ │ │
│ ┌──────▼───────┐ │
│ │ Outbound Queue│ │
│ └──────┬───────┘ │ │ └──────┬───────┘ │
│ │ │ │ │ │
│ ┌──────▼───────┐ │ │ ┌──────▼───────┐ │
@@ -48,13 +52,14 @@ Telegram App (User)
│ │ /spiffs/memory/ MEMORY.md, YYYY-MM-DD │ │ │ │ /spiffs/memory/ MEMORY.md, YYYY-MM-DD │ │
│ │ /spiffs/sessions/ tg_<chat_id>.jsonl │ │ │ │ /spiffs/sessions/ tg_<chat_id>.jsonl │ │
│ └──────────────────────────────────────────┘ │ │ └──────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘ └──────────────────────────────────────────────────
│ Anthropic Messages API (HTTPS + SSE) │ Anthropic Messages API (HTTPS)
│ + Brave Search API (HTTPS)
┌───────────┐ ┌───────────┐ ┌──────────────┐
│ Claude API │ │ Claude API │ │ Brave Search │
└───────────┘ └───────────┘ └──────────────┘
``` ```
--- ---
@@ -67,12 +72,18 @@ Telegram App (User)
3. Message pushed to Inbound Queue (FreeRTOS xQueue) 3. Message pushed to Inbound Queue (FreeRTOS xQueue)
4. Agent Loop (Core 1) pops message: 4. Agent Loop (Core 1) pops message:
a. Load session history from SPIFFS (JSONL) a. Load session history from SPIFFS (JSONL)
b. Build system prompt (SOUL.md + USER.md + MEMORY.md + recent notes) b. Build system prompt (SOUL.md + USER.md + MEMORY.md + recent notes + tool guidance)
c. Build messages array (history + current message) c. Build cJSON messages array (history + current message)
d. Call Claude API via HTTPS (SSE streaming) d. ReAct loop (max 10 iterations):
e. Accumulate streamed response tokens i. Call Claude API via HTTPS (non-streaming, with tools array)
f. Save user + assistant messages to session file ii. Parse JSON response → text blocks + tool_use blocks
g. Push response to Outbound Queue iii. If stop_reason == "tool_use":
- Execute each tool (e.g. web_search → Brave Search API)
- Append assistant content + tool_result to messages
- Continue loop
iv. If stop_reason == "end_turn": break with final text
e. Save user message + final assistant text to session file
f. Push response to Outbound Queue
5. Outbound Dispatch (Core 0) pops response: 5. Outbound Dispatch (Core 0) pops response:
a. Route by channel field ("telegram" → sendMessage, "websocket" → WS frame) a. Route by channel field ("telegram" → sendMessage, "websocket" → WS frame)
6. User receives reply 6. User receives reply
@@ -85,7 +96,9 @@ Telegram App (User)
``` ```
main/ main/
├── mimi.c Entry point — app_main() orchestrates init + startup ├── mimi.c Entry point — app_main() orchestrates init + startup
├── mimi_config.h All compile-time constants in one place ├── mimi_config.h All compile-time constants + build-time secrets include
├── mimi_secrets.h Build-time credentials (gitignored, highest priority)
├── mimi_secrets.h.example Template for mimi_secrets.h
├── bus/ ├── bus/
│ ├── message_bus.h mimi_msg_t struct, queue API │ ├── message_bus.h mimi_msg_t struct, queue API
@@ -100,14 +113,20 @@ main/
│ └── telegram_bot.c Long polling loop, JSON parsing, message splitting │ └── telegram_bot.c Long polling loop, JSON parsing, message splitting
├── llm/ ├── llm/
│ ├── llm_proxy.h llm_chat() API │ ├── llm_proxy.h llm_chat() + llm_chat_tools() API, tool_use types
│ └── llm_proxy.c Anthropic Messages API, SSE stream parser │ └── llm_proxy.c Anthropic Messages API (non-streaming), tool_use parsing
├── agent/ ├── agent/
│ ├── agent_loop.h Agent task init/start │ ├── agent_loop.h Agent task init/start
│ ├── agent_loop.c Main processing loop: inbound → context → LLM → outbound │ ├── agent_loop.c ReAct loop: LLM call → tool execution → repeat
│ ├── context_builder.h System prompt + messages builder API │ ├── context_builder.h System prompt + messages builder API
│ └── context_builder.c Reads bootstrap files + memory, assembles prompt │ └── context_builder.c Reads bootstrap files + memory + tool guidance
├── tools/
│ ├── tool_registry.h Tool definition struct, register/dispatch API
│ ├── tool_registry.c Tool registration, JSON schema builder, dispatch by name
│ ├── tool_web_search.h Web search tool API
│ └── tool_web_search.c Brave Search API via HTTPS (direct + proxy)
├── memory/ ├── memory/
│ ├── memory_store.h Long-term + daily memory API │ ├── memory_store.h Long-term + daily memory API
@@ -125,7 +144,7 @@ main/
├── cli/ ├── cli/
│ ├── serial_cli.h CLI init API │ ├── serial_cli.h CLI init API
│ └── serial_cli.c esp_console REPL with 14 commands │ └── serial_cli.c esp_console REPL with 15 commands
└── ota/ └── ota/
├── ota_manager.h OTA update API ├── ota_manager.h OTA update API
@@ -207,7 +226,7 @@ Session files are JSONL (one JSON object per line):
## NVS Configuration ## NVS Configuration
| Namespace | Key | Description | | Namespace | Key | Description |
|---------------|--------------|-----------------------------------------| |-----------------|--------------|-----------------------------------------|
| `wifi_config` | `ssid` | WiFi SSID | | `wifi_config` | `ssid` | WiFi SSID |
| `wifi_config` | `password` | WiFi password | | `wifi_config` | `password` | WiFi password |
| `tg_config` | `bot_token` | Telegram Bot API token | | `tg_config` | `bot_token` | Telegram Bot API token |
@@ -215,8 +234,11 @@ Session files are JSONL (one JSON object per line):
| `llm_config` | `model` | Model ID (default: claude-opus-4-6) | | `llm_config` | `model` | Model ID (default: claude-opus-4-6) |
| `proxy_config` | `host` | HTTP proxy hostname/IP | | `proxy_config` | `host` | HTTP proxy hostname/IP |
| `proxy_config` | `port` | HTTP proxy port | | `proxy_config` | `port` | HTTP proxy port |
| `search_config` | `api_key` | Brave Search API key |
All configured via Serial CLI commands: `wifi_set`, `set_tg_token`, `set_api_key`, `set_model`, `set_proxy`, `clear_proxy`. **Configuration priority**: `mimi_secrets.h` (build-time) > NVS (CLI-set) > defaults.
All configurable via Serial CLI or build-time config file (`mimi_secrets.h`).
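The priority rule (`mimi_secrets.h` > NVS > defaults) amounts to picking the first non-empty candidate. The sketch below assumes an empty string means "not set", as the `""` placeholders in the secrets template suggest; `config_resolve` is a hypothetical helper, not a function from the firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A value counts as set only if it is a non-empty string. */
static int is_set(const char *s) {
    return s != NULL && strlen(s) > 0;
}

/* Returns the highest-priority value that is set:
 *   1. build_time  (e.g. a MIMI_SECRET_* macro)
 *   2. nvs_value   (e.g. a string previously saved via the CLI)
 *   3. fallback    (compile-time default; may be "" but not NULL) */
const char *config_resolve(const char *build_time, const char *nvs_value,
                           const char *fallback) {
    assert(fallback != NULL);
    if (is_set(build_time)) return build_time;
    if (is_set(nvs_value))  return nvs_value;
    return fallback;
}
```

For example, `config_resolve("", "sk-from-nvs", "")` yields the NVS value because the build-time secret is empty. A non-empty build-time secret always wins regardless of what the CLI stored.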
--- ---
@@ -260,33 +282,50 @@ Client `chat_id` is auto-assigned on connection (`ws_<fd>`) but can be overridde
Endpoint: `POST https://api.anthropic.com/v1/messages` Endpoint: `POST https://api.anthropic.com/v1/messages`
Request format (Anthropic-native, not OpenAI): Request format (Anthropic-native, non-streaming, with tools):
```json ```json
{ {
"model": "claude-opus-4-6", "model": "claude-opus-4-6",
"max_tokens": 4096, "max_tokens": 4096,
"stream": true,
"system": "<system prompt>", "system": "<system prompt>",
"tools": [
{
"name": "web_search",
"description": "Search the web for current information.",
"input_schema": {"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]}
}
],
"messages": [ "messages": [
{"role": "user", "content": "Hello"}, {"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi!"}, {"role": "assistant", "content": "Hi!"},
{"role": "user", "content": "How are you?"} {"role": "user", "content": "What's the weather today?"}
] ]
} }
``` ```
Key difference from OpenAI: `system` is a top-level field, not inside the `messages` array. Key difference from OpenAI: `system` is a top-level field, not inside the `messages` array.
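To make the request shape concrete, here is a rough sketch that assembles a non-streaming body with `system` at the top level and no `stream` field. It uses plain `snprintf` for brevity, whereas the data flow above mentions cJSON. `build_request_body` is a hypothetical helper, and it assumes the model name and system prompt contain no characters that need JSON escaping and that the tools and messages arrays are already serialized.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the request body into out.
 * Returns the body length, or -1 if the model is empty or the buffer is
 * too small. Assumes model and system_prompt need no JSON escaping. */
int build_request_body(const char *model, const char *system_prompt,
                       const char *tools_json, const char *messages_json,
                       char *out, size_t out_size) {
    assert(model && system_prompt && tools_json && messages_json && out);
    if (strlen(model) == 0) return -1;

    /* "system" sits next to "messages" at the top level; there is no
     * "stream" key because the response is read as one JSON document. */
    int n = snprintf(out, out_size,
                     "{\"model\":\"%s\",\"max_tokens\":4096,"
                     "\"system\":\"%s\",\"tools\":%s,\"messages\":%s}",
                     model, system_prompt, tools_json, messages_json);
    if (n < 0 || (size_t)n >= out_size) return -1;
    return n;
}
```

In real firmware, a JSON library is the safer choice because user text routinely contains quotes and newlines.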
SSE streaming response events: Non-streaming JSON response:
``` ```json
event: content_block_delta {
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}} "id": "msg_xxx",
"type": "message",
event: message_stop "role": "assistant",
data: {"type":"message_stop"} "content": [
{"type": "text", "text": "Let me search for that."},
{"type": "tool_use", "id": "toolu_xxx", "name": "web_search", "input": {"query": "weather today"}}
],
"stop_reason": "tool_use"
}
``` ```
The SSE parser in `llm_proxy.c` accumulates `text_delta` tokens into a response buffer. When `stop_reason` is `"tool_use"`, the agent loop executes each tool and sends results back:
```json
{"role": "assistant", "content": [<text + tool_use blocks>]}
{"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_xxx", "content": "..."}]}
```
The loop repeats until `stop_reason` is `"end_turn"` (max 10 iterations).
--- ---
@@ -301,13 +340,14 @@ app_main()
├── memory_store_init() Verify SPIFFS paths ├── memory_store_init() Verify SPIFFS paths
├── session_mgr_init() ├── session_mgr_init()
├── wifi_manager_init() Init WiFi STA mode + event handlers ├── wifi_manager_init() Init WiFi STA mode + event handlers
├── http_proxy_init() Load proxy config from NVS ├── http_proxy_init() Load proxy config (secrets > NVS)
├── telegram_bot_init() Load bot token from NVS ├── telegram_bot_init() Load bot token (secrets > NVS)
├── llm_proxy_init() Load API key + model from NVS ├── llm_proxy_init() Load API key + model (secrets > NVS)
├── tool_registry_init() Register tools, build tools JSON
├── agent_loop_init() ├── agent_loop_init()
├── serial_cli_init() Start REPL (works without WiFi) ├── serial_cli_init() Start REPL (works without WiFi)
├── wifi_manager_start() Connect using NVS credentials ├── wifi_manager_start() Connect (secrets > NVS credentials)
│ └── wifi_manager_wait_connected(30s) │ └── wifi_manager_wait_connected(30s)
└── [if WiFi connected] └── [if WiFi connected]
@@ -330,6 +370,7 @@ If WiFi credentials are missing or connection times out, the CLI remains availab
| `set_tg_token <TOKEN>` | Save Telegram bot token | | `set_tg_token <TOKEN>` | Save Telegram bot token |
| `set_api_key <KEY>` | Save Anthropic API key | | `set_api_key <KEY>` | Save Anthropic API key |
| `set_model <MODEL_ID>` | Set LLM model identifier | | `set_model <MODEL_ID>` | Set LLM model identifier |
| `set_search_key <KEY>` | Save Brave Search API key |
| `set_proxy <HOST> <PORT>` | Set HTTP CONNECT proxy | | `set_proxy <HOST> <PORT>` | Set HTTP CONNECT proxy |
| `clear_proxy` | Remove proxy, use direct connection | | `clear_proxy` | Remove proxy, use direct connection |
| `memory_read` | Print MEMORY.md contents | | `memory_read` | Print MEMORY.md contents |
@@ -340,22 +381,24 @@ If WiFi credentials are missing or connection times out, the CLI remains availab
| `restart` | Reboot the device | | `restart` | Reboot the device |
| `help` | List all available commands | | `help` | List all available commands |
> **Note**: CLI-set values are stored in NVS but are overridden by `mimi_secrets.h` build-time values if set.
--- ---
## Nanobot Reference Mapping ## Nanobot Reference Mapping
| Nanobot Module | MimiClaw Equivalent | Notes | | Nanobot Module | MimiClaw Equivalent | Notes |
|-----------------------------|--------------------------------|------------------------------| |-----------------------------|--------------------------------|------------------------------|
| `agent/loop.py` | `agent/agent_loop.c` | Simplified: no tool use loop | | `agent/loop.py` | `agent/agent_loop.c` | ReAct loop with tool use |
| `agent/context.py` | `agent/context_builder.c` | Loads SOUL.md + USER.md + memory | | `agent/context.py` | `agent/context_builder.c` | Loads SOUL.md + USER.md + memory + tool guidance |
| `agent/memory.py` | `memory/memory_store.c` | MEMORY.md + daily notes | | `agent/memory.py` | `memory/memory_store.c` | MEMORY.md + daily notes |
| `session/manager.py` | `memory/session_mgr.c` | JSONL per chat, ring buffer | | `session/manager.py` | `memory/session_mgr.c` | JSONL per chat, ring buffer |
| `channels/telegram.py` | `telegram/telegram_bot.c` | Raw HTTP, no python-telegram-bot | | `channels/telegram.py` | `telegram/telegram_bot.c` | Raw HTTP, no python-telegram-bot |
| `bus/events.py` + `queue.py`| `bus/message_bus.c` | FreeRTOS queues vs asyncio | | `bus/events.py` + `queue.py`| `bus/message_bus.c` | FreeRTOS queues vs asyncio |
| `providers/litellm_provider.py` | `llm/llm_proxy.c` | Direct Anthropic API only | | `providers/litellm_provider.py` | `llm/llm_proxy.c` | Direct Anthropic API only |
| `config/schema.py` | `mimi_config.h` + NVS | Compile-time + NVS storage | | `config/schema.py` | `mimi_config.h` + `mimi_secrets.h` + NVS | Build-time secrets > NVS |
| `cli/commands.py` | `cli/serial_cli.c` | esp_console REPL | | `cli/commands.py` | `cli/serial_cli.c` | esp_console REPL |
| `agent/tools/*` | *(not yet implemented)* | See TODO.md | | `agent/tools/*` | `tools/tool_registry.c` + `tool_web_search.c` | web_search via Brave API |
| `agent/subagent.py` | *(not yet implemented)* | See TODO.md | | `agent/subagent.py` | *(not yet implemented)* | See TODO.md |
| `agent/skills.py` | *(not yet implemented)* | See TODO.md | | `agent/skills.py` | *(not yet implemented)* | See TODO.md |
| `cron/service.py` | *(not yet implemented)* | See TODO.md | | `cron/service.py` | *(not yet implemented)* | See TODO.md |


@@ -7,11 +7,8 @@
## P0 — Core Agent Capabilities ## P0 — Core Agent Capabilities
### [ ] Tool Use Loop (multi-turn agent iteration) ### [x] ~~Tool Use Loop (multi-turn agent iteration)~~
- **nanobot**: `loop.py` L167-210 — while loop calls LLM, checks `response.has_tool_calls`, executes tools, feeds results back into messages, repeats until LLM stops calling tools (max 20 iterations) - Implemented: `agent_loop.c` ReAct loop with `llm_chat_tools()`, max 10 iterations, non-streaming JSON parsing
- **MimiClaw**: `agent_loop.c` only makes a single LLM call (one-shot), cannot use any tools
- **Scope**: Need to parse Anthropic API `tool_use` content blocks, implement tool execution loop
- **Note**: Anthropic tool_use format differs from OpenAI — uses content blocks, not function_call
### [ ] Memory Write via Tool Use (agent-driven memory persistence) ### [ ] Memory Write via Tool Use (agent-driven memory persistence)
- **openclaw**: Agent uses standard `write`/`edit` tools to write `MEMORY.md` and `memory/YYYY-MM-DD.md`; system prompt instructs agent to persist important information; pre-compaction memory flush triggers a silent agent turn to save durable memories before context window limit - **openclaw**: Agent uses standard `write`/`edit` tools to write `MEMORY.md` and `memory/YYYY-MM-DD.md`; system prompt instructs agent to persist important information; pre-compaction memory flush triggers a silent agent turn to save durable memories before context window limit
@@ -19,20 +16,13 @@
- **Scope**: Expose `memory_write` and `memory_append_today` as tool_use tools for Claude; add system prompt guidance on when to persist memory; optionally add pre-compaction flush (trigger memory save when session history nears `MIMI_SESSION_MAX_MSGS`) - **Scope**: Expose `memory_write` and `memory_append_today` as tool_use tools for Claude; add system prompt guidance on when to persist memory; optionally add pre-compaction flush (trigger memory save when session history nears `MIMI_SESSION_MAX_MSGS`)
- **Depends on**: Tool Use Loop - **Depends on**: Tool Use Loop
### [ ] Tool Registry + Built-in Tools ### [x] ~~Tool Registry + web_search Tool~~
- **nanobot**: `tools/registry.py` dynamic tool registration/execution, `tools/base.py` defines abstract Tool base class - Implemented: `tools/tool_registry.c` — tool registration, JSON schema builder, dispatch by name
- **nanobot built-in tools**: - Implemented: `tools/tool_web_search.c` — Brave Search API via HTTPS (direct + proxy support)
- `read_file` — read files (`tools/filesystem.py`)
- `write_file` — write files ### [ ] More Built-in Tools
- `edit_file`edit files - **nanobot built-in tools** not yet ported: `read_file`, `write_file`, `edit_file`, `list_dir`, `message`
- `list_dir`list directory - **Recommendation**: Reasonable tool subset for ESP32: `read_file`, `write_file`, `list_dir` (SPIFFS), `message`, `memory_write`
- `exec` — execute shell commands (`tools/shell.py`)
- `web_search` — web search (`tools/web.py`)
- `web_fetch` — fetch web pages
- `message` — send message to user (`tools/message.py`)
- `spawn` — launch subagent (`tools/spawn.py`)
- **MimiClaw**: No tool system at all
- **Recommendation**: Reasonable tool subset for ESP32: `read_file`, `write_file`, `list_dir` (SPIFFS), `message`. Shell/web not suitable for MCU
### [ ] Subagent / Spawn Background Tasks ### [ ] Subagent / Spawn Background Tasks
- **nanobot**: `subagent.py` — SubagentManager spawns independent agent instances with isolated tool sets and system prompts, announces results back to main agent via system channel - **nanobot**: `subagent.py` — SubagentManager spawns independent agent instances with isolated tool sets and system prompts, announces results back to main agent via system channel
@@ -77,10 +67,8 @@
- **MimiClaw**: `context_builder.c` only reads last 3 days - **MimiClaw**: `context_builder.c` only reads last 3 days
- **Recommendation**: Make configurable, but mind token budget - **Recommendation**: Make configurable, but mind token budget
### [ ] System Prompt Tool Guidance ### [x] ~~System Prompt Tool Guidance~~
- **nanobot**: `context.py` L74-101 — includes current time, workspace path, tool usage instructions - Implemented: `context_builder.c` includes tool usage guidance in system prompt
- **MimiClaw**: Has current time, but lacks tool usage guide and workspace description
- **Depends on**: Tool Use implementation
### [ ] Message Metadata (media, reply_to, metadata) ### [ ] Message Metadata (media, reply_to, metadata)
- **nanobot**: `bus/events.py` — InboundMessage has media, metadata fields; OutboundMessage has reply_to - **nanobot**: `bus/events.py` — InboundMessage has media, metadata fields; OutboundMessage has reply_to
@@ -116,10 +104,9 @@
- **MimiClaw**: Not implemented - **MimiClaw**: Not implemented
- **Recommendation**: Requires extra HTTPS request to Whisper API: download Telegram voice -> forward -> get text - **Recommendation**: Requires extra HTTPS request to Whisper API: download Telegram voice -> forward -> get text
### [ ] YAML Config File System ### [x] ~~Build-time Config File~~
- **nanobot**: `config/loader.py` + `config/schema.py` — Pydantic config validation, YAML config support - Implemented: `mimi_secrets.h` — build-time credentials with highest priority over NVS/CLI
- **MimiClaw**: All configuration via NVS key-value storage - Replaces need for YAML config; suitable for MCU workflow
- **Recommendation**: Current NVS approach is suitable for MCU, no change needed
### [ ] WebSocket Gateway Protocol Enhancement ### [ ] WebSocket Gateway Protocol Enhancement
- **nanobot**: Gateway port 18790 + richer protocol - **nanobot**: Gateway port 18790 + richer protocol
@@ -150,32 +137,34 @@
- [x] Telegram Bot long polling (getUpdates) - [x] Telegram Bot long polling (getUpdates)
- [x] Message Bus (inbound/outbound queues) - [x] Message Bus (inbound/outbound queues)
- [x] Agent Loop basic flow (single LLM call) - [x] Agent Loop with ReAct tool use (multi-turn, max 10 iterations)
- [x] Claude API (Anthropic Messages API + SSE streaming) - [x] Claude API (Anthropic Messages API, non-streaming, tool_use protocol)
- [x] Context Builder (system prompt + bootstrap files + memory) - [x] Tool Registry + web_search tool (Brave Search API)
- [x] Context Builder (system prompt + bootstrap files + memory + tool guidance)
- [x] Memory Store (MEMORY.md + daily notes) - [x] Memory Store (MEMORY.md + daily notes)
- [x] Session Manager (JSONL per chat_id, ring buffer history) - [x] Session Manager (JSONL per chat_id, ring buffer history)
- [x] WebSocket Gateway (port 18789, JSON protocol) - [x] WebSocket Gateway (port 18789, JSON protocol)
- [x] Serial CLI (esp_console, 14 commands) - [x] Serial CLI (esp_console, 15 commands)
- [x] HTTP CONNECT Proxy (Telegram + Claude API via proxy tunnel) - [x] HTTP CONNECT Proxy (Telegram + Claude API + Brave Search via proxy tunnel)
- [x] OTA Update - [x] OTA Update
- [x] WiFi Manager (NVS credentials, exponential backoff) - [x] WiFi Manager (NVS credentials, exponential backoff)
- [x] SPIFFS storage - [x] SPIFFS storage
- [x] NVS configuration (token, API key, model) - [x] Build-time config (`mimi_secrets.h`, highest priority over NVS)
- [x] NVS configuration (token, API key, model, search key)
--- ---
## Suggested Implementation Order ## Suggested Implementation Order
``` ```
1. Tool Use Loop + Tool Registry <- this determines whether the agent is truly "intelligent" 1. [done] Tool Use Loop + Tool Registry + web_search
2. Memory Write via Tool Use <- makes the agent actually remember 2. Memory Write via Tool Use <- makes the agent actually remember
3. Built-in Tools (read_file, write_file, message) 3. Built-in Tools (read_file, write_file, message)
3. Telegram Allowlist (allow_from) <- security essential 4. Telegram Allowlist (allow_from) <- security essential
4. Bootstrap File Completion (AGENTS.md, TOOLS.md) 5. Bootstrap File Completion (AGENTS.md, TOOLS.md)
5. Subagent (simplified) 6. Subagent (simplified)
6. Telegram Markdown -> HTML 7. Telegram Markdown -> HTML
7. Media Handling 8. Media Handling
8. Cron / Heartbeat 9. Cron / Heartbeat
9. Other enhancements 10. Other enhancements
``` ```