crispyberry 7a6609b722 docs: add memory write via tool use to TODO, mark proxy as done

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-06 23:12:57 +08:00


MimiClaw vs Nanobot — Feature Gap Tracker

Compares MimiClaw against the nanobot/ reference implementation and tracks features MimiClaw has not yet aligned with. Priority: P0 = core capability missing, P1 = important enhancement, P2 = nice to have

P0 — Core Agent Capabilities

[ ] Tool Use Loop (multi-turn agent iteration)

nanobot: loop.py L167-210 — while loop calls LLM, checks response.has_tool_calls, executes tools, feeds results back into messages, repeats until LLM stops calling tools (max 20 iterations)
MimiClaw: agent_loop.c only makes a single LLM call (one-shot), cannot use any tools
Scope: Need to parse Anthropic API tool_use content blocks, implement tool execution loop
Note: Anthropic tool_use format differs from OpenAI — uses content blocks, not function_call
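
The loop described above can be sketched in plain C. This is a minimal sketch, not MimiClaw code: `llm_reply_t`, `agent_run_turn`, and the two callback types are hypothetical names, the reply is assumed to be already parsed out of the JSON content blocks, and only one tool call per reply is modeled. With the real Anthropic API the tool output would go back as a `tool_result` content block in the next user message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_TOOL_ITERATIONS 20   /* same cap the tracker cites for nanobot */

/* Hypothetical, already-parsed view of one LLM response. */
typedef struct {
    bool has_tool_call;      /* true if the reply contains a tool_use block */
    char tool_name[32];
    char tool_input[128];
    char text[128];          /* final assistant text when no tool is called */
} llm_reply_t;

/* Assumed callbacks: llm_fn wraps the HTTPS request, tool_fn runs one tool. */
typedef bool (*llm_fn)(const char *last_tool_result, llm_reply_t *out, void *ctx);
typedef bool (*tool_fn)(const char *name, const char *input,
                        char *result, size_t result_len, void *ctx);

/* Returns the number of LLM calls made, or -1 on error or iteration overflow. */
int agent_run_turn(llm_fn call_llm, tool_fn run_tool, void *ctx,
                   char *final_text, size_t final_len)
{
    char tool_result[128] = "";
    const char *feedback = NULL;   /* NULL on the first call: no tool has run yet */

    for (int i = 1; i <= MAX_TOOL_ITERATIONS; i++) {
        llm_reply_t reply;
        memset(&reply, 0, sizeof(reply));
        if (!call_llm(feedback, &reply, ctx)) return -1;

        if (!reply.has_tool_call) {            /* model is done: return its text */
            snprintf(final_text, final_len, "%s", reply.text);
            return i;
        }
        if (!run_tool(reply.tool_name, reply.tool_input,
                      tool_result, sizeof(tool_result), ctx)) {
            snprintf(tool_result, sizeof(tool_result), "error: tool failed");
        }
        feedback = tool_result;                /* feed the result into the next call */
    }
    return -1;
}

/* Demo stubs: a fake LLM that asks for one tool call, then answers. */
bool demo_llm(const char *last_tool_result, llm_reply_t *out, void *ctx)
{
    (void)ctx;
    if (last_tool_result == NULL) {
        out->has_tool_call = true;
        snprintf(out->tool_name, sizeof(out->tool_name), "read_file");
        snprintf(out->tool_input, sizeof(out->tool_input), "/spiffs/SOUL.md");
    } else {
        snprintf(out->text, sizeof(out->text), "done: %s", last_tool_result);
    }
    return true;
}

bool demo_tool(const char *name, const char *input,
               char *result, size_t result_len, void *ctx)
{
    (void)ctx;
    snprintf(result, result_len, "%s(%s) ok", name, input);
    return true;
}
```

Calling `agent_run_turn(demo_llm, demo_tool, NULL, buf, sizeof(buf))` makes two LLM calls: the first requests `read_file`, the second returns the final text. A tool failure is reported back to the model as text rather than aborting the turn, which is one possible design choice.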

[ ] Memory Write via Tool Use (agent-driven memory persistence)

openclaw: Agent uses standard write/edit tools to write MEMORY.md and memory/YYYY-MM-DD.md; system prompt instructs agent to persist important information; pre-compaction memory flush triggers a silent agent turn to save durable memories before context window limit
MimiClaw: memory_write_long_term and memory_append_today exist but are only called from CLI; agent loop never writes memory
Scope: Expose memory_write_long_term and memory_append_today as tool_use tools for Claude; add system prompt guidance on when to persist memory; optionally add pre-compaction flush (trigger memory save when session history nears MIMI_SESSION_MAX_MSGS)
Depends on: Tool Use Loop
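
A possible shape for the two pieces in the scope above: the tool definitions sent in the `tools` array of the Messages API request, and a check for the optional pre-compaction flush. The `name` / `description` / `input_schema` layout is the Anthropic tool definition format; everything else is an assumption made for illustration: the `content` parameter, the `FLUSH_MARGIN` value, the placeholder value of `MIMI_SESSION_MAX_MSGS`, and the helper name `session_should_flush_memory`.

```c
#include <stdbool.h>

#ifndef MIMI_SESSION_MAX_MSGS
#define MIMI_SESSION_MAX_MSGS 20   /* placeholder; the real value is defined in MimiClaw */
#endif
#define FLUSH_MARGIN 4             /* hypothetical: flush this many messages before the limit */

/* Tool definitions, named after the existing C functions. */
static const char MEMORY_TOOLS_JSON[] =
    "["
    "{\"name\":\"memory_write_long_term\","
     "\"description\":\"Replace the long-term memory file (MEMORY.md).\","
     "\"input_schema\":{\"type\":\"object\","
       "\"properties\":{\"content\":{\"type\":\"string\"}},"
       "\"required\":[\"content\"]}},"
    "{\"name\":\"memory_append_today\","
     "\"description\":\"Append a note to today's daily memory file.\","
     "\"input_schema\":{\"type\":\"object\","
       "\"properties\":{\"content\":{\"type\":\"string\"}},"
       "\"required\":[\"content\"]}}"
    "]";

/* Accessor so callers do not depend on the array name. */
const char *memory_tools_json(void)
{
    return MEMORY_TOOLS_JSON;
}

/* True when the session is close enough to the limit that a silent
 * "save durable memories now" turn should run before history is dropped. */
bool session_should_flush_memory(int msg_count)
{
    return msg_count >= MIMI_SESSION_MAX_MSGS - FLUSH_MARGIN;
}
```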

[ ] Tool Registry + Built-in Tools

nanobot: tools/registry.py — dynamic tool registration/execution, tools/base.py defines abstract Tool base class
nanobot built-in tools:
- read_file — read files (tools/filesystem.py)
- write_file — write files
- edit_file — edit files
- list_dir — list directory
- exec — execute shell commands (tools/shell.py)
- web_search — web search (tools/web.py)
- web_fetch — fetch web pages
- message — send message to user (tools/message.py)
- spawn — launch subagent (tools/spawn.py)
MimiClaw: No tool system at all
Recommendation: Reasonable tool subset for ESP32: read_file, write_file, list_dir (SPIFFS), message. Shell/web not suitable for MCU
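
On an MCU the dynamic registry can likely be replaced by a static table of function pointers. The following is a sketch under that assumption; `tool_entry_t`, `tool_find`, `tool_execute`, and the placeholder handlers are hypothetical names, and the handlers only echo their input instead of touching SPIFFS or the message bus.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* A handler writes its result as text into out and returns 0 on success. */
typedef int (*tool_handler_t)(const char *input, char *out, size_t out_len);

typedef struct {
    const char *name;         /* tool name as exposed to the LLM */
    const char *description;  /* short text for the tool definition */
    tool_handler_t handler;
} tool_entry_t;

/* Placeholder handlers: real ones would call SPIFFS / the outbound queue. */
static int tool_list_dir(const char *input, char *out, size_t out_len)
{
    snprintf(out, out_len, "(listing of %s)", input);
    return 0;
}

static int tool_message(const char *input, char *out, size_t out_len)
{
    snprintf(out, out_len, "sent: %s", input);
    return 0;
}

/* Static registry: no heap allocation, contents fixed at compile time. */
static const tool_entry_t TOOL_TABLE[] = {
    { "list_dir", "List files under a SPIFFS path", tool_list_dir },
    { "message",  "Send a message to the user",     tool_message  },
};
#define TOOL_COUNT (sizeof(TOOL_TABLE) / sizeof(TOOL_TABLE[0]))

const tool_entry_t *tool_find(const char *name)
{
    for (size_t i = 0; i < TOOL_COUNT; i++) {
        if (strcmp(TOOL_TABLE[i].name, name) == 0) return &TOOL_TABLE[i];
    }
    return NULL;
}

/* Runs a tool by name; unknown tools produce an error string for the LLM. */
int tool_execute(const char *name, const char *input, char *out, size_t out_len)
{
    const tool_entry_t *t = tool_find(name);
    if (t == NULL) {
        snprintf(out, out_len, "error: unknown tool '%s'", name);
        return -1;
    }
    return t->handler(input, out, out_len);
}
```

Because shell and web tools are simply absent from the table, a request for `exec` falls through to the unknown-tool path instead of needing a separate deny list.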

[ ] Subagent / Spawn Background Tasks

nanobot: subagent.py — SubagentManager spawns independent agent instances with isolated tool sets and system prompts, announces results back to main agent via system channel
MimiClaw: Not implemented
Recommendation: ESP32 memory is limited; simplify to a single background FreeRTOS task for long-running work, inject result into inbound queue on completion

P1 — Important Features

[ ] Telegram User Allowlist (allow_from)

nanobot: channels/base.py L59-82 — is_allowed() checks sender_id against allow_list
MimiClaw: No authentication; anyone can message the bot and consume API credits
Recommendation: Store allow_from list in NVS, filter in process_updates()
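
One way the filter could look, assuming the allowlist is stored in NVS as a comma-separated string of numeric Telegram user IDs. The function name `tg_is_allowed` and the storage format are assumptions; this sketch denies everyone when the list is empty, which is a conservative choice and not necessarily how nanobot behaves.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns true if sender_id appears in allow_from, e.g. "111, 222". */
bool tg_is_allowed(const char *allow_from, int64_t sender_id)
{
    if (allow_from == NULL) return false;

    const char *p = allow_from;
    while (*p != '\0') {
        while (*p == ',' || *p == ' ') p++;      /* skip separators */
        if (*p == '\0') break;

        char *end = NULL;
        errno = 0;
        long long id = strtoll(p, &end, 10);

        /* An entry counts only if it is a complete number ("12x" is rejected). */
        bool valid = (end != p) && errno == 0 &&
                     (*end == ',' || *end == ' ' || *end == '\0');
        if (valid && (int64_t)id == sender_id) return true;

        p = (end != p) ? end : p + 1;
        while (*p != '\0' && *p != ',') p++;     /* move on to the next entry */
    }
    return false;
}
```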

[ ] Telegram Markdown to HTML Conversion

nanobot: channels/telegram.py L16-76 — _markdown_to_telegram_html() full converter: code blocks, inline code, bold, italic, links, strikethrough, lists
MimiClaw: Uses parse_mode: Markdown directly; special characters can cause send failures (has fallback to plain text)
Recommendation: Implement simplified Markdown-to-HTML converter, or switch to parse_mode: HTML
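
Whichever converter is chosen, text sent with parse_mode: HTML must have `<`, `>`, and `&` escaped outside of tags. A minimal sketch of that first step follows; the function name `tg_html_escape` is hypothetical, and converting bold, code, and links into tags is left out.

```c
#include <stddef.h>
#include <string.h>

/* Escapes src into dst for Telegram HTML parse mode.
 * Returns the escaped length, or -1 if dst is too small. */
int tg_html_escape(const char *src, char *dst, size_t dst_len)
{
    size_t n = 0;

    for (; *src != '\0'; src++) {
        const char *rep = NULL;
        switch (*src) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  break;
        }
        size_t len = (rep != NULL) ? strlen(rep) : 1;
        if (n + len + 1 > dst_len) return -1;    /* keep room for the terminator */

        if (rep != NULL) {
            memcpy(dst + n, rep, len);
        } else {
            dst[n] = *src;
        }
        n += len;
    }
    if (dst_len == 0) return -1;
    dst[n] = '\0';
    return (int)n;
}
```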

[ ] Telegram /start Command

nanobot: telegram.py L183-192 — handles /start command, replies with welcome message
MimiClaw: Not handled; /start is sent to Claude as a regular message
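
Detecting the command before the text reaches the agent loop could be as small as the check below. The helper name `tg_is_command` is hypothetical. It accepts `/start`, `/start <payload>`, and the `/start@botname` form Telegram uses in group chats, while rejecting lookalikes such as `/started`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if text begins with the bot command cmd (for example "/start"). */
bool tg_is_command(const char *text, const char *cmd)
{
    if (text == NULL || cmd == NULL) return false;

    size_t n = strlen(cmd);
    if (strncmp(text, cmd, n) != 0) return false;

    /* The command must end here, or be followed by a payload or @botname. */
    char next = text[n];
    return next == '\0' || next == ' ' || next == '@';
}
```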

[ ] Telegram Media Handling (photos/voice/files)

nanobot: telegram.py L194-289 — handles photo, voice, audio, document; downloads files; transcribes voice
MimiClaw: Only processes message.text, ignores all media messages
Recommendation: Images can be base64-encoded for Claude Vision; voice requires Whisper API (extra HTTPS request)

[ ] Skills System (pluggable capabilities)

nanobot: agent/skills.py — loads skills from SKILL.md files, supports always-loaded and on-demand, frontmatter metadata, requirements checking
MimiClaw: Not implemented
Recommendation: Simplified version: store SKILL.md files on SPIFFS, load into system prompt via context_builder

[ ] Full Bootstrap File Alignment

nanobot: Loads AGENTS.md, SOUL.md, USER.md, TOOLS.md, IDENTITY.md (5 files)
MimiClaw: Only loads SOUL.md and USER.md
Recommendation: Add AGENTS.md (behavior guidelines) and TOOLS.md (tool documentation)

[ ] Longer Memory Lookback

nanobot: memory.py L56-80 — get_recent_memories(days=7) defaults to 7 days
MimiClaw: context_builder.c only reads last 3 days
Recommendation: Make configurable, but mind token budget
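
A configurable lookback mostly comes down to generating one daily-note path per day. The sketch below assumes notes are named `YYYY-MM-DD.md` under a directory given by `MIMI_MEMORY_DIR`, and that the system clock is set (UTC is used for simplicity). Both macros and the function name `memory_daily_path` are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#ifndef MIMI_MEMORY_DIR
#define MIMI_MEMORY_DIR "/spiffs/memory"     /* assumed location of daily notes */
#endif
#ifndef MIMI_MEMORY_LOOKBACK_DAYS
#define MIMI_MEMORY_LOOKBACK_DAYS 3          /* current behavior per the tracker */
#endif

/* Writes the path of the note from days_ago days before now into out.
 * Returns 0 on success, -1 if the date or the buffer does not work out. */
int memory_daily_path(time_t now, int days_ago, char *out, size_t out_len)
{
    time_t t = now - (time_t)days_ago * 86400;   /* 86400 seconds per day */
    struct tm *utc = gmtime(&t);
    if (utc == NULL) return -1;

    char date[11];                               /* "YYYY-MM-DD" plus terminator */
    if (strftime(date, sizeof(date), "%Y-%m-%d", utc) == 0) return -1;

    int n = snprintf(out, out_len, "%s/%s.md", MIMI_MEMORY_DIR, date);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}
```

The context builder would then loop `days_ago` from 0 to `MIMI_MEMORY_LOOKBACK_DAYS - 1` and stop early once its token budget is used up, so raising the setting cannot overflow the prompt.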

[ ] System Prompt Tool Guidance

nanobot: context.py L74-101 — includes current time, workspace path, tool usage instructions
MimiClaw: Has current time, but lacks tool usage guide and workspace description
Depends on: Tool Use implementation

[ ] Message Metadata (media, reply_to, metadata)

nanobot: bus/events.py — InboundMessage has media, metadata fields; OutboundMessage has reply_to
MimiClaw: mimi_msg_t only has channel + chat_id + content
Recommendation: Extend msg struct, add media_path and metadata fields
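
For illustration only, an extended message struct might look like the following. The field types and sizes of the existing `mimi_msg_t` are not stated in this tracker, so every type here is an assumption, and the struct is given a different name (`mimi_msg_ext_t`) to make that clear.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    /* Existing fields (types assumed). */
    char channel[16];
    char chat_id[32];
    char *content;

    /* Proposed additions. */
    char media_path[64];   /* SPIFFS path of a downloaded file, "" if none */
    char reply_to[32];     /* message ID to reply to, "" if none */
    char *metadata;        /* optional JSON string, NULL if none */
} mimi_msg_ext_t;

/* Zeroes the message so all optional fields start out empty. */
void mimi_msg_ext_init(mimi_msg_ext_t *m, const char *channel, const char *chat_id)
{
    memset(m, 0, sizeof(*m));
    snprintf(m->channel, sizeof(m->channel), "%s", channel);
    snprintf(m->chat_id, sizeof(m->chat_id), "%s", chat_id);
}

bool mimi_msg_ext_has_media(const mimi_msg_ext_t *m)
{
    return m->media_path[0] != '\0';
}
```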

[ ] Outbound Subscription Pattern

nanobot: bus/queue.py L41-49 — supports subscribe_outbound(channel, callback) subscription model
MimiClaw: Hardcoded if-else dispatch
Recommendation: Current approach is simple and reliable; not worth changing with few channels

P2 — Advanced Features

[ ] Cron Scheduled Task Service

nanobot: cron/service.py — full cron scheduler supporting at/every/cron expressions, persistent storage, timed agent triggers
MimiClaw: Not implemented
Recommendation: Use FreeRTOS timer for simplified version, support "every N minutes" only
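
The "every N minutes" case needs very little state. In the sketch below a periodic timer callback (assumed, not shown) would call `every_job_due` with the current uptime in seconds and push a trigger message into the inbound queue whenever it returns true. `every_job_t` and `every_job_due` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t interval_s;   /* period in seconds, e.g. 300 for every 5 minutes */
    uint32_t last_run_s;   /* uptime at which the job last fired */
} every_job_t;

/* Returns true, and records the run, when at least interval_s seconds
 * have passed since the last run. Unsigned subtraction keeps the check
 * correct across a wraparound of the uptime counter. */
bool every_job_due(every_job_t *job, uint32_t now_s)
{
    if (job->interval_s == 0) return false;      /* disabled job */
    if ((uint32_t)(now_s - job->last_run_s) < job->interval_s) return false;

    job->last_run_s = now_s;
    return true;
}
```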

[ ] Heartbeat Service

nanobot: heartbeat/service.py — reads HEARTBEAT.md every 30 minutes, triggers agent if tasks are found
MimiClaw: Not implemented
Recommendation: Simple FreeRTOS timer that periodically checks HEARTBEAT.md

[ ] Multi-LLM Provider Support

nanobot: providers/litellm_provider.py — supports OpenRouter, Anthropic, OpenAI, Gemini, DeepSeek, Groq, Zhipu, vLLM via LiteLLM
MimiClaw: Hardcoded to Anthropic Messages API
Recommendation: Abstract LLM interface, support OpenAI-compatible API (most providers are compatible)
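
A first step toward that abstraction could be a table capturing what differs per provider at the HTTP level. The endpoint paths and auth headers below are the documented ones for the Anthropic Messages API and for OpenAI-style chat completions; the struct, the provider keys, and the function names are hypothetical, and the request and response body formats (which also differ) are not covered.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *name;          /* key stored in configuration (assumed) */
    const char *path;          /* request path on the provider's host */
    const char *auth_header;   /* header that carries the API key */
    const char *auth_prefix;   /* text placed before the key in that header */
} llm_provider_t;

static const llm_provider_t LLM_PROVIDERS[] = {
    /* Anthropic additionally expects an anthropic-version header. */
    { "anthropic",     "/v1/messages",         "x-api-key",     ""        },
    { "openai_compat", "/v1/chat/completions", "Authorization", "Bearer " },
};
#define LLM_PROVIDER_COUNT (sizeof(LLM_PROVIDERS) / sizeof(LLM_PROVIDERS[0]))

const llm_provider_t *llm_provider_find(const char *name)
{
    for (size_t i = 0; i < LLM_PROVIDER_COUNT; i++) {
        if (strcmp(LLM_PROVIDERS[i].name, name) == 0) return &LLM_PROVIDERS[i];
    }
    return NULL;
}

/* Formats "Header: value" for the provider; returns 0, or -1 if out is too small. */
int llm_format_auth_header(const llm_provider_t *p, const char *api_key,
                           char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "%s: %s%s",
                     p->auth_header, p->auth_prefix, api_key);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}
```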

[ ] Voice Transcription

nanobot: providers/transcription.py — Groq Whisper API
MimiClaw: Not implemented
Recommendation: Requires extra HTTPS request to Whisper API: download Telegram voice -> forward -> get text

[ ] YAML Config File System

nanobot: config/loader.py + config/schema.py — Pydantic config validation, YAML config support
MimiClaw: All configuration via NVS key-value storage
Recommendation: Current NVS approach is suitable for MCU, no change needed

[ ] WebSocket Gateway Protocol Enhancement

nanobot: Gateway port 18790 + richer protocol
MimiClaw: Basic JSON protocol, lacks streaming token push
Recommendation: Add {"type":"token","content":"..."} streaming push
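
Building that frame needs JSON string escaping, since streamed tokens routinely contain quotes and newlines. A minimal sketch with a hypothetical function name `ws_format_token` follows; bytes of 0x80 and above are passed through unchanged on the assumption that the token is valid UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Appends s to out at offset *n, keeping out NUL-terminated. */
static bool append(char *out, size_t out_len, size_t *n, const char *s)
{
    size_t len = strlen(s);
    if (*n + len + 1 > out_len) return false;
    memcpy(out + *n, s, len);
    *n += len;
    out[*n] = '\0';
    return true;
}

/* Writes {"type":"token","content":"<escaped token>"} into out.
 * Returns the frame length, or -1 if out is too small. */
int ws_format_token(const char *token, char *out, size_t out_len)
{
    size_t n = 0;
    if (!append(out, out_len, &n, "{\"type\":\"token\",\"content\":\"")) return -1;

    for (const unsigned char *p = (const unsigned char *)token; *p != '\0'; p++) {
        char esc[7];                              /* longest case: \u00XX */
        if (*p == '"' || *p == '\\') {
            esc[0] = '\\'; esc[1] = (char)*p; esc[2] = '\0';
        } else if (*p == '\n') {
            esc[0] = '\\'; esc[1] = 'n'; esc[2] = '\0';
        } else if (*p < 0x20) {                   /* other control characters */
            snprintf(esc, sizeof(esc), "\\u%04x", (unsigned)*p);
        } else {
            esc[0] = (char)*p; esc[1] = '\0';
        }
        if (!append(out, out_len, &n, esc)) return -1;
    }

    if (!append(out, out_len, &n, "\"}")) return -1;
    return (int)n;
}
```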

[ ] Multi-Channel Manager

nanobot: channels/manager.py — unified lifecycle management for multiple channels
MimiClaw: Hardcoded in app_main()
Recommendation: Not worth abstracting with few channels

[ ] WhatsApp / Feishu Channels

nanobot: channels/whatsapp.py, channels/feishu.py
MimiClaw: Only Telegram + WebSocket
Recommendation: Low priority, Telegram is sufficient

[x] Telegram Proxy Support (HTTP CONNECT)

Implemented: HTTP CONNECT tunnel via proxy/http_proxy.c, configurable via NVS + CLI (set_proxy/clear_proxy)

[ ] Session Metadata Persistence

nanobot: session/manager.py L136-153 — session file includes metadata line (created_at, updated_at)
MimiClaw: JSONL only stores role/content/ts, no metadata header
Recommendation: Low priority

Completed Alignment

Telegram Bot long polling (getUpdates)
Message Bus (inbound/outbound queues)
Agent Loop basic flow (single LLM call)
Claude API (Anthropic Messages API + SSE streaming)
Context Builder (system prompt + bootstrap files + memory)
Memory Store (MEMORY.md + daily notes)
Session Manager (JSONL per chat_id, ring buffer history)
WebSocket Gateway (port 18789, JSON protocol)
Serial CLI (esp_console, 14 commands)
HTTP CONNECT Proxy (Telegram + Claude API via proxy tunnel)
OTA Update
WiFi Manager (NVS credentials, exponential backoff)
SPIFFS storage
NVS configuration (token, API key, model)

Suggested Implementation Order

1. Tool Use Loop + Tool Registry    <- this determines whether the agent is truly "intelligent"
2. Memory Write via Tool Use         <- makes the agent actually remember
3. Built-in Tools (read_file, write_file, message)
4. Telegram Allowlist (allow_from)   <- security essential
5. Bootstrap File Completion (AGENTS.md, TOOLS.md)
6. Subagent (simplified)
7. Telegram Markdown -> HTML
8. Media Handling
9. Cron / Heartbeat
10. Other enhancements
