Update README.MD and add nano-claude-code v3.0 + original-source-code/src

- README.MD: add original-source-code and nano-claude-code sections, update overview table (4 subprojects), add v3.0 news entry, expand comparison table with memory/multi-agent/skills dimensions
- nano-claude-code v3.0: multi-agent package (multi_agent/), memory package (memory/), skill package (skill/) with built-in /commit and /review skills, context compression (compaction.py), tool registry plugin system, diff view, 17 slash commands, 18 built-in tools, 101 tests (~5000 lines total)
- original-source-code/src: add raw TypeScript source tree (1884 files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 10:26:29 -07:00
parent 3de4c595ea
commit 1d4ffa964d
1942 changed files with 521644 additions and 112 deletions
--- a/nano-claude-code/.gitignore
+++ b/nano-claude-code/.gitignore
@@ -0,0 +1,2 @@
+__pycache__/
+*.pyc
--- a/nano-claude-code/README.md
+++ b/nano-claude-code/README.md
@@ -1,9 +1,10 @@


 <div align="center">
-  <a href="https://github.com/SafeRL-Lab/nano-claude-code">
-    <img src="https://github.com/SafeRL-Lab/nano-claude-code/blob/main/docs/demo.gif" alt="Logo" width="800"> 
+  <a href="https://github.com/SafeRL-Lab/nano-claude-code">
+    <img src="https://github.com/SafeRL-Lab/nano-claude-code/blob/main/docs/logo-v1.png" alt="Logo" width="280"> 
  </a>
+
  
 <h1 align="center" style="font-size: 30px;"><strong><em>Nano Claude Code</em></strong>: A Minimal Python Reimplementation</h1>
 <p align="center">
@@ -13,18 +14,27 @@
  </p>
 </div>

+ <div align=center>
+ <img src="https://github.com/SafeRL-Lab/nano-claude-code/blob/main/docs/demo.gif" width="850"/> 
+ </div>
+<div align=center>
+<center style="color:#000000;text-decoration:underline"> </center>
+ </div>
+
+---

 ## 🔥🔥🔥 News (Pacific Time)
- 01:47 PM, Apr 01, 2026: Support VLLM inference (**~2000** lines of Python Code)
- 11:30 AM, Apr 01, 2026: Support more **closed-source** models and **open-source models**: Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint. (**~1700** lines of Python Code)
- 09:50 AM, Apr 01, 2026: Support more **closed-source** models: Claude, GPT, Gemini. (**~1300** lines of Python Code)
- 08:23 AM, Apr 01, 2026: Release the initial version of Nano Claude Code (**~900 lines** of Python Code)
+- 12:20 PM, Apr 02, 2026: **v3.0** — Multi-agent package (`multi_agent/`), memory package (`memory/`), and skill package (`skill/`) with built-in skills, argument substitution, fork/inline execution, AI memory search, git worktree isolation, and agent type definitions (**~5000** lines of Python). See the [update notes](https://github.com/SafeRL-Lab/nano-claude-code/blob/main/Update_README.MD).
+- 10:00 AM, Apr 02, 2026: **v2.0** — Context compression, memory, sub-agents, skills, diff view, tool plugin system (**~3400** lines of Python Code).
+- 01:47 PM, Apr 01, 2026: Support VLLM inference (**~2000** lines of Python Code).
+- 11:30 AM, Apr 01, 2026: Support more **closed-source** models and **open-source models**: Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint (**~1700** lines of Python Code).
+- 09:50 AM, Apr 01, 2026: Support more **closed-source** models: Claude, GPT, Gemini. (**~1300** lines of Python Code).
+- 08:23 AM, Apr 01, 2026: Release the initial version of Nano Claude Code (**~900 lines** of Python Code).

+---

 # Nano Claude Code

-![demo](demo.gif)
-
 A minimal Python implementation of Claude Code in ~900 lines (Initial version), **supporting Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint.**

 ---
@@ -32,30 +42,20 @@ A minimal Python implementation of Claude Code in ~900 lines (Initial version),
 ## Content
  * [Features](#features)
  * [Supported Models](#supported-models)
-    + [Closed-Source (API)](#closed-source--api-)
-    + [Open-Source (Local via Ollama)](#open-source--local-via-ollama-)
  * [Installation](#installation)
  * [Usage: Closed-Source API Models](#usage--closed-source-api-models)
-    + [Anthropic Claude](#anthropic-claude)
-    + [OpenAI GPT](#openai-gpt)
-    + [Google Gemini](#google-gemini)
-    + [Kimi (Moonshot AI)](#kimi--moonshot-ai-)
-    + [Qwen (Alibaba DashScope)](#qwen--alibaba-dashscope-)
-    + [Zhipu GLM](#zhipu-glm)
-    + [DeepSeek](#deepseek)
  * [Usage: Open-Source Models (Local)](#usage--open-source-models--local-)
-    + [Option A — Ollama (Recommended)](#option-a---ollama--recommended-)
-    + [Option B — LM Studio](#option-b---lm-studio)
-    + [Option C — vLLM / Self-Hosted OpenAI-Compatible Server](#option-c---vllm---self-hosted-openai-compatible-server)
  * [Model Name Format](#model-name-format)
  * [CLI Reference](#cli-reference)
  * [Slash Commands (REPL)](#slash-commands--repl-)
  * [Configuring API Keys](#configuring-api-keys)
-    + [Method 1: Environment Variables (recommended)](#method-1--environment-variables--recommended-)
-    + [Method 2: Set Inside the REPL (persisted)](#method-2--set-inside-the-repl--persisted-)
-    + [Method 3: Edit the Config File Directly](#method-3--edit-the-config-file-directly)
  * [Permission System](#permission-system)
  * [Built-in Tools](#built-in-tools)
+  * [Memory](#memory)
+  * [Skills](#skills)
+  * [Sub-Agents](#sub-agents)
+  * [Context Compression](#context-compression)
+  * [Diff View](#diff-view)
  * [CLAUDE.md Support](#claudemd-support)
  * [Session Management](#session-management)
  * [Project Structure](#project-structure)
@@ -71,10 +71,16 @@ A minimal Python implementation of Claude Code in ~900 lines (Initial version),
 | Multi-provider | Anthropic · OpenAI · Gemini · Kimi · Qwen · Zhipu · DeepSeek · Ollama · LM Studio · Custom endpoint |
 | Interactive REPL | readline history, Tab-complete slash commands |
 | Agent loop | Streaming API + automatic tool-use loop |
-| 8 built-in tools | Read · Write · Edit · Bash · Glob · Grep · WebFetch · WebSearch |
+| 18 built-in tools | Read · Write · Edit · Bash · Glob · Grep · WebFetch · WebSearch · MemorySave · MemoryDelete · MemorySearch · MemoryList · Agent · SendMessage · CheckAgentResult · ListAgentTasks · ListAgentTypes · Skill · SkillList |
+| Diff view | Git-style red/green diff display for Edit and Write |
+| Context compression | Auto-compact long conversations to stay within model limits |
+| Persistent memory | Dual-scope memory (user + project) with 4 types, AI search, staleness warnings |
+| Multi-agent | Spawn typed sub-agents (coder/reviewer/researcher/…), git worktree isolation, background mode |
+| Skills | Built-in `/commit` · `/review` + custom markdown skills with argument substitution and fork/inline execution |
+| Plugin tools | Register custom tools via `tool_registry.py` |
 | Permission system | `auto` / `accept-all` / `manual` modes |
-| 14 slash commands | `/model` · `/config` · `/save` · `/cost` · … |
-| Context injection | Auto-loads `CLAUDE.md`, git status, cwd |
+| 17 slash commands | `/model` · `/config` · `/save` · `/cost` · `/memory` · `/skills` · `/agents` · … |
+| Context injection | Auto-loads `CLAUDE.md`, git status, cwd, persistent memory |
 | Session persistence | Save / load conversations to `~/.nano_claude/sessions/` |
 | Extended Thinking | Toggle on/off (Claude models only) |
 | Cost tracking | Token usage + estimated USD cost |
@@ -173,6 +179,7 @@ export OPENAI_API_KEY=sk-...

 python nano_claude.py --model gpt-4o
 python nano_claude.py --model gpt-4o-mini
+python nano_claude.py --model gpt-4.1-mini
 python nano_claude.py --model o3-mini
 ```

@@ -206,9 +213,9 @@ Get your API key at [dashscope.aliyun.com](https://dashscope.aliyun.com).
 ```bash
 export DASHSCOPE_API_KEY=sk-...

-python nano_claude.py --model qwen/qwen-max
-python nano_claude.py --model qwen/qwq-32b
-python nano_claude.py --model qwen/qwen2.5-coder-32b-instruct
+python nano_claude.py --model qwen/Qwen3.5-Plus
+python nano_claude.py --model qwen/Qwen3-MAX
+python nano_claude.py --model qwen/Qwen3.5-Flash
 ```

 ### Zhipu GLM
@@ -478,6 +485,10 @@ Type `/` and press **Tab** to autocomplete.
 | `/permissions <mode>` | Set permission mode: `auto` / `accept-all` / `manual` |
 | `/cwd` | Show current working directory |
 | `/cwd <path>` | Change working directory |
+| `/memory` | List all persistent memories |
+| `/memory <query>` | Search memories by keyword |
+| `/skills` | List available skills |
+| `/agents` | Show sub-agent task status |
 | `/exit` / `/quit` | Exit |

 **Switching models inside a session:**
@@ -573,17 +584,247 @@ Keys are saved to `~/.nano_claude/config.json` and loaded automatically on next

 ## Built-in Tools

+### Core Tools
+
 | Tool | Description | Key Parameters |
 |---|---|---|
 | `Read` | Read file with line numbers | `file_path`, `limit`, `offset` |
-| `Write` | Create or overwrite file | `file_path`, `content` |
-| `Edit` | Exact string replacement in file | `file_path`, `old_string`, `new_string`, `replace_all` |
+| `Write` | Create or overwrite file (shows diff) | `file_path`, `content` |
+| `Edit` | Exact string replacement (shows diff) | `file_path`, `old_string`, `new_string`, `replace_all` |
 | `Bash` | Execute shell command | `command`, `timeout` (default 30s) |
 | `Glob` | Find files by glob pattern | `pattern` (e.g. `**/*.py`), `path` |
 | `Grep` | Regex search in files (uses ripgrep if available) | `pattern`, `path`, `glob`, `output_mode` |
 | `WebFetch` | Fetch and extract text from URL | `url`, `prompt` |
 | `WebSearch` | Search the web via DuckDuckGo | `query` |

+### Memory Tools
+
+| Tool | Description | Key Parameters |
+|---|---|---|
+| `MemorySave` | Save or update a persistent memory | `name`, `type`, `description`, `content`, `scope` |
+| `MemoryDelete` | Delete a memory by name | `name`, `scope` |
+| `MemorySearch` | Search memories by keyword (or AI ranking) | `query`, `scope`, `use_ai`, `max_results` |
+| `MemoryList` | List all memories with age and metadata | `scope` |
+
+### Sub-Agent Tools
+
+| Tool | Description | Key Parameters |
+|---|---|---|
+| `Agent` | Spawn a sub-agent for a task | `prompt`, `subagent_type`, `isolation`, `name`, `model`, `wait` |
+| `SendMessage` | Send a message to a named background agent | `name`, `message` |
+| `CheckAgentResult` | Check status/result of a background agent | `task_id` |
+| `ListAgentTasks` | List all active and finished agent tasks | — |
+| `ListAgentTypes` | List available agent type definitions | — |
+
+### Skill Tools
+
+| Tool | Description | Key Parameters |
+|---|---|---|
+| `Skill` | Invoke a skill by name from within the conversation | `name`, `args` |
+| `SkillList` | List all available skills with triggers and metadata | — |
+
+> **Adding custom tools:** See [Architecture Guide](docs/architecture.md#tool-registry) for how to register your own tools.
+
+---
+
+## Memory
+
+The model can remember things across conversations using the built-in memory system.
+
+**How it works:** Memories are stored as markdown files. There are two scopes:
+- **User scope** (`~/.nano_claude/memory/`) — follows you across all projects
+- **Project scope** (`.nano_claude/memory/` in cwd) — specific to the current repo
+
+A `MEMORY.md` index (≤ 200 lines / 25 KB) is auto-rebuilt on every save or delete and injected into the system prompt, so the model always has an overview of the stored memories.
+
+**Memory types:**
+
+| Type | Use for |
+|---|---|
+| `user` | Your role, preferences, background |
+| `feedback` | How you want the model to behave |
+| `project` | Ongoing work, deadlines, decisions |
+| `reference` | Links to external resources |
+
+**Memory file format** (`~/.nano_claude/memory/coding_style.md`):
+```markdown
+---
+name: coding style
+description: Python formatting preferences
+type: feedback
+created: 2026-04-02
+---
+Prefer 4-space indentation and full type hints in all Python code.
+**Why:** user explicitly stated this preference.
+**How to apply:** apply to every Python file written or edited.
+```
+
+**Example interaction:**
+
+```
+You: Remember that I prefer 4-space indentation and type hints in all Python code.
+AI: [calls MemorySave] Memory saved: coding_style [feedback/user]
+
+You: /memory
+  [feedback/user] coding_style (today): Python formatting preferences
+
+You: /memory python
+  [feedback/user] coding_style: Prefers 4-space indent and type hints in Python
+```
+
+**Staleness warnings:** Memories older than 1 day get a freshness note in `/memory` output so you know when to review or update them.
+
+**AI-ranked search:** `MemorySearch(query="...", use_ai=true)` uses the model to rank results by relevance rather than simple keyword matching.
+
+---
+
+## Skills
+
+Skills are reusable prompt templates that give the model specialized capabilities. Two built-in skills ship out of the box — no setup required.
+
+**Built-in skills:**
+
+| Trigger | Description |
+|---|---|
+| `/commit` | Review staged changes and create a well-structured git commit |
+| `/review [PR]` | Review code or PR diff with structured feedback |
+
+**Quick start — custom skill:**
+
+```bash
+mkdir -p ~/.nano_claude/skills
+```
+
+Create `~/.nano_claude/skills/deploy.md`:
+
+```markdown
+---
+name: deploy
+description: Deploy to an environment
+triggers: [/deploy]
+allowed-tools: [Bash, Read]
+when_to_use: Use when the user wants to deploy a version to an environment.
+argument-hint: [env] [version]
+arguments: [env, version]
+context: inline
+---
+
+Deploy $VERSION to the $ENV environment.
+Full args: $ARGUMENTS
+```
+
+Now use it:
+
+```
+You: /deploy staging 2.1.0
+AI: [deploys version 2.1.0 to staging]
+```
+
+**Argument substitution:**
+- `$ARGUMENTS` — the full raw argument string
+- `$ARG_NAME` — positional substitution by named argument (first word → first name)
+- Missing args become empty strings
+
+**Execution modes:**
+- `context: inline` (default) — runs inside current conversation history
+- `context: fork` — runs as an isolated sub-agent with fresh history; supports `model` override
+
+**Priority** (highest wins): project-level > user-level > built-in
+
+**List skills:** `/skills` — shows triggers, argument hint, source, and `when_to_use`
+
+**Skill search paths:**
+
+```
+./.nano_claude/skills/     # project-level (overrides user-level)
+~/.nano_claude/skills/     # user-level
+```
+
+---
+
+## Sub-Agents
+
+The model can spawn independent sub-agents to handle tasks in parallel.
+
+**Specialized agent types** — built-in:
+
+| Type | Optimized for |
+|---|---|
+| `general-purpose` | Research, exploration, multi-step tasks |
+| `coder` | Writing, reading, and modifying code |
+| `reviewer` | Security, correctness, and code quality analysis |
+| `researcher` | Web search and documentation lookup |
+| `tester` | Writing and running tests |
+
+**Basic usage:**
+```
+You: Search this codebase for all TODO comments and summarize them.
+AI: [calls Agent(prompt="...", subagent_type="researcher")]
+    Sub-agent reads files, greps for TODOs...
+    Result: Found 12 TODOs across 5 files...
+```
+
+**Background mode** — spawn without waiting, collect result later:
+```
+AI: [calls Agent(prompt="run all tests", name="test-runner", wait=false)]
+AI: [continues other work...]
+AI: [calls CheckAgentResult / SendMessage to follow up]
+```
+
+**Git worktree isolation** — agents work on an isolated branch with no conflicts:
+```
+Agent(prompt="refactor auth module", isolation="worktree")
+```
+The worktree is auto-cleaned up if no changes were made; otherwise the branch name is reported.
+
+**Custom agent types** — create `~/.nano_claude/agents/myagent.md`:
+```markdown
+---
+name: myagent
+description: Specialized for X
+model: claude-haiku-4-5-20251001
+tools: [Read, Grep, Bash]
+---
+Extra system prompt for this agent type.
+```
+
+**List running agents:** `/agents`
+
+Sub-agents have independent conversation history, share the file system, and are limited to 3 levels of nesting.
+
+---
+
+## Context Compression
+
+Long conversations are automatically compressed to stay within the model's context window.
+
+**Two layers:**
+
+1. **Snip** — Old tool outputs (file reads, bash results) are truncated after a few turns. Fast, no API cost.
+2. **Auto-compact** — When token usage exceeds 70% of the context limit, older messages are summarized by the model into a concise recap.
+
+This happens transparently. You don't need to do anything.
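
The two layers described above can be sketched roughly as follows. This is a minimal illustration, not the code in `compaction.py`: the function names, the message shape (`role`/`content` dicts), the 4-characters-per-token estimate, and the `keep_recent`/`max_chars` defaults are all assumptions made for the example. Only the 70% threshold comes from the text.

```python
def estimate_tokens(messages):
    # Crude heuristic (assumption): roughly 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4


def snip_old_tool_outputs(messages, keep_recent=4, max_chars=200):
    """Layer 1: truncate long tool outputs outside the most recent turns."""
    cutoff = len(messages) - keep_recent
    result = []
    for i, msg in enumerate(messages):
        if i < cutoff and msg["role"] == "tool" and len(msg["content"]) > max_chars:
            # Copy instead of mutating so the caller's history stays intact.
            msg = {**msg, "content": msg["content"][:max_chars] + " [snipped]"}
        result.append(msg)
    return result


def should_compact(messages, context_limit, threshold=0.70):
    """Layer 2 trigger: summarize once usage exceeds 70% of the limit."""
    return estimate_tokens(messages) > threshold * context_limit
```

Running the cheap snip pass first means the more expensive summarization call is only needed when truncation alone does not bring usage back under the threshold.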
+
+---
+
+## Diff View
+
+When the model edits or overwrites a file, you see a git-style diff:
+
+```diff
+  Changes applied to config.py:
+
+--- a/config.py
+++ b/config.py
+@@ -12,7 +12,7 @@
+     "model": "claude-opus-4-6",
+-    "max_tokens": 8192,
++    "max_tokens": 16384,
+     "permission_mode": "auto",
+```
+
+Green lines = added, red lines = removed. New file creations show a summary instead.
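
A colored diff like the one above could be produced with the standard library's `difflib`. The sketch below is illustrative only; `render_diff` and the ANSI color constants are hypothetical names, and the project's actual rendering code may differ.

```python
import difflib

GREEN, RED, RESET = "\033[32m", "\033[31m", "\033[0m"


def render_diff(old: str, new: str, path: str) -> str:
    """Return a unified diff of old vs. new with ANSI-colored +/- lines."""
    diff_lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        lineterm="",
    )
    out = []
    for i, line in enumerate(diff_lines):
        if i < 2:
            # The first two lines are the ---/+++ file headers; leave uncolored.
            out.append(line)
        elif line.startswith("+"):
            out.append(f"{GREEN}{line}{RESET}")
        elif line.startswith("-"):
            out.append(f"{RED}{line}{RESET}")
        else:
            out.append(line)
    return "\n".join(out)
```

When the two inputs are identical, `unified_diff` yields nothing and the function returns an empty string.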
+
 ---

 ## CLAUDE.md Support
@@ -637,19 +878,49 @@ Sessions are stored as JSON in `~/.nano_claude/sessions/`.

 ```
 nano_claude_code/
-├── nano_claude.py   # Entry point: REPL + slash commands + output rendering  (~580 lines)
-├── agent.py         # Agent loop: neutral message format + tool dispatch      (~160 lines)
-├── providers.py     # Multi-provider: adapters + message format conversion    (~480 lines)
-├── tools.py         # 8 tool implementations + JSON schemas                  (~360 lines)
-├── context.py       # System prompt builder: CLAUDE.md + git + cwd           (~100 lines)
-├── config.py        # Config load/save/defaults                               (~70 lines)
-├── demo.py          # Demo script (requires API key)
-├── make_demo.py     # Generates demo.gif and screenshot.png
-├── demo.gif         # Animated demo
-├── screenshot.png   # Static screenshot
-└── requirements.txt
+├── nano_claude.py        # Entry point: REPL + slash commands + diff rendering
+├── agent.py              # Agent loop: streaming, tool dispatch, compaction
+├── providers.py          # Multi-provider: Anthropic, OpenAI-compat streaming
+├── tools.py              # Core tools (Read/Write/Edit/Bash/Glob/Grep/Web) + registry wiring
+├── tool_registry.py      # Tool plugin registry: register, lookup, execute
+├── compaction.py         # Context compression: snip + auto-summarize
+├── context.py            # System prompt builder: CLAUDE.md + git + memory
+├── config.py             # Config load/save/defaults
+│
+├── multi_agent/          # Multi-agent package
+│   ├── __init__.py       # Re-exports
+│   ├── subagent.py       # AgentDefinition, SubAgentManager, worktree helpers
+│   └── tools.py          # Agent, SendMessage, CheckAgentResult, ListAgentTasks, ListAgentTypes
+├── subagent.py           # Backward-compat shim → multi_agent/
+│
+├── memory/               # Memory package
+│   ├── __init__.py       # Re-exports
+│   ├── types.py          # MEMORY_TYPES and format guidance
+│   ├── store.py          # save/load/delete/search, MEMORY.md index rebuilding
+│   ├── scan.py           # MemoryHeader, age/freshness helpers
+│   ├── context.py        # get_memory_context(), truncation, AI search
+│   └── tools.py          # MemorySave, MemoryDelete, MemorySearch, MemoryList
+├── memory.py             # Backward-compat shim → memory/
+│
+├── skill/                # Skill package
+│   ├── __init__.py       # Re-exports; imports builtin to register built-ins
+│   ├── loader.py         # SkillDef, parse, load_skills, find_skill, substitute_arguments
+│   ├── builtin.py        # Built-in skills: /commit, /review
+│   ├── executor.py       # execute_skill(): inline or forked sub-agent
+│   └── tools.py          # Skill, SkillList
+├── skills.py             # Backward-compat shim → skill/
+│
+└── tests/                # 101 unit tests
+    ├── test_memory.py
+    ├── test_skills.py
+    ├── test_subagent.py
+    ├── test_tool_registry.py
+    ├── test_compaction.py
+    └── test_diff_view.py
 ```

+> **For developers:** Each feature package (`multi_agent/`, `memory/`, `skill/`) is self-contained. Add custom tools by calling `register_tool(ToolDef(...))` from any module imported by `tools.py`.
+
 ---

 ## FAQ
--- a/nano-claude-code/Update_README.MD
+++ b/nano-claude-code/Update_README.MD
@@ -0,0 +1,526 @@
+# Nano Claude Code — Update Notes
+
+This document describes three major feature additions to nano-claude-code:
+**Multi-Agent**, **Memory**, and **Skill**. Each feature is organized as a
+self-contained Python package, follows the same architectural pattern, and
+includes a backward-compatibility shim so existing code continues to work.
+
+---
+
+## Architecture Overview
+
+All three packages follow the same pattern:
+
+```
+feature/
+  __init__.py   — public re-exports
+  <core>.py     — data model, loading, business logic
+  tools.py      — registers tools into the central tool_registry
+  ...
+feature.py      — backward-compat shim (re-exports from feature/)
+```
+
+The **tool registry** (`tool_registry.py`) is the central hub. Each feature's
+`tools.py` calls `register_tool(ToolDef(...))` at import time. The top-level
+`tools.py` imports all three feature tool modules, triggering auto-registration.
+
+The **agent loop** (`agent.py`) injects `_depth` and `_system_prompt` into the
+`config` dict on every call, so tool functions can read them via `config.get(...)`.
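
The registration pattern described above can be sketched as follows. Only `register_tool` and `ToolDef` are named in these notes; the fields of `ToolDef`, the `execute_tool` helper, and the `Echo` tool are assumptions made for illustration and may not match `tool_registry.py`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolDef:
    # Field names are hypothetical; the real ToolDef may carry a JSON schema etc.
    name: str
    description: str
    func: Callable[[dict, dict], str]


_REGISTRY: dict[str, ToolDef] = {}


def register_tool(tool: ToolDef) -> None:
    """Add (or replace) a tool under its name."""
    _REGISTRY[tool.name] = tool


def execute_tool(name: str, inputs: dict, config: dict) -> str:
    """Look up a tool and call it with the model's inputs plus the config dict."""
    tool = _REGISTRY.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"
    return tool.func(inputs, config)


def _echo(inputs: dict, config: dict) -> str:
    # Tools read injected values such as _depth via config.get(...).
    return f"{inputs['text']} (depth={config.get('_depth', 0)})"


# Registration happens at import time, as a feature's tools.py would do.
register_tool(ToolDef("Echo", "Echo text with the current nesting depth", _echo))
```

Because registration is a side effect of importing the module, the top-level `tools.py` only needs to import each feature's tool module to make its tools available.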
+
+---
+
+## 1. Multi-Agent (`multi_agent/`)
+
+### What it does
+
+Allows Claude to spawn sub-agents — nested agent loops that run concurrently
+in background threads. Sub-agents can share the parent's context or run in an
+isolated git worktree. The user can send follow-up messages to named background
+agents and retrieve their results.
+
+### Package structure
+
+```
+multi_agent/
+  __init__.py       — re-exports AgentDefinition, SubAgentTask, SubAgentManager, etc.
+  subagent.py       — core: AgentDefinition, SubAgentTask, SubAgentManager, worktree helpers
+  tools.py          — registers: Agent, SendMessage, CheckAgentResult, ListAgentTasks, ListAgentTypes
+subagent.py         — backward-compat shim
+```
+
+### Key classes and functions
+
+**`AgentDefinition`** (`multi_agent/subagent.py`)
+```python
+@dataclass
+class AgentDefinition:
+    name: str
+    description: str
+    system_prompt: str   # prepended to base prompt for this agent type
+    model: str           # "" = inherit from parent
+    tools: list          # [] = all tools
+    source: str          # "built-in" | "user" | "project"
+```
+
+**Built-in agent types**: `general-purpose`, `coder`, `reviewer`, `researcher`, `tester`
+
+**Custom agent definitions** — place a `.md` file with YAML frontmatter in:
+- `~/.nano_claude/agents/<name>.md` (user-level)
+- `.nano_claude/agents/<name>.md` (project-level, takes priority)
+
+Frontmatter format:
+```markdown
+---
+name: my-agent
+description: What this agent does
+model: claude-opus-4-6
+tools: [Read, Glob, Grep]
+---
+Extra system prompt instructions for this agent.
+```
+
+**`SubAgentManager`** (`multi_agent/subagent.py`)
+- `spawn(prompt, config, agent_def, isolation, name, wait)` — runs agent in thread pool
+- `send_message(task_id_or_name, message)` — enqueues message to a running background agent
+- `get_result(task_id)` — returns final text or status
+- `list_tasks()` — returns all SubAgentTask objects
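
As a rough picture of how background spawning and result polling might fit together, here is a minimal sketch built on `concurrent.futures`. `MiniAgentManager` and its `run_fn` callback are hypothetical stand-ins; the real `SubAgentManager` also handles agent definitions, worktree isolation, names, and message queues, none of which are modeled here.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor


class MiniAgentManager:
    """Toy manager: runs a callable per task in a thread pool."""

    def __init__(self, run_fn, max_workers: int = 4):
        self._run_fn = run_fn  # stands in for a nested agent loop
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._tasks: dict[str, Future] = {}

    def spawn(self, prompt: str, wait: bool = True) -> str:
        task_id = uuid.uuid4().hex[:8]
        future = self._pool.submit(self._run_fn, prompt)
        self._tasks[task_id] = future
        if wait:
            future.result()  # block until the sub-agent finishes
        return task_id

    def get_result(self, task_id: str) -> str:
        future = self._tasks.get(task_id)
        if future is None:
            return "unknown task"
        if not future.done():
            return "running"
        return future.result()

    def list_tasks(self) -> list[str]:
        return list(self._tasks)
```

With `wait=False` the caller gets the task id immediately and can poll `get_result` later, which mirrors the background mode described for the `Agent` tool.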
+
+**Git worktree isolation**:
+When `isolation="worktree"` is passed to `Agent`, a temporary git worktree is
+created on a fresh branch. The sub-agent works in isolation; if it makes no
+changes the worktree is cleaned up automatically.
+
+### Tools registered
+
+| Tool | Description |
+|------|-------------|
+| `Agent` | Spawn a sub-agent (sync or background with `wait=false`) |
+| `SendMessage` | Send a follow-up message to a named background agent |
+| `CheckAgentResult` | Poll status / retrieve result of a background agent |
+| `ListAgentTasks` | List all active and finished sub-agent tasks |
+| `ListAgentTypes` | List all available agent type definitions |
+
+### Agent tool parameters
+
+```python
+Agent(
+    prompt="...",           # required — task description
+    subagent_type="coder",  # optional — use a specialized agent
+    isolation="worktree",   # optional — isolated git branch
+    name="my-agent",        # optional — name for SendMessage later
+    wait=False,             # optional — run in background
+    model="...",            # optional — model override
+)
+```
+
+### How it was wired in
+
+1. `multi_agent/subagent.py` uses **absolute imports** (`import agent as _agent_mod`)
+   because the project root is in `sys.path` when running from that directory.
+2. `agent.py` was updated to inject `_system_prompt` into `config`:
+   ```python
+   config = {**config, "_depth": depth, "_system_prompt": system_prompt}
+   ```
+3. `tools.py` (top-level) was updated to pass `config` through to the registry:
+   ```python
+   return _registry_execute(name, inputs, cfg)
+   ```
+   and at the bottom:
+   ```python
+   import multi_agent.tools as _multiagent_tools
+   ```
+4. `context.py` system prompt template lists Agent, SendMessage, etc. under
+   `## Multi-Agent`.
+5. `nano_claude.py` `/agents` command calls `get_agent_manager().list_tasks()`
+   and prints status/worktree info. A `_print_background_notifications()` function
+   checks for newly completed background agents before each user prompt.
+
+### Files changed
+
+| File | Change |
+|------|--------|
+| `multi_agent/__init__.py` | Created (re-exports) |
+| `multi_agent/subagent.py` | Created (moved + enhanced from `subagent.py`) |
+| `multi_agent/tools.py` | Created (tool registrations) |
+| `subagent.py` | Converted to backward-compat shim |
+| `agent.py` | Inject `_system_prompt` into config |
+| `tools.py` | Pass config to registry; import `multi_agent.tools` |
+| `context.py` | Add Multi-Agent section to system prompt |
+| `nano_claude.py` | `/agents` command; background notification; `_tool_desc()` |
+| `tests/test_subagent.py` | Update imports to `multi_agent.subagent` |
+
+---
+
+## 2. Memory (`memory/`)
+
+### What it does
+
+Provides persistent, file-based memory across sessions. Memories are stored as
+markdown files with YAML frontmatter. There are two scopes — **user** (global,
+`~/.nano_claude/memory/`) and **project** (per-repo, `.nano_claude/memory/`).
+A `MEMORY.md` index is auto-rebuilt after every save/delete and injected into
+the system prompt so Claude knows what memories exist.
+
+### Package structure
+
+```
+memory/
+  __init__.py   — re-exports all public symbols
+  types.py      — MEMORY_TYPES, type descriptions, format guidance
+  store.py      — MemoryEntry, save/load/delete/search, index rebuilding
+  scan.py       — MemoryHeader, scan_memory_dir, age/freshness helpers
+  context.py    — get_memory_context(), find_relevant_memories(), truncation
+  tools.py      — registers: MemorySave, MemoryDelete, MemorySearch, MemoryList
+memory.py       — backward-compat shim
+```
+
+### Memory types
+
+Defined in `memory/types.py`, mirrors the four types from Claude Code:
+
+| Type | Purpose |
+|------|---------|
+| `user` | User's role, goals, preferences |
+| `feedback` | Corrections and confirmed approaches |
+| `project` | Ongoing work, decisions, deadlines |
+| `reference` | Pointers to external resources |
+
+### Storage layout
+
+```
+~/.nano_claude/memory/
+  MEMORY.md          ← auto-generated index (<=200 lines, <=25 KB)
+  my_note.md
+  feedback_testing.md
+  ...
+
+.nano_claude/memory/   ← project-local (relative to cwd)
+  MEMORY.md
+  ...
+```
+
+Each memory file format:
+```markdown
+---
+name: My Note
+description: one-line description for relevance decisions
+type: user
+created: 2026-04-02
+---
+
+Memory content goes here.
+**Why:** ...
+**How to apply:** ...
+```
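
A file in this format can be split into metadata and body with a few lines of standard-library Python. The sketch below is an assumption-laden illustration: `parse_memory_file` is a hypothetical helper, it only handles flat `key: value` pairs, and the project may well use a proper YAML parser instead.

```python
def parse_memory_file(text: str) -> tuple[dict, str]:
    """Split '---'-delimited frontmatter from the body of a memory file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter at all
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[i + 1:]).strip()
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            # Split on the first colon only, so values may contain colons.
            meta[key.strip()] = value.strip()
    return {}, text  # opening delimiter without a closing one
```

The same shape applies to the skill and agent definition files, which also use `---`-delimited frontmatter followed by a free-text body.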
+
+### Key API
+
+**`memory/store.py`**
+```python
+save_memory(entry: MemoryEntry, scope="user")   # save or update (same name = update)
+delete_memory(name: str, scope="user")           # remove entry + rebuild index
+load_entries(scope="user") -> list[MemoryEntry]  # load all entries for scope
+load_index(scope="all") -> list[MemoryEntry]     # "all" merges user + project
+search_memory(query: str, scope="all") -> list   # keyword search across content+name
+get_index_content(scope="all") -> str            # raw MEMORY.md text
+```
+
+**`memory/scan.py`**
+```python
+scan_memory_dir(mem_dir, scope) -> list[MemoryHeader]  # newest-first, capped at 200
+scan_all_memories() -> list[MemoryHeader]              # user + project merged
+memory_age_str(mtime_s) -> str          # "today" | "yesterday" | "N days ago"
+memory_freshness_text(mtime_s) -> str   # staleness warning for memories >1 day old
+format_memory_manifest(headers) -> str  # formatted list for display
+```
+
+**`memory/context.py`**
+```python
+get_memory_context() -> str             # injected into system prompt
+truncate_index_content(raw) -> str      # enforces <=200 lines / <=25 KB
+find_relevant_memories(query, max_results=5, use_ai=False, config=None)
+```
+
+`find_relevant_memories` supports optional AI ranking: when `use_ai=True` it
+makes a small API call to rank candidates by relevance to the query.
+
+### Tools registered
+
+| Tool | Parameters | Description |
+|------|-----------|-------------|
+| `MemorySave` | `name, description, type, content, scope` | Save or update a memory |
+| `MemoryDelete` | `name, scope` | Delete a memory by name |
+| `MemorySearch` | `query, scope, use_ai, max_results` | Search by keyword (or AI) |
+| `MemoryList` | `scope` | List all memories with age and metadata |
+
+### Index truncation
+
+The `MEMORY.md` index is truncated before being injected into the system prompt:
+- Hard limit: **200 lines** (mirrors Claude Code's limit)
+- Byte limit: **25 000 bytes** (mirrors Claude Code's limit)
+- A `WARNING:` line is appended when either limit is hit
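
The limits listed above could be enforced along these lines. This is a sketch under stated assumptions: the wording of the warning and the order of the two checks are guesses, and only the function name `truncate_index_content` and the two limits come from these notes.

```python
MAX_LINES = 200
MAX_BYTES = 25_000


def truncate_index_content(raw: str) -> str:
    """Cap the index at 200 lines and 25 000 bytes, flagging any truncation."""
    lines = raw.splitlines()
    truncated = len(lines) > MAX_LINES
    text = "\n".join(lines[:MAX_LINES])

    data = text.encode("utf-8")
    if len(data) > MAX_BYTES:
        truncated = True
        # errors="ignore" drops a multi-byte character cut in half at the limit.
        text = data[:MAX_BYTES].decode("utf-8", errors="ignore")

    if truncated:
        text += "\nWARNING: MEMORY.md index truncated (limits: 200 lines / 25000 bytes)"
    return text
```

Measuring the byte limit on the UTF-8 encoding rather than on the character count keeps the cap meaningful for non-ASCII memory names.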
+
+### How it was wired in
+
+1. `memory/store.py` exports `USER_MEMORY_DIR` and `get_project_memory_dir` as
+   module-level names so tests can monkeypatch them cleanly.
+2. `context.py` (system prompt builder) calls `get_memory_context()` at the end
+   of `build_system_prompt()` and appends the result.
+3. `tools.py` (top-level) adds:
+   ```python
+   import memory.tools as _memory_tools
+   ```
+4. `memory.py` (top-level) is now a shim:
+   ```python
+   from memory.store import MemoryEntry, save_memory, ...
+   from memory.context import get_memory_context
+   ```
+5. `nano_claude.py` `/memory` command uses `scan_all_memories()` to display a
+   mtime-sorted list with freshness warnings.
+
+### Files changed
+
+| File | Change |
+|------|--------|
+| `memory/__init__.py` | Created (re-exports) |
+| `memory/types.py` | Created (MEMORY_TYPES, descriptions, format guidance) |
+| `memory/store.py` | Created (replaced top-level `memory.py` logic) |
+| `memory/scan.py` | Created (MemoryHeader, age/freshness, manifest) |
+| `memory/context.py` | Created (context injection, truncation, AI search) |
+| `memory/tools.py` | Created (MemorySave, MemoryDelete, MemorySearch, MemoryList) |
+| `memory.py` | Converted to backward-compat shim |
+| `tools.py` | Import `memory.tools` |
+| `context.py` | Call `get_memory_context()` in `build_system_prompt()` |
+| `nano_claude.py` | `/memory` command uses `scan_all_memories()` |
+| `tests/test_memory.py` | Completely rewritten (the full test suite now totals 101 tests) |
+
+---
+
+## 3. Skill (`skill/`)
+
+### What it does
+
+Skills are reusable prompt templates stored as markdown files. A user types
+`/commit` or `/review pr-123` in the REPL and the skill's prompt (with
+arguments substituted) is injected into the conversation. Skills can run
+**inline** (current conversation context) or **forked** (isolated sub-agent).
+Two built-in skills (`/commit`, `/review`) are registered programmatically.
+
+### Package structure
+
+```
+skill/
+  __init__.py   — re-exports all public symbols; imports builtin to register them
+  loader.py     — SkillDef dataclass, file parsing, load_skills, find_skill, substitute_arguments
+  builtin.py    — built-in skills: /commit, /review
+  executor.py   — execute_skill() (inline or forked)
+  tools.py      — registers: Skill, SkillList
+skills.py       — backward-compat shim
+```
+
+### Skill file format
+
+Place `.md` files in:
+- `~/.nano_claude/skills/<name>.md` (user-level)
+- `.nano_claude/skills/<name>.md` (project-level, takes priority)
+
+```markdown
+---
+name: deploy
+description: Deploy to an environment
+triggers: [/deploy]
+allowed-tools: [Bash, Read]
+when_to_use: Use when the user wants to deploy. Examples: '/deploy staging v1.2'
+argument-hint: [env] [version]
+arguments: [env, version]
+context: inline
+---
+
+Deploy $VERSION to $ENV.
+
+Full args provided: $ARGUMENTS
+```
+
+### Frontmatter fields
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `name` | required | Skill identifier |
+| `description` | `""` | One-line description shown in `/skills` |
+| `triggers` | `[/<name>]` | Slash commands or phrases that activate this skill |
+| `allowed-tools` / `tools` | `[]` | Tools the skill is allowed to use |
+| `when_to_use` | `""` | Guidance for when the model should auto-invoke the skill |
+| `argument-hint` | `""` | Hint shown in `/skills` list, e.g. `[branch] [desc]` |
+| `arguments` | `[]` | Named argument list for `$ARG_NAME` substitution |
+| `model` | `""` | Model override (fork context only) |
+| `user-invocable` | `true` | Show in `/skills` list |
+| `context` | `inline` | `inline` = current conversation, `fork` = isolated sub-agent |
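To illustrate how these defaults might be applied, here is a rough sketch of a dependency-free frontmatter parser. `parse_frontmatter` and `with_defaults` are hypothetical helpers, not the functions in `skill/loader.py`, and only single-line `key: value` and `key: [a, b]` forms are handled.

```python
def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split '---' delimited frontmatter from the body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:]).strip()
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not key: value pairs
        value = value.strip()
        # Treat a single bracket pair as a list; "[env] [version]" stays a string
        if value.startswith("[") and value.endswith("]") and value.count("[") == 1:
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return {}, text  # no closing delimiter: treat everything as body


def with_defaults(meta: dict) -> dict:
    """Fill in a few of the defaults listed in the table above."""
    name = meta["name"]
    return {
        "name": name,
        "description": meta.get("description", ""),
        "triggers": meta.get("triggers", [f"/{name}"]),
        "tools": meta.get("allowed-tools", meta.get("tools", [])),
        "context": meta.get("context", "inline"),
    }
```

A real loader would more likely use a YAML library; the hand-rolled version is only meant to make the field handling visible.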
+
+### Argument substitution
+
+`substitute_arguments(prompt, args, arg_names)` in `skill/loader.py`:
+
+- `$ARGUMENTS` → the full raw args string
+- `$ARG_NAME` → each name in `arguments`, upper-cased, is filled positionally (first word of args → first name, and so on)
+- Missing args become empty strings
+
+```
+prompt:    "Deploy $VERSION to $ENV. Full: $ARGUMENTS"
+args:      "staging 1.0"
+arg_names: ["env", "version"]
+
+result:    "Deploy 1.0 to staging. Full: staging 1.0"
+```
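The rules above can be sketched in a few lines. This is an illustrative stand-in, assuming placeholders are upper-cased argument names and that unknown `$NAMES` are left untouched; `substitute_arguments_sketch` is a hypothetical name, and the real implementation in `skill/loader.py` may behave differently at the edges.

```python
import re


def substitute_arguments_sketch(prompt: str, args: str, arg_names: list) -> str:
    """Replace $ARGUMENTS and positional $NAME placeholders in one pass."""
    words = args.split()
    values = {"ARGUMENTS": args}
    for index, name in enumerate(arg_names):
        # Missing positional args become empty strings
        values[name.upper()] = words[index] if index < len(words) else ""

    def replace(match):
        # Leave placeholders we do not know about exactly as written
        return values.get(match.group(1), match.group(0))

    return re.sub(r"\$([A-Z][A-Z0-9_]*)", replace, prompt)


print(substitute_arguments_sketch(
    "Deploy $VERSION to $ENV. Full: $ARGUMENTS", "staging 1.0", ["env", "version"]
))
# Deploy 1.0 to staging. Full: staging 1.0
```

Doing the replacement in a single regex pass avoids one substitution accidentally rewriting text that an earlier substitution inserted.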
+
+### Execution modes
+
+**Inline** (`context: inline`, default):
+- Skill prompt is injected into the current `AgentState`
+- History is shared — the user can see and continue the conversation
+
+**Fork** (`context: fork`):
+- A fresh `AgentState` is created (no shared history)
+- Optional `model` and `allowed-tools` overrides are applied
+- Good for self-contained tasks that don't need mid-process user input
+
+### Built-in skills
+
+Defined in `skill/builtin.py` and registered via `register_builtin_skill()`:
+
+| Trigger | Name | Description |
+|---------|------|-------------|
+| `/commit` | commit | Review staged changes and create a well-structured git commit |
+| `/review`, `/review-pr` | review | Review code or PR diff with structured feedback |
+
+User-level and project-level skill files with the same name override built-ins.
+
+### Tools registered
+
+| Tool | Parameters | Description |
+|------|-----------|-------------|
+| `Skill` | `name, args` | Invoke a skill by name from inside a conversation |
+| `SkillList` | — | List all available skills with triggers and metadata |
+
+### Priority order
+
+When multiple skill sources define the same name, the highest priority wins:
+
+```
+builtin  <  user (~/.nano_claude/skills/)  <  project (.nano_claude/skills/)
+```
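One way to get this ordering is to merge the sources from lowest to highest priority so that later entries overwrite earlier ones. `merge_skill_sources` is a hypothetical helper for illustration; the repo's `load_skills` may resolve collisions differently.

```python
def merge_skill_sources(builtin: dict, user: dict, project: dict) -> dict:
    """Map skill name -> (source_label, prompt), highest priority winning."""
    merged = {}
    ordered = (("builtin", builtin), ("user", user), ("project", project))
    for source_label, skills in ordered:
        for name, prompt in skills.items():
            merged[name] = (source_label, prompt)  # later sources overwrite earlier ones
    return merged


skills = merge_skill_sources(
    builtin={"commit": "built-in commit prompt", "review": "built-in review prompt"},
    user={"commit": "my commit prompt"},
    project={"deploy": "project deploy prompt"},
)
print(skills["commit"][0])  # user
print(skills["review"][0])  # builtin
```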
+
+### REPL usage
+
+```
+/commit                          # run built-in commit skill
+/review 123                      # review PR #123 (args = "123")
+/deploy staging 2.1.0            # custom skill with named args
+/skills                          # list all skills
+```
+
+The `/skills` command output includes source label, triggers, argument hint,
+and the first 80 chars of `when_to_use` per skill.
+
+### How it was wired in
+
+1. `skill/__init__.py` imports `skill.builtin` which calls `register_builtin_skill()`
+   for each built-in — just importing the package registers them.
+2. `tools.py` (top-level) adds:
+   ```python
+   import skill.tools as _skill_tools
+   ```
+3. `skills.py` (top-level) becomes a shim re-exporting from `skill/`.
+4. `context.py` adds a `## Skills` section listing `Skill` and `SkillList`.
+5. `nano_claude.py`:
+   - `cmd_skills` imports from `skill`, shows `when_to_use` and source label
+   - `handle_slash` imports `find_skill` from `skill`; returns `(skill, args)` tuple
+   - REPL loop calls `substitute_arguments` before building the injected message
+
+### Files changed
+
+| File | Change |
+|------|--------|
+| `skill/__init__.py` | Created (re-exports; imports builtin) |
+| `skill/loader.py` | Created (SkillDef, parse, load, find, substitute) |
+| `skill/builtin.py` | Created (/commit, /review built-ins) |
+| `skill/executor.py` | Created (inline + fork execution) |
+| `skill/tools.py` | Created (Skill, SkillList tool registration) |
+| `skills.py` | Converted to backward-compat shim |
+| `tools.py` | Import `skill.tools` |
+| `context.py` | Add Skills section to system prompt |
+| `nano_claude.py` | `cmd_skills`, `handle_slash`, REPL loop updated |
+| `tests/test_skills.py` | Rewritten (22 tests; patches `skill.loader`) |
+
+---
+
+## How to add custom agents, memories, and skills
+
+### Custom agent type
+
+Create `~/.nano_claude/agents/myagent.md`:
+```markdown
+---
+name: myagent
+description: Does specialized work
+model: claude-haiku-4-5-20251001
+tools: [Read, Grep, Bash]
+---
+You are specialized in X. Focus on Y. Never do Z.
+```
+
+Then use: `Agent(prompt="...", subagent_type="myagent")`
+
+### Custom memory
+
+Ask the model to call the `MemorySave` tool, or write a file directly to
+`~/.nano_claude/memory/my_note.md` with frontmatter:
+```markdown
+---
+name: my note
+description: short description
+type: feedback
+created: 2026-04-02
+---
+Memory content here.
+```
+
+### Custom skill
+
+Create `~/.nano_claude/skills/myskill.md` (user-level) or
+`.nano_claude/skills/myskill.md` (project-level):
+```markdown
+---
+name: myskill
+description: Does something useful
+triggers: [/myskill]
+arguments: [target]
+argument-hint: [target]
+when_to_use: Use when the user wants to do X with a target.
+---
+
+Do something useful with $TARGET.
+
+Full context: $ARGUMENTS
+```
+
+Then invoke with `/myskill some-target`.
+
+---
+
+## Running tests
+
+```bash
+cd nano-claude-code
+
+# All tests
+python -m pytest tests/ -v
+
+# Per-feature
+python -m pytest tests/test_subagent.py -v   # multi-agent
+python -m pytest tests/test_memory.py   -v   # memory
+python -m pytest tests/test_skills.py   -v   # skills
+```
+
+Total: **101 tests**, all passing. Each feature's tests use `monkeypatch` to
+redirect file system paths to `tmp_path` so no real `~/.nano_claude/`
+directories are touched during testing.
--- a/nano-claude-code/agent.py
+++ b/nano-claude-code/agent.py
@@ -5,8 +5,11 @@ import uuid
 from dataclasses import dataclass, field
 from typing import Generator

-from tools import TOOL_SCHEMAS, execute_tool
+from tool_registry import get_tool_schemas
+from tools import execute_tool
+import tools as _tools_init  # ensure built-in tools are registered on import
 from providers import stream, AssistantTurn, TextChunk, ThinkingChunk, detect_provider
+from compaction import maybe_compact

 # ── Re-export event types (used by nano_claude.py) ────────────────────────
 __all__ = [
@@ -54,25 +57,39 @@ def run(
    state: AgentState,
    config: dict,
    system_prompt: str,
+    depth: int = 0,
+    cancel_check=None,
 ) -> Generator:
    """
    Multi-turn agent loop (generator).
    Yields: TextChunk | ThinkingChunk | ToolStart | ToolEnd |
            PermissionRequest | TurnDone
+
+    Args:
+        depth: sub-agent nesting depth, 0 for top-level
+        cancel_check: callable returning True to abort the loop early
    """
    # Append user turn in neutral format
    state.messages.append({"role": "user", "content": user_message})

+    # Inject runtime metadata into config so tools (e.g. Agent) can access it
+    config = {**config, "_depth": depth, "_system_prompt": system_prompt}
+
    while True:
+        if cancel_check and cancel_check():
+            return
        state.turn_count += 1
        assistant_turn: AssistantTurn | None = None

+        # Compact context if approaching window limit
+        maybe_compact(state, config)
+
        # Stream from provider (auto-detected from model name)
        for event in stream(
            model=config["model"],
            system=system_prompt,
            messages=state.messages,
-            tool_schemas=TOOL_SCHEMAS,
+            tool_schemas=get_tool_schemas(),
            config=config,
        ):
            if isinstance(event, (TextChunk, ThinkingChunk)):
@@ -114,6 +131,7 @@ def run(
                result = execute_tool(
                    tc["name"], tc["input"],
                    permission_mode="accept-all",  # already gate-checked above
+                    config=config,
                )

            yield ToolEnd(tc["name"], result, permitted)
--- a/nano-claude-code/compaction.py
+++ b/nano-claude-code/compaction.py
@@ -0,0 +1,196 @@
+"""Context window management: two-layer compression for long conversations."""
+from __future__ import annotations
+
+import providers
+
+
+# ── Token estimation ──────────────────────────────────────────────────────
+
+def estimate_tokens(messages: list) -> int:
+    """Estimate token count by summing content lengths / 3.5.
+
+    Args:
+        messages: list of message dicts with "content" field (str or list of dicts)
+    Returns:
+        approximate token count, int
+    """
+    total_chars = 0
+    for m in messages:
+        content = m.get("content", "")
+        if isinstance(content, str):
+            total_chars += len(content)
+        elif isinstance(content, list):
+            for block in content:
+                if isinstance(block, dict):
+                    # Sum all string values in the block
+                    for v in block.values():
+                        if isinstance(v, str):
+                            total_chars += len(v)
+        # Also count tool_calls if present
+        for tc in m.get("tool_calls", []):
+            if isinstance(tc, dict):
+                for v in tc.values():
+                    if isinstance(v, str):
+                        total_chars += len(v)
+    return int(total_chars / 3.5)
+
+
+def get_context_limit(model: str) -> int:
+    """Look up context window size for a model.
+
+    Args:
+        model: model string (e.g. "claude-opus-4-6", "ollama/llama3.3")
+    Returns:
+        context limit in tokens
+    """
+    provider_name = providers.detect_provider(model)
+    prov = providers.PROVIDERS.get(provider_name, {})
+    return prov.get("context_limit", 128000)
+
+
+# ── Layer 1: Snip old tool results ────────────────────────────────────────
+
+def snip_old_tool_results(
+    messages: list,
+    max_chars: int = 2000,
+    preserve_last_n_turns: int = 6,
+) -> list:
+    """Truncate tool-role messages older than preserve_last_n_turns from end.
+
+    For old tool messages whose content exceeds max_chars, keep the first
+    max_chars // 2 and the last max_chars // 4 characters, inserting
+    '[... N chars snipped ...]' in between.
+    Mutates in place and returns the same list.
+
+    Args:
+        messages: list of message dicts (mutated in place)
+        max_chars: maximum character length before truncation
+        preserve_last_n_turns: number of messages from end to preserve
+    Returns:
+        the same messages list (mutated)
+    """
+    cutoff = max(0, len(messages) - preserve_last_n_turns)
+    for i in range(cutoff):
+        m = messages[i]
+        if m.get("role") != "tool":
+            continue
+        content = m.get("content", "")
+        if not isinstance(content, str) or len(content) <= max_chars:
+            continue
+        first_half = content[: max_chars // 2]
+        last_quarter = content[-(max_chars // 4):]
+        snipped = len(content) - len(first_half) - len(last_quarter)
+        m["content"] = f"{first_half}\n[... {snipped} chars snipped ...]\n{last_quarter}"
+    return messages
+
+
+# ── Layer 2: Auto-compact ─────────────────────────────────────────────────
+
+def find_split_point(messages: list, keep_ratio: float = 0.3) -> int:
+    """Find index that splits messages so ~keep_ratio of tokens are in the recent portion.
+
+    Walks backwards from end, accumulating token estimates, and returns the
+    index where the recent portion reaches ~keep_ratio of total tokens.
+
+    Args:
+        messages: list of message dicts
+        keep_ratio: fraction of tokens to keep in the recent portion
+    Returns:
+        split index (messages[:idx] = old, messages[idx:] = recent)
+    """
+    total = estimate_tokens(messages)
+    target = int(total * keep_ratio)
+    running = 0
+    for i in range(len(messages) - 1, -1, -1):
+        running += estimate_tokens([messages[i]])
+        if running >= target:
+            return i
+    return 0
+
+
+def compact_messages(messages: list, config: dict) -> list:
+    """Compress old messages into a summary via LLM call.
+
+    Splits at find_split_point, summarizes old portion, returns
+    [summary_msg, ack_msg, *recent_messages].
+
+    Args:
+        messages: full message list
+        config: agent config dict (must contain "model")
+    Returns:
+        new compacted message list
+    """
+    split = find_split_point(messages)
+    if split <= 0:
+        return messages
+
+    old = messages[:split]
+    recent = messages[split:]
+
+    # Build summary request
+    old_text = ""
+    for m in old:
+        role = m.get("role", "?")
+        content = m.get("content", "")
+        if isinstance(content, str):
+            old_text += f"[{role}]: {content[:500]}\n"
+        elif isinstance(content, list):
+            old_text += f"[{role}]: (structured content)\n"
+
+    summary_prompt = (
+        "Summarize the following conversation history concisely. "
+        "Preserve key decisions, file paths, tool results, and context "
+        "needed to continue the conversation:\n\n" + old_text
+    )
+
+    # Call LLM for summary
+    summary_text = ""
+    for event in providers.stream(
+        model=config["model"],
+        system="You are a concise summarizer.",
+        messages=[{"role": "user", "content": summary_prompt}],
+        tool_schemas=[],
+        config=config,
+    ):
+        if isinstance(event, providers.TextChunk):
+            summary_text += event.text
+
+    summary_msg = {
+        "role": "user",
+        "content": f"[Previous conversation summary]\n{summary_text}",
+    }
+    ack_msg = {
+        "role": "assistant",
+        "content": "Understood. I have the context from the previous conversation. Let's continue.",
+    }
+    return [summary_msg, ack_msg, *recent]
+
+
+# ── Main entry ────────────────────────────────────────────────────────────
+
+def maybe_compact(state, config: dict) -> bool:
+    """Check if context window is getting full and compress if needed.
+
+    Runs snip_old_tool_results first, then auto-compact if still over threshold.
+
+    Args:
+        state: AgentState with .messages list
+        config: agent config dict (must contain "model")
+    Returns:
+        True if the threshold was exceeded and at least the snip pass ran
+    """
+    model = config.get("model", "")
+    limit = get_context_limit(model)
+    threshold = limit * 0.7
+
+    if estimate_tokens(state.messages) <= threshold:
+        return False
+
+    # Layer 1: snip old tool results
+    snip_old_tool_results(state.messages)
+
+    if estimate_tokens(state.messages) <= threshold:
+        return True
+
+    # Layer 2: auto-compact
+    state.messages = compact_messages(state.messages, config)
+    return True
--- a/nano-claude-code/config.py
+++ b/nano-claude-code/config.py
@@ -16,6 +16,9 @@ DEFAULTS = {
    "thinking":         False,
    "thinking_budget":  10000,
    "custom_base_url":  "",       # for "custom" provider
+    "max_tool_output":  32000,
+    "max_agent_depth":  3,
+    "max_concurrent_agents": 3,
    # Per-provider API keys (optional; env vars take priority)
    # "anthropic_api_key": "sk-ant-..."
    # "openai_api_key":    "sk-..."
--- a/nano-claude-code/context.py
+++ b/nano-claude-code/context.py
@@ -4,11 +4,15 @@ import subprocess
 from pathlib import Path
 from datetime import datetime

+from memory import get_memory_context
+
 SYSTEM_PROMPT_TEMPLATE = """\
-You are Nano Claude Code, Created by SAIL Lab (Safe AI and Robot Learning Lab), an AI coding assistant running in the terminal.
+You are Nano Claude Code, Created by SAIL Lab (Safe AI and Robot Learning Lab at UC Berkeley), an AI coding assistant running in the terminal.
 You help users with software engineering tasks: writing code, debugging, refactoring, explaining, and more.

 # Available Tools
+
+## File & Shell
 - **Read**: Read file contents with line numbers
 - **Write**: Create or overwrite files
 - **Edit**: Replace text in a file (exact string replacement)
@@ -18,6 +22,27 @@ You help users with software engineering tasks: writing code, debugging, refacto
 - **WebFetch**: Fetch and extract content from a URL
 - **WebSearch**: Search the web via DuckDuckGo

+## Multi-Agent
+- **Agent**: Spawn a sub-agent to handle a task autonomously. Supports:
+  - `subagent_type`: specialized agent types (coder, reviewer, researcher, tester, general-purpose)
+  - `isolation="worktree"`: isolated git branch/worktree for parallel coding
+  - `name`: give the agent a name for later addressing
+  - `wait=false`: run in background, then check result later
+- **SendMessage**: Send a follow-up message to a named background agent
+- **CheckAgentResult**: Check status/result of a background agent by task ID
+- **ListAgentTasks**: List all sub-agent tasks
+- **ListAgentTypes**: List all available agent types and their descriptions
+
+## Memory
+- **MemorySave**: Save a persistent memory entry (user or project scope)
+- **MemoryDelete**: Delete a persistent memory entry by name
+- **MemorySearch**: Search memories by keyword (set use_ai=true for AI ranking)
+- **MemoryList**: List all memories with type, scope, age, and description
+
+## Skills
+- **Skill**: Invoke a named skill (reusable prompt template) by name with optional args
+- **SkillList**: List all available skills with names, triggers, and descriptions
+
 # Guidelines
 - Be concise and direct. Lead with the answer.
 - Prefer editing existing files over creating new ones.
@@ -27,6 +52,12 @@ You help users with software engineering tasks: writing code, debugging, refacto
 - For multi-step tasks, work through them systematically.
 - If a task is unclear, ask for clarification before proceeding.

+## Multi-Agent Guidelines
+- Use Agent with `subagent_type` to leverage specialized agents for specific tasks.
+- Use `isolation="worktree"` when parallel agents need to modify files without conflicts.
+- Use `wait=false` + `name=...` to run multiple agents in parallel, then collect results.
+- Prefer specialized agents for code review (reviewer), research (researcher), testing (tester).
+
 # Environment
 - Current date: {date}
 - Working directory: {cwd}
@@ -91,10 +122,14 @@ def get_claude_md() -> str:

 def build_system_prompt() -> str:
    import platform
-    return SYSTEM_PROMPT_TEMPLATE.format(
+    prompt = SYSTEM_PROMPT_TEMPLATE.format(
        date=datetime.now().strftime("%Y-%m-%d %A"),
        cwd=str(Path.cwd()),
        platform=platform.system(),
        git_info=get_git_info(),
        claude_md=get_claude_md(),
    )
+    memory_ctx = get_memory_context()
+    if memory_ctx:
+        prompt += f"\n\n# Memory\nYour persistent memories:\n{memory_ctx}\n"
+    return prompt
--- a/nano-claude-code/docs/architecture.md
+++ b/nano-claude-code/docs/architecture.md
@@ -0,0 +1,374 @@
+# Architecture Guide
+
+This document is for developers who want to understand, modify, or extend nano-claude-code.
+For user-facing docs, see [README.md](../README.md).
+
+---
+
+## Overview
+
+Nano-claude-code is a ~3.4K-line Python CLI that lets LLMs (GPT, Gemini, etc.) operate as
+coding agents with tool use, memory, sub-agents, and skills. The architecture is a flat
+module layout designed for readability and future migration to a package structure.
+
+```
+User Input
+    │
+    ▼
+nano_claude.py  ── REPL, slash commands, rendering
+    │
+    ├──► agent.py  ── multi-turn loop, permission gates
+    │       │
+    │       ├──► providers.py  ── API streaming (Anthropic / OpenAI-compat)
+    │       ├──► tool_registry.py ──► tools.py  ── 13 tools
+    │       ├──► compaction.py  ── context window management
+    │       └──► subagent.py  ── threaded sub-agent lifecycle
+    │
+    ├──► context.py  ── system prompt (git, CLAUDE.md, memory)
+    │       └──► memory.py  ── persistent file-based memory
+    │
+    ├──► skills.py  ── markdown skill loading + execution
+    └──► config.py  ── configuration persistence
+```
+
+**Key invariant:** Dependencies flow downward. No circular imports at the module level
+(subagent.py uses lazy imports to call agent.py).
+
+---
+
+## Module Reference
+
+### `tool_registry.py` — Tool Plugin System
+
+The central registry that all tools register into. This is the foundation for extensibility.
+
+**Data model:**
+
+```python
+@dataclass
+class ToolDef:
+    name: str               # unique identifier (e.g. "Read", "MemorySave")
+    schema: dict            # JSON schema sent to the LLM API
+    func: Callable          # (params: dict, config: dict) -> str
+    read_only: bool         # True = auto-approve in 'auto' permission mode
+    concurrent_safe: bool   # True = safe to run in parallel (for sub-agents)
+```
+
+**Public API:**
+
+| Function | Description |
+|---|---|
+| `register_tool(tool_def)` | Add a tool to the registry (overwrites by name) |
+| `get_tool(name)` | Look up by name, returns `None` if not found |
+| `get_all_tools()` | List all registered tools |
+| `get_tool_schemas()` | Return schemas for API calls |
+| `execute_tool(name, params, config, max_output=32000)` | Execute with output truncation |
+| `clear_registry()` | Reset — for testing only |
+
+**Output truncation:** If a tool returns more than `max_output` chars, the result is
+truncated to `first_half + [... N chars truncated ...] + last_quarter`. This prevents
+a single tool call (e.g. reading a huge file) from blowing up the context window.
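A sketch of the truncation shape described above, assuming the head is `max_output // 2` characters and the tail is `max_output // 4` characters (the same split `compaction.py` uses when snipping). `truncate_output` is an illustrative name, not the registry's internal helper.

```python
def truncate_output(text: str, max_output: int = 32000) -> str:
    """Keep the start and end of an oversized tool result, dropping the middle."""
    if len(text) <= max_output:
        return text
    head = text[: max_output // 2]
    tail_len = max_output // 4
    tail = text[-tail_len:] if tail_len else ""  # guard: text[-0:] would be the whole string
    dropped = len(text) - len(head) - len(tail)
    return f"{head}\n[... {dropped} chars truncated ...]\n{tail}"


sample = "a" * 100 + "b" * 100
print(truncate_output(sample, max_output=40))
```

Keeping both ends matters for tool output because errors and summaries often appear on the last lines, while headers and paths appear on the first.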
+
+**Registering a custom tool:**
+
+```python
+from tool_registry import ToolDef, register_tool
+
+def my_tool(params, config):
+    return f"Hello, {params['name']}!"
+
+register_tool(ToolDef(
+    name="MyTool",
+    schema={
+        "name": "MyTool",
+        "description": "A greeting tool",
+        "input_schema": {
+            "type": "object",
+            "properties": {"name": {"type": "string"}},
+            "required": ["name"],
+        },
+    },
+    func=my_tool,
+    read_only=True,
+    concurrent_safe=True,
+))
+```
+
+### `tools.py` — Built-in Tool Implementations
+
+Contains the 8 core tools (Read, Write, Edit, Bash, Glob, Grep, WebFetch, WebSearch)
+plus memory tools (MemorySave, MemoryDelete) and sub-agent tools (Agent, CheckAgentResult,
+ListAgentTasks). All register themselves via `tool_registry` at import time.
+
+**Key internals:**
+
+- `_is_safe_bash(cmd)` — whitelist of safe shell commands for auto-approval
+- `generate_unified_diff(old, new, filename)` — diff generation for Edit/Write
+- `maybe_truncate_diff(diff_text, max_lines=80)` — truncate large diffs for display
+- `_get_agent_manager()` — lazy singleton for SubAgentManager
+- Backward-compatible `execute_tool(name, inputs, permission_mode, ask_permission)` wrapper
+
+### `agent.py` — Core Agent Loop
+
+The heart of the system. `run()` is a generator that yields events as they happen.
+
+```python
+def run(user_message, state, config, system_prompt,
+        depth=0, cancel_check=None) -> Generator:
+```
+
+**Loop logic:**
+
+```
+1. Append user message
+2. Inject depth and system prompt into config as `_depth` / `_system_prompt` (read by tools such as Agent)
+3. While True:
+   a. Check cancel_check() — cooperative cancellation for sub-agents
+   b. maybe_compact(state, config) — compress if near context limit
+   c. Stream from provider → yield TextChunk / ThinkingChunk
+   d. Record assistant message
+   e. If no tool_calls → break
+   f. For each tool_call:
+      - Permission check (_check_permission)
+      - If not auto-approved → yield PermissionRequest → user decides
+      - Execute tool → yield ToolStart / ToolEnd
+      - Append tool result
+   g. Loop (model sees tool results and responds)
+```
+
+**Event types:**
+
+| Event | Fields | When |
+|---|---|---|
+| `TextChunk` | `text` | Streaming text delta |
+| `ThinkingChunk` | `text` | Extended thinking block |
+| `ToolStart` | `name, inputs` | Before tool execution |
+| `ToolEnd` | `name, result, permitted` | After tool execution |
+| `PermissionRequest` | `description, granted` | Needs user approval |
+| `TurnDone` | `input_tokens, output_tokens` | End of one API turn |
+
+### `compaction.py` — Context Window Management
+
+Keeps conversations within model context limits using two layers.
+
+**Layer 1: Snip** (`snip_old_tool_results`)
+- Rule-based, no API cost
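+- Targets tool-role messages outside the most recent `preserve_last_n_turns` messages (default 6) whose content exceeds `max_chars` (default 2000)
+- Keeps the first `max_chars // 2` and last `max_chars // 4` characters, with a `[... N chars snipped ...]` marker in between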
+
+**Layer 2: Auto-Compact** (`compact_messages`)
+- Model-driven: calls the current model to summarize old messages
+- Splits messages into [old | recent] so that roughly the most recent 30% of estimated tokens stay verbatim (`keep_ratio=0.3`)
+- Replaces old messages with a summary + acknowledgment
+
+**Trigger:** `maybe_compact()` checks `estimate_tokens(messages) > context_limit * 0.7`.
+Runs snip first (cheap), then auto-compact if still over.
+
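+**Token estimation:** total content characters divided by 3.5. This is a rough heuristic rather than a tokenizer, so counts are approximate for every model.
+`get_context_limit(model)` reads `context_limit` from the provider registry and falls back to 128000.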
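As a worked example of the trigger arithmetic, the sketch below plugs in the 3.5 divisor, the 0.7 ratio, and the 128000 fallback limit from `compaction.py`. The function and constant names are made up for illustration, and it works from a raw character count rather than a message list.

```python
CHARS_PER_TOKEN = 3.5      # divisor used by the heuristic
FALLBACK_LIMIT = 128000    # fallback used when a provider has no context_limit
THRESHOLD_RATIO = 0.7      # fraction of the window that triggers compaction


def estimated_tokens(total_chars: int) -> int:
    return int(total_chars / CHARS_PER_TOKEN)


def over_threshold(total_chars: int, context_limit: int = FALLBACK_LIMIT) -> bool:
    return estimated_tokens(total_chars) > context_limit * THRESHOLD_RATIO


print(estimated_tokens(350_000))  # 100000
print(over_threshold(350_000))    # True: 100000 is above roughly 89600
print(over_threshold(280_000))    # False: 80000 is below roughly 89600
```

With the fallback limit, compaction therefore starts at a little over 300K characters of message content.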
+
+### `memory.py` — Persistent Memory
+
+File-based memory system stored in `~/.nano_claude/memory/`.
+
+**Storage format:**
+
+```
+~/.nano_claude/memory/
+├── MEMORY.md              # Index: one line per memory
+├── user_preferences.md    # Individual memory file
+└── project_auth.md
+```
+
+Each memory file uses markdown with YAML frontmatter:
+
+```markdown
+---
+name: user preferences
+description: coding style preferences
+type: feedback
+created: 2026-04-02
+---
+
+User prefers 4-space indentation and type hints.
+```
+
+**How it integrates:**
+- `get_memory_context()` returns the MEMORY.md index text
+- `context.py` injects this into the system prompt
+- The model reads the index, then uses `Read` tool to access full memory content
+- The model uses `MemorySave` / `MemoryDelete` tools to manage memories
+
+### `subagent.py` — Threaded Sub-Agents
+
+Sub-agents run in background threads via `ThreadPoolExecutor`.
+
+**Key design decisions:**
+
+1. **Fresh context** — each sub-agent starts with empty message history + task prompt
+2. **Depth limiting** — `max_depth=3`, checked at spawn time. Model gets an error message
+   (not silent tool removal) so it can adapt.
+3. **Cooperative cancellation** — `cancel_check` callable checked each loop iteration.
+   Python threads can't be killed safely, so we set a flag.
+4. **Threading, not asyncio** — the entire codebase is synchronous generators. Threading
+   via `concurrent.futures` keeps things simple. The SubAgentManager API is designed to
+   be compatible with a future async migration.
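The cooperative-cancellation pattern from point 3 can be sketched with a `threading.Event` whose `is_set` method plays the role of `cancel_check`. `agent_loop_stub` and `run_and_cancel` are hypothetical stand-ins; the real `SubAgentManager` and `agent.run()` carry far more state.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def agent_loop_stub(cancel_check, max_steps: int) -> str:
    """Stand-in for the agent loop: polls cancel_check once per iteration."""
    for step in range(max_steps):
        if cancel_check():
            return f"cancelled at step {step}"
        time.sleep(0.001)  # simulate one unit of work
    return "completed"


def run_and_cancel() -> str:
    cancel_flag = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(agent_loop_stub, cancel_flag.is_set, 1000)
        cancel_flag.set()  # request cancellation; the thread itself is never killed
        return future.result(timeout=5)


print(run_and_cancel())
```

The worker only stops at the next loop boundary, which is why a long-running tool call inside one iteration cannot be interrupted this way.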
+
+**Lifecycle:**
+
+```
+spawn(prompt, config, system_prompt, depth)
+  → Creates SubAgentTask
+  → Submits _run to ThreadPoolExecutor
+  → _run calls agent.run() with depth+1
+
+wait(task_id, timeout)  → blocks until complete
+cancel(task_id)         → sets _cancel_flag
+get_result(task_id)     → returns result string
+```
+
+### `skills.py` — Reusable Prompt Templates
+
+Skills are markdown files with frontmatter. They are **not code** — just structured prompts
+that get injected into the agent loop.
+
+**Skill file format:**
+
+```markdown
+---
+name: commit
+description: Create a conventional commit
+triggers: ["/commit"]
+tools: [Bash, Read]
+---
+
+Your prompt instructions here...
+```
+
+**Execution:** `execute_skill()` wraps the skill prompt as a user message and calls
+`agent.run()`. The skill runs through the exact same agent loop as a normal query.
+
+**Search order:** Project-level (`./.nano_claude/skills/`) overrides user-level
+(`~/.nano_claude/skills/`) when skill names collide.
+
+### `providers.py` — Multi-Provider Abstraction
+
+Two streaming adapters cover all providers:
+
+| Adapter | Providers |
+|---|---|
+| `stream_anthropic()` | Anthropic (native SDK) |
+| `stream_openai_compat()` | OpenAI, Gemini, Kimi, Qwen, Zhipu, DeepSeek, Ollama, LM Studio, Custom |
+
+**Neutral message format** (provider-independent):
+
+```python
+{"role": "user", "content": "..."}
+{"role": "assistant", "content": "...", "tool_calls": [{"id": "...", "name": "...", "input": {...}}]}
+{"role": "tool", "tool_call_id": "...", "name": "...", "content": "..."}
+```
+
+Conversion functions: `messages_to_anthropic()`, `messages_to_openai()`, `tools_to_openai()`.
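For orientation, here is roughly what converting the neutral format to the OpenAI Chat Completions message shape involves. This is a sketch written against the public OpenAI format, not a copy of the repo's `messages_to_openai()`; `to_openai_messages_sketch` is a hypothetical name and provider-specific extras are ignored.

```python
import json


def to_openai_messages_sketch(messages: list) -> list:
    """Convert neutral messages to OpenAI-style chat messages."""
    converted = []
    for m in messages:
        if m["role"] == "assistant" and m.get("tool_calls"):
            converted.append({
                "role": "assistant",
                "content": m.get("content") or None,
                "tool_calls": [
                    {
                        "id": tc["id"],
                        "type": "function",
                        # OpenAI expects arguments as a JSON-encoded string
                        "function": {"name": tc["name"], "arguments": json.dumps(tc["input"])},
                    }
                    for tc in m["tool_calls"]
                ],
            })
        elif m["role"] == "tool":
            # The neutral "name" field is dropped; the call id links result to request
            converted.append({
                "role": "tool",
                "tool_call_id": m["tool_call_id"],
                "content": m["content"],
            })
        else:
            converted.append({"role": m["role"], "content": m["content"]})
    return converted
```

Keeping `input` as a dict in the neutral format and serializing only at the adapter boundary means tools never have to parse JSON strings themselves.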
+
+**Provider-specific handling:**
+- Gemini 3 models require `thought_signature` in tool call responses — this is transparently
+  captured and passed through via `extra_content` on tool_call dicts.
+
+### `context.py` — System Prompt Builder
+
+Assembles the system prompt from:
+1. Base template (role, date, cwd, platform)
+2. Git info (branch, status, recent commits)
+3. CLAUDE.md content (project-level + global)
+4. Memory index (from `memory.get_memory_context()`)
+
+### `config.py` — Configuration
+
+Settings persist in `~/.nano_claude/config.json`; built-in defaults come from `DEFAULTS` in `config.py`. Key settings:
+
+| Key | Default | Description |
+|---|---|---|
+| `model` | `claude-opus-4-6` | Active model |
+| `max_tokens` | `8192` | Max output tokens |
+| `permission_mode` | `auto` | Permission mode |
+| `max_tool_output` | `32000` | Tool output truncation limit |
+| `max_agent_depth` | `3` | Max sub-agent nesting |
+| `max_concurrent_agents` | `3` | Thread pool size |
+
+---
+
+## Data Flow Example
+
+A user asks "Read config.py and change max_tokens to 16384":
+
+```
+1. nano_claude.py captures input
+2. agent.run() appends user message, calls maybe_compact()
+3. providers.stream() sends to Gemini API with 13 tool schemas
+4. Model responds: text + tool_call[Read(config.py)]
+5. agent.py checks permission (Read = read_only → auto-approve)
+6. tool_registry.execute_tool("Read", ...) → file content (truncated if >32K)
+7. Tool result appended to messages, loop back to step 3
+8. Model responds: text + tool_call[Edit(config.py, "8192", "16384")]
+9. agent.py checks permission (Edit = not read_only → ask user)
+10. User approves → tools.py._edit() runs, generates diff
+11. nano_claude.py renders diff with ANSI colors (red/green)
+12. Tool result appended, loop back to step 3
+13. Model responds: "Done, max_tokens changed to 16384"
+14. No tool_calls → loop ends, TurnDone yielded
+```
+
+---
+
+## Testing
+
+```bash
+# Run all tests
+python -m pytest tests/ -v
+
+# Run specific module tests
+python -m pytest tests/test_tool_registry.py -v
+python -m pytest tests/test_compaction.py -v
+python -m pytest tests/test_memory.py -v
+python -m pytest tests/test_subagent.py -v
+python -m pytest tests/test_skills.py -v
+python -m pytest tests/test_diff_view.py -v
+```
+
+Tests use `monkeypatch` and `tmp_path` fixtures to avoid side effects.
+Sub-agent tests mock `_agent_run` to avoid real API calls.
+
+---
+
+## Future: Package Refactoring
+
+When `tools.py` or `agent.py` grow too large, the flat layout can be migrated to:
+
+```
+ncc/
+├── __init__.py
+├── repl.py              # from nano_claude.py
+├── agent/
+│   ├── loop.py          # from agent.py
+│   ├── subagent.py      # from subagent.py
+│   └── compaction.py    # from compaction.py
+├── providers/
+│   ├── base.py
+│   ├── openai_compat.py
+│   └── registry.py
+├── tools/
+│   ├── registry.py      # from tool_registry.py
+│   ├── builtin.py       # core 8 tools from tools.py
+│   ├── memory.py        # MemorySave/MemoryDelete from tools.py
+│   └── subagent.py      # Agent/Check/List from tools.py
+├── memory/
+│   └── store.py         # from memory.py
+├── skills/
+│   └── loader.py        # from skills.py
+└── config.py
+```
+
+The current code is structured to make this migration straightforward:
+- Modules communicate via function parameters, not globals
+- Each module has a small public API surface
+- Dependencies are unidirectional
--- a/nano-claude-code/docs/logo-v1.png
+++ b/nano-claude-code/docs/logo-v1.png
--- a/nano-claude-code/docs/superpowers/plans/2026-04-02-open-cc-enhancement.md
+++ b/nano-claude-code/docs/superpowers/plans/2026-04-02-open-cc-enhancement.md
--- a/nano-claude-code/docs/superpowers/specs/2026-04-02-open-cc-design.md
+++ b/nano-claude-code/docs/superpowers/specs/2026-04-02-open-cc-design.md
@@ -0,0 +1,643 @@
+# Open-CC: Nano Claude Code Enhancement Design
+
+**Date:** 2026-04-02
+**Status:** Approved
+**Target:** GPT-5.4, Gemini 3/3.1 Pro (Claude not in scope)
+**Code budget:** ~10K lines total (currently ~2.2K)
+**Constraint:** PR-friendly, mergeable back to nano-claude-code upstream
+
+---
+
+## 1. Overview
+
+Evolve nano-claude-code from a minimal ~2.2K-line reference implementation into a capable AI coding CLI, approaching Claude Code's core functionality while staying lean. Five enhancement areas:
+
+1. **Context Window Management** (`compaction.py`)
+2. **Tool System Enhancement** (`tool_registry.py` + `tools.py` refactor)
+3. **Sub-Agent** (`subagent.py`)
+4. **Memory System** (`memory.py`)
+5. **Skills System** (`skills.py`)
+
+### Strategy
+
+**Approach A: Layered Enhancement** -- add new modules alongside existing files, minimize changes to existing code. When agent.py grows too complex, refactor into Approach B (package structure under `ncc/`).
+
+### Design Principles
+
+- Modules communicate via function parameters / dataclasses, no globals
+- Each new module exposes 2-3 public functions, internals self-contained
+- New logic in agent.py grouped by clear `# --- section ---` comments
+- All code in English (comments, docstrings, commit messages)
+
+---
+
+## 2. File Structure
+
+```
+nano-claude-code/
+├── nano_claude.py      # REPL -- add /memory, /skill slash commands
+├── agent.py            # Agent loop -- add compaction call + sub-agent dispatch
+├── providers.py        # No changes (already solid)
+├── tools.py            # Refactor: register built-in tools via registry
+├── context.py          # Extend: inject memory context
+├── config.py           # Add new config keys
+│
+├── compaction.py       # NEW: Context window management
+├── subagent.py         # NEW: Sub-agent lifecycle
+├── memory.py           # NEW: File-based memory system
+├── skills.py           # NEW: Skill loading and execution
+└── tool_registry.py    # NEW: Tool plugin registry
+```
+
+### Module Dependency Graph (unidirectional)
+
+```
+nano_claude.py
+    ├-> agent.py
+    │    ├-> providers.py
+    │    ├-> tool_registry.py -> tools.py (built-in implementations)
+    │    ├-> compaction.py -> providers.py (for summary model call)
+    │    └-> subagent.py (calls agent.py:run recursively)
+    ├-> context.py -> memory.py
+    ├-> skills.py -> tool_registry.py
+    └-> config.py
+```
+
+---
+
+## 3. Context Window Management (`compaction.py`)
+
+Two-layer compression, inspired by Claude Code's three-layer strategy (Layer 3 contextCollapse is experimental, deferred).
+
+### 3.1 Layer 1: Auto-Compact (model-driven summary)
+
+Triggered when the estimated token count exceeds 70% of the model's context limit.
+
+```python
+def compact_messages(messages: list[dict], config: dict) -> list[dict]:
+    """
+    Split messages into [old | recent].
+    Summarize old via model call.
+    Return [summary_msg, ack_msg, *recent].
+    """
+    split_point = find_split_point(messages, keep_ratio=0.3)
+    old = messages[:split_point]
+    recent = messages[split_point:]
+    summary = call_model_for_summary(old, config)
+    return [
+        {"role": "user", "content": f"[Conversation summary]\n{summary}"},
+        {"role": "assistant", "content": "Understood, I have the context."},
+        *recent
+    ]
+```
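`find_split_point` is not defined in this spec. One possible sketch, under loud assumptions: `keep_ratio` is read as the fraction of messages (by count, not tokens) kept verbatim, and tool results are assumed to carry the role `"tool"` so the split never separates a result from the assistant message that requested it.

```python
def find_split_point(messages: list[dict], keep_ratio: float = 0.3) -> int:
    """Return i so that messages[:i] is summarized and messages[i:] is kept."""
    if not messages:
        return 0
    keep = max(1, round(len(messages) * keep_ratio))
    split = len(messages) - keep
    # Move the split earlier while it would start the kept part on a tool
    # result (assumed role name: "tool"), orphaning it from its request.
    while split > 0 and messages[split]["role"] == "tool":
        split -= 1
    return split

roles = ["user", "assistant", "tool", "assistant", "user",
         "assistant", "tool", "tool", "assistant", "user"]
msgs = [{"role": r, "content": ""} for r in roles]
split = find_split_point(msgs)
```

With ten messages and `keep_ratio=0.3` the naive split is index 7, but that is a tool result, so the sketch backs up to the assistant message at index 5.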
+
+### 3.2 Layer 2: Tool-Result Snipping (rule-based)
+
+Truncate old tool outputs without a model call. Fast and cheap.
+
+```python
+def snip_old_tool_results(messages: list[dict], max_chars: int = 2000) -> list[dict]:
+    """
+    For tool results older than N turns, truncate to max_chars.
+    Preserve first/last lines, add [snipped N chars] marker.
+    """
+```
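A minimal sketch of the snipping rule follows. It makes several assumptions not stated above: tool results are messages with role `"tool"` and string content, "older than N turns" is approximated by a hypothetical `keep_recent` message count, and the head and tail are kept by character count rather than by line.

```python
def snip_old_tool_results(messages, max_chars=2000, keep_recent=6):
    """Return a copy of messages with old, oversized tool results shortened."""
    cutoff = len(messages) - keep_recent    # messages before this index are "old"
    out = []
    for i, msg in enumerate(messages):
        content = msg.get("content")
        if (i < cutoff and msg.get("role") == "tool"
                and isinstance(content, str) and len(content) > max_chars):
            half = max_chars // 2
            snipped = len(content) - 2 * half
            content = (content[:half]
                       + f"\n[snipped {snipped} chars]\n"
                       + content[-half:])
            msg = {**msg, "content": content}   # copy; the input is not mutated
        out.append(msg)
    return out

history = [{"role": "tool", "content": "x" * 5000}]
history += [{"role": "user", "content": "hi"} for _ in range(6)]
snipped_history = snip_old_tool_results(history)
```

Because the function returns new message dicts instead of editing in place, the caller decides whether to replace `state.messages` with the result.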
+
+### 3.3 Token Estimation
+
+```python
+def estimate_tokens(messages: list[dict]) -> int:
+    """Use tiktoken for GPT models, chars/3.5 fallback."""
+
+def get_context_limit(model: str) -> int:
+    """Return context window size from provider registry."""
+```
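The chars/3.5 fallback can be illustrated as follows. This sketch covers only the fallback path; the tiktoken path is omitted, and the helper names `estimate_tokens_fallback` and `over_threshold` are hypothetical.

```python
import math

CHARS_PER_TOKEN = 3.5   # fallback ratio named in the docstring above

def estimate_tokens_fallback(messages: list[dict]) -> int:
    """Estimate tokens as total characters / 3.5, rounded up."""
    total_chars = sum(len(m["content"]) for m in messages
                      if isinstance(m.get("content"), str))
    return math.ceil(total_chars / CHARS_PER_TOKEN)

def over_threshold(messages: list[dict], context_limit: int,
                   ratio: float = 0.7) -> bool:
    """True when the estimate exceeds ratio * context_limit."""
    return estimate_tokens_fallback(messages) > context_limit * ratio

msgs = [{"role": "user", "content": "a" * 700}]
estimate = estimate_tokens_fallback(msgs)   # 700 / 3.5 = 200
```

A character-based estimate is coarse, so the 70% trigger from section 3.1 also serves as a safety margin against undercounting.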
+
+### 3.4 Integration Point
+
+```python
+# Defined in compaction.py; agent.py's run() loop calls it before each API call:
+def maybe_compact(state: AgentState, config: dict) -> bool:
+    token_count = estimate_tokens(state.messages)
+    threshold = get_context_limit(config["model"]) * 0.7
+    if token_count > threshold:
+        state.messages = compact_messages(state.messages, config)
+        return True
+    return False
+```
+
+### 3.5 Public API
+
+```python
+maybe_compact(state: AgentState, config: dict) -> bool
+estimate_tokens(messages: list[dict]) -> int
+get_context_limit(model: str) -> int
+```
+
+---
+
+## 4. Tool System Enhancement (`tool_registry.py` + `tools.py`)
+
+### 4.1 Tool Registry
+
+```python
+@dataclass
+class ToolDef:
+    name: str
+    schema: dict            # JSON schema for parameters
+    func: Callable          # (params: dict, config: dict) -> str
+    read_only: bool         # True = auto-approve in 'auto' mode
+    concurrent_safe: bool   # True = safe for parallel sub-agent use
+
+_TOOLS: dict[str, ToolDef] = {}
+
+def register_tool(tool_def: ToolDef) -> None
+def get_tool(name: str) -> ToolDef | None
+def get_all_tools() -> list[ToolDef]
+def get_tool_schemas() -> list[dict]
+def execute_tool(name: str, params: dict, config: dict) -> str
+```
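The signatures above could be filled in roughly as follows. This is a sketch under assumptions, not the shipped `tool_registry.py`: the unknown-tool error string and the `Echo` tool are hypothetical and exist only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolDef:
    name: str
    schema: dict
    func: Callable[[dict, dict], str]
    read_only: bool
    concurrent_safe: bool

_TOOLS: dict[str, ToolDef] = {}

def register_tool(tool_def: ToolDef) -> None:
    _TOOLS[tool_def.name] = tool_def    # a later registration replaces an earlier one

def get_tool(name: str) -> ToolDef | None:
    return _TOOLS.get(name)

def get_tool_schemas() -> list[dict]:
    return [t.schema for t in _TOOLS.values()]

def execute_tool(name: str, params: dict, config: dict) -> str:
    tool = get_tool(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"   # assumed error convention
    return tool.func(params, config)

# Hypothetical example tool, not one of the built-ins.
register_tool(ToolDef(
    name="Echo",
    schema={"name": "Echo", "input_schema": {"type": "object"}},
    func=lambda params, config: params.get("text", ""),
    read_only=True, concurrent_safe=True,
))
```

Returning an error string for an unknown tool, rather than raising, keeps the failure visible to the model as an ordinary tool result.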
+
+### 4.2 Tool Output Truncation
+
+Prevent oversized tool outputs (e.g., `cat` on a large file, `ls -R`) from blowing up the context
+before compaction even gets a chance to run. Applied at the `execute_tool` boundary:
+
+```python
+MAX_TOOL_OUTPUT = 32_000  # ~8K tokens, configurable per tool
+
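+def execute_tool(name, params, config):
+    tool = get_tool(name)
+    if tool is None:
+        # get_tool() returns None for unregistered names
+        return f"Error: unknown tool '{name}'"
+    result = tool.func(params, config)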
+
+    # Immediate truncation at source
+    if len(result) > MAX_TOOL_OUTPUT:
+        head = result[:MAX_TOOL_OUTPUT // 2]
+        tail = result[-MAX_TOOL_OUTPUT // 4:]
+        snipped = len(result) - len(head) - len(tail)
+        result = f"{head}\n\n[... {snipped} chars truncated ...]\n\n{tail}"
+
+    return result
+```
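A worked example of the head/tail arithmetic, with the truncation isolated into a hypothetical helper `truncate` that uses the same slicing as the snippet above:

```python
MAX_TOOL_OUTPUT = 32_000

def truncate(result: str, limit: int = MAX_TOOL_OUTPUT) -> str:
    """Keep the first limit/2 and last limit/4 characters of long output."""
    if len(result) <= limit:
        return result
    head = result[:limit // 2]
    tail = result[-limit // 4:]
    snipped = len(result) - len(head) - len(tail)
    return f"{head}\n\n[... {snipped} chars truncated ...]\n\n{tail}"

out = truncate("x" * 100_000)
# head = 16,000 chars, tail = 8,000 chars, so 76,000 chars are dropped
```

The kept text totals about 24K characters plus the marker, which is below the 32K cap because the head and tail together cover only three quarters of the limit.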
+
+Additionally, the `Bash` tool caps `subprocess` stdout reads to prevent unbounded
+output (e.g., `cat /dev/urandom`).
+
+This creates a two-layer defense:
+- **Layer 0 (here):** hard truncation at tool execution time — prevents oversized messages
+- **Layer 2 (compaction.py snip):** soft truncation of old tool results — reclaims context space
+
+### 4.3 Built-in Tools Refactor
+
+Existing `tools.py` implementations stay unchanged. Wrap each with `register_tool()` at module load:
+
+```python
+register_tool(ToolDef(
+    name="Read", schema=READ_SCHEMA, func=_read_file,
+    read_only=True, concurrent_safe=True
+))
+```
+
+### 4.4 Permission Logic (unified)
+
+```python
+# agent.py
+def _check_permission(tool_name, params, config):
+    tool = get_tool(tool_name)
+    if config["permission_mode"] == "accept-all":
+        return True
+    if tool.read_only:
+        return True
+    if tool_name == "Bash" and _is_safe_command(params["command"]):
+        return True
+    return None  # ask user
+```
+
+---
+
+## 5. Sub-Agent (`subagent.py`)
+
+### 5.1 Data Model
+
+```python
+@dataclass
+class SubAgentTask:
+    id: str
+    prompt: str
+    status: str              # "pending" | "running" | "completed" | "failed" | "cancelled"
+    messages: list[dict]     # independent message history
+    result: str | None
+    model: str | None        # optional model override
+    depth: int = 0           # recursion depth counter
+    _cancel_flag: bool = False
+    _future: Future | None = None
+
+@dataclass
+class SubAgentManager:
+    tasks: dict[str, SubAgentTask] = field(default_factory=dict)
+    max_concurrent: int = 3
+    max_depth: int = 3
+    _pool: ThreadPoolExecutor = field(default_factory=
+        lambda: ThreadPoolExecutor(max_workers=3))
+
+    def spawn(self, prompt, config, system_prompt, depth=0) -> SubAgentTask
+    def get_result(self, task_id) -> str | None
+    def list_tasks(self) -> list[SubAgentTask]
+    def cancel(self, task_id) -> bool
+    def wait(self, task_id, timeout=None) -> SubAgentTask
+```
+
+### 5.2 Execution Model — Threading from Day 1
+
+Sub-agents run in background threads via `ThreadPoolExecutor`. This enables:
+- Non-blocking spawn (main agent continues or waits by choice)
+- Cancellation via cooperative flag
+- Concurrent sub-agents (up to `max_concurrent`)
+
+```python
+def spawn(self, prompt, config, system_prompt, depth=0):
+    if depth >= self.max_depth:
+        return SubAgentTask(status="failed",
+            result="Error: max sub-agent depth reached.")
+
+    task = SubAgentTask(id=uuid4().hex[:8], prompt=prompt,
+                        status="running", depth=depth, ...)
+
+    def _run():
+        sub_state = AgentState()
+        try:
+            for event in agent.run(
+                prompt, sub_state, config, system_prompt,
+                depth=depth + 1,
+                cancel_check=lambda: task._cancel_flag
+            ):
+                if isinstance(event, TurnDone):
+                    task.result = extract_final_text(sub_state.messages)
+            task.status = "completed"
+        except Exception as e:
+            task.result = f"Error: {e}"
+            task.status = "failed"
+
+    task._future = self._pool.submit(_run)
+    self.tasks[task.id] = task
+    return task
+```
+
+### 5.3 Cooperative Cancellation
+
+Python threads cannot be killed safely. Instead, `agent.run()` checks a
+`cancel_check` callable each loop iteration:
+
+```python
+# agent.py run() — new parameter
+def run(user_message, state, config, system_prompt,
+        depth=0, cancel_check=None):
+    ...
+    while True:
+        if cancel_check and cancel_check():
+            return  # clean exit
+        for event in stream(...):
+            yield event
+        ...
+```
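The cooperative pattern can be exercised in isolation as below. This sketch swaps in a `threading.Event` for the boolean `_cancel_flag` in the data model, and `worker_loop` is a hypothetical stand-in for `agent.run()`.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def worker_loop(cancel_check, started, max_steps=1_000_000):
    """Stand-in for agent.run(): poll cancel_check once per iteration."""
    steps = 0
    started.set()
    while steps < max_steps:
        if cancel_check():
            return ("cancelled", steps)
        steps += 1
        time.sleep(0.001)       # placeholder for one model/tool round trip
    return ("completed", steps)

cancel_flag = threading.Event()
started = threading.Event()

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(worker_loop, cancel_flag.is_set, started)
    started.wait(timeout=5)     # make sure the worker is running
    cancel_flag.set()           # request cancellation
    status, steps = future.result(timeout=5)
```

Cancellation takes effect only at the next check, so a long-running tool call or model request inside one iteration still runs to completion first.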
+
+### 5.4 Depth Limiting (No Tool Removal)
+
+Sub-agents CAN call the Agent tool (enabling A -> B -> C chains). Depth is
+passed through, and the Agent tool returns an error at `max_depth`:
+
+```python
+def _agent_tool_func(params, config, depth=0):
+    if depth >= manager.max_depth:
+        return ("Error: max sub-agent depth reached. "
+                "Complete this task directly without spawning sub-agents.")
+    return manager.spawn(params["prompt"], config, system_prompt, depth)
+```
+
+The model sees the error and adapts — no silent capability removal.
+
+### 5.5 Context Strategy
+
+A sub-agent gets **fresh context** (no parent message history):
+
+```python
+sub_system_prompt = f"""You are a sub-agent. Your task:
+{prompt}
+
+Working directory: {cwd}
+{memory_context}
+"""
+```
+
+### 5.6 Tool Registration — 3 Tools
+
+The sub-agent system registers three tools:
+
+**Agent** — spawn a sub-agent:
+
+```python
+AGENT_SCHEMA = {
+    "name": "Agent",
+    "description": "Launch a sub-agent to handle a task independently.",
+    "input_schema": {
+        "type": "object",
+        "properties": {
+            "prompt": {"type": "string", "description": "Task description"},
+            "model": {"type": "string", "description": "Optional model override"},
+            "wait": {"type": "boolean", "default": True,
+                     "description": "True = block until done (default). "
+                                    "False = return task_id immediately."}
+        },
+        "required": ["prompt"]
+    }
+}
+```
+
+- `wait=True` (default): spawn + block + return result. Feels synchronous to the model.
+- `wait=False`: spawn + return task_id immediately. The model must use CheckAgentResult later.
+
+**CheckAgentResult** — poll a background sub-agent:
+
+```python
+CHECK_AGENT_RESULT_SCHEMA = {
+    "name": "CheckAgentResult",
+    "description": "Check the result of a background sub-agent task.",
+    "input_schema": {
+        "type": "object",
+        "properties": {
+            "task_id": {"type": "string", "description": "Task ID from Agent tool"}
+        },
+        "required": ["task_id"]
+    }
+}
+```
+
+Returns: status + result (if completed), or status + "still running".
+
+**ListAgentTasks** — overview of all sub-agents:
+
+```python
+LIST_AGENT_TASKS_SCHEMA = {
+    "name": "ListAgentTasks",
+    "description": "List all sub-agent tasks and their status.",
+    "input_schema": {"type": "object", "properties": {}}
+}
+```
+
+Returns a table of `[id, status, prompt_preview]` for all tasks.
+
+---
+
+## 6. Memory System (`memory.py`)
+
+### 6.1 Storage
+
+```
+~/.nano_claude/memory/
+├── MEMORY.md              # Index file (max 200 lines)
+├── user_role.md           # Individual memory files
+├── feedback_testing.md
+└── ...
+```
+
+Memory file format:
+
+```markdown
+---
+name: user role
+description: user is a data scientist focused on logging
+type: user
+created: 2026-04-02
+---
+
+User is a data scientist, currently investigating observability/logging.
+```
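Reading this format back could look like the sketch below. `parse_simple_frontmatter` is a hypothetical helper for illustration, not the project's `parse_frontmatter`; it assumes flat `key: value` pairs between two `---` lines and is not a YAML parser.

```python
def parse_simple_frontmatter(text: str) -> tuple[dict, str]:
    """Split '---' delimited 'key: value' frontmatter from the body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[i + 1:]).strip()
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text     # no closing delimiter: treat everything as body

sample = """---
name: user role
description: user is a data scientist focused on logging
type: user
created: 2026-04-02
---

User is a data scientist, currently investigating observability/logging.
"""
meta, body = parse_simple_frontmatter(sample)
```

Splitting each line on the first colon only means values may themselves contain colons.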
+
+### 6.2 Public API
+
+```python
+@dataclass
+class MemoryEntry:
+    name: str
+    description: str
+    type: str              # "user" | "feedback" | "project" | "reference"
+    content: str
+    file_path: str
+    created: str
+
+def load_index() -> list[MemoryEntry]
+def save_memory(entry: MemoryEntry) -> None
+def delete_memory(name: str) -> None
+def search_memory(query: str) -> list[MemoryEntry]
+def get_memory_context() -> str   # for system prompt injection
+```
+
+### 6.3 Tool Registration
+
+Two tools for model-driven memory management:
+
+- **MemorySave**: `{name, type, description, content}` -> write file + update index
+- **MemoryDelete**: `{name}` -> remove file + update index
+
+### 6.4 Context Integration
+
+`context.py:build_system_prompt()` appends `memory.get_memory_context()` (the MEMORY.md index). The model uses the Read tool to access the full content of a memory file when needed.
+
+---
+
+## 7. Skills System (`skills.py`)
+
+### 7.1 Skill Definition
+
+Markdown files with frontmatter:
+
+```
+~/.nano_claude/skills/commit.md
+```
+
+```markdown
+---
+name: commit
+description: Create a git commit with conventional format
+triggers: ["/commit", "commit changes"]
+tools: [Bash, Read]
+---
+
+# Commit Skill
+
+Analyze staged changes and create a well-formatted commit message.
+...
+```
+
+### 7.2 Search Path
+
+```python
+SKILL_PATHS = [
+    Path.cwd() / ".nano_claude" / "skills",    # project-level (priority)
+    Path.home() / ".nano_claude" / "skills",    # user-level
+]
+```
+
+### 7.3 Public API
+
+```python
+@dataclass
+class SkillDef:
+    name: str
+    description: str
+    triggers: list[str]
+    tools: list[str]
+    prompt: str
+    file_path: str
+
+def load_skills() -> list[SkillDef]
+def find_skill(query: str) -> SkillDef | None
+def execute_skill(skill, args, state, config) -> Generator
+```
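Trigger matching in `find_skill` is not specified here. One possible reading, sketched below, treats a trigger as matching when it equals the input or is followed by a space and arguments. Unlike the signature above, this hypothetical version takes the skill list explicitly so it can run on its own.

```python
from __future__ import annotations

from dataclasses import dataclass

@dataclass
class SkillDef:
    name: str
    description: str
    triggers: list[str]
    tools: list[str]
    prompt: str
    file_path: str

def find_skill(query: str, skills: list[SkillDef]) -> SkillDef | None:
    """Return the first skill with a trigger that prefixes the query."""
    q = query.strip().lower()
    for skill in skills:        # caller is assumed to list project-level skills first
        for trigger in skill.triggers:
            t = trigger.lower()
            if q == t or q.startswith(t + " "):
                return skill
    return None

commit = SkillDef(
    name="commit",
    description="Create a git commit with conventional format",
    triggers=["/commit", "commit changes"],
    tools=["Bash", "Read"],
    prompt="Analyze staged changes and create a well-formatted commit message.",
    file_path="~/.nano_claude/skills/commit.md",
)
match = find_skill("/commit fix typo", [commit])
```

Requiring a space after the trigger keeps `/committee` from matching `/commit`.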
+
+### 7.4 Execution Model
+
+Skills are just prompts injected into the normal agent loop:
+
+```python
+def execute_skill(skill, args, state, config):
+    prompt = f"[Skill: {skill.name}]\n\n{skill.prompt}"
+    if args:
+        prompt += f"\n\nUser context: {args}"
+    system_prompt = build_system_prompt(config)
+    for event in agent.run(prompt, state, config, system_prompt):
+        yield event
+```
+
+### 7.5 REPL Integration
+
+In `nano_claude.py`, unmatched `/` commands fall through to skill lookup:
+
+```python
+if user_input.startswith("/"):
+    # Try built-in slash commands first
+    # If no match -> find_skill(user_input)
+    # If skill found -> execute_skill(...)
+```
+
+---
+
+## 8. Diff View for File Modifications
+
+Core UX improvement: show git-style red/green diff when Edit or Write modifies an existing file.
+
+### 8.1 Diff Generation (in tools.py)
+
+Edit and Write tool implementations capture before/after content and generate unified diff:
+
+```python
+import difflib
+
+def generate_unified_diff(old, new, filename, context_lines=3):
+    """
+    Args:
+        old: original file content, str
+        new: modified file content, str
+        filename: display name, str
+        context_lines: lines of context around changes, int
+    Returns:
+        unified diff string
+    """
+    old_lines = old.splitlines(keepends=True)
+    new_lines = new.splitlines(keepends=True)
+    diff = difflib.unified_diff(
+        old_lines, new_lines,
+        fromfile=f"a/{filename}", tofile=f"b/{filename}",
+        n=context_lines
+    )
+    return "".join(diff)
+```
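A short usage example, applied to the `max_tokens` change from the walkthrough. The function is repeated in compact form so the block runs on its own; the sample file content is hypothetical.

```python
import difflib

def generate_unified_diff(old, new, filename, context_lines=3):
    diff = difflib.unified_diff(
        old.splitlines(keepends=True), new.splitlines(keepends=True),
        fromfile=f"a/{filename}", tofile=f"b/{filename}", n=context_lines,
    )
    return "".join(diff)

# Hypothetical file content for illustration.
old = "model = 'gemini'\nmax_tokens = 8192\nstream = True\n"
new = old.replace("8192", "16384")
diff = generate_unified_diff(old, new, "config.py")
print(diff)
```

The result has the `--- a/config.py` and `+++ b/config.py` headers, one `@@` hunk header, and a `-max_tokens = 8192` / `+max_tokens = 16384` pair surrounded by unchanged context lines.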
+
+Tool return values change:
+- **Edit**: `"Changes applied to {filename}:\n\n{diff}"`
+- **Write** (existing file): `"File updated:\n\n{diff}"`
+- **Write** (new file): `"New file created: {filename} ({n} lines)"` (no diff)
+
+### 8.2 REPL Rendering (in nano_claude.py)
+
+Detect diff blocks in tool output and render with ANSI colors:
+
+```python
+def render_diff(diff_text):
+    for line in diff_text.splitlines():
+        if line.startswith("+++") or line.startswith("---"):
+            print(f"\033[1m{line}\033[0m")        # bold
+        elif line.startswith("+"):
+            print(f"\033[32m{line}\033[0m")        # green
+        elif line.startswith("-"):
+            print(f"\033[31m{line}\033[0m")        # red
+        elif line.startswith("@@"):
+            print(f"\033[36m{line}\033[0m")        # cyan
+        else:
+            print(line)
+```
+
+### 8.3 Diff Truncation
+
+For large diffs (e.g., Write replaces entire file), cap the diff display:
+
+```python
+MAX_DIFF_LINES = 80
+
+def maybe_truncate_diff(diff_text):
+    lines = diff_text.splitlines()
+    if len(lines) > MAX_DIFF_LINES:
+        shown = lines[:MAX_DIFF_LINES]
+        remaining = len(lines) - MAX_DIFF_LINES
+        return "\n".join(shown) + f"\n\n[... {remaining} more lines ...]"
+    return diff_text
+```
+
+Note: truncation applies to the **display** in REPL only. The full diff is still
+returned to the model so it can verify the change.
+
+---
+
+## 9. Implementation Order
+
+Each step is an independent PR:
+
+| Phase | Module | Depends On | Estimated Lines |
+|-------|--------|-----------|-----------------|
+| 1 | `tool_registry.py` + `tools.py` refactor | None | ~600 |
+| 2 | Diff view in `tools.py` + `nano_claude.py` | Phase 1 | ~100 |
+| 3 | `compaction.py` + agent.py integration | Phase 1 | ~300 |
+| 4 | `memory.py` + context.py integration | Phase 1 | ~200 |
+| 5 | `subagent.py` + agent.py integration (threading) | Phase 1 | ~350 |
+| 6 | `skills.py` + nano_claude.py integration | Phase 1, 4 | ~200 |
+| 7 | Slash commands + config updates | All above | ~300 |
+
+**Total new code: ~2050 lines. Grand total: ~4.2K lines.**
+
+---
+
+## 10. Key Decisions
+
+| Decision | Choice | Rationale |
+|----------|--------|-----------|
+| Compression layers | 2 (autoCompact + snip) | Layer 3 is experimental in Claude Code |
+| Tool output truncation | Hard cap at execute_tool boundary | Prevents oversized outputs before compaction runs |
+| Sub-agent execution | Threading from day 1 | Sync blocks main agent, can't cancel, can't parallelize |
+| Sub-agent depth | Depth counter (max 3), no tool removal | Model sees error and adapts; sub-sub-agents allowed |
+| Sub-agent tools | Agent + CheckAgentResult + ListAgentTasks | Model needs feedback loop for async tasks |
+| Diff view | difflib unified diff + ANSI colors | Core UX, zero dependencies |
+| Memory search | Keyword match, no embeddings | Keep simple, model judges relevance |
+| Skills format | Markdown + frontmatter | Human-readable, git-friendly, no Python needed |
+| Tool registry | Global dict + register function | Simple, extensible, easy to migrate to package |
+| Target models | GPT-5.4, Gemini 3/3.1 Pro | User's primary use case |
+| No Claude support | Intentional | Official Claude Code exists |
+
+---
+
+## 11. Future Considerations (Not in Scope)
+
+- MCP protocol support
+- Remote skill marketplace
+- Voice mode
+- Bridge to desktop apps
+- contextCollapse (Layer 3 compression)
--- a/nano-claude-code/memory.py
+++ b/nano-claude-code/memory.py
@@ -0,0 +1,11 @@
+"""Backward-compatibility shim — real implementation is in memory/ package."""
+from memory.store import (  # noqa: F401
+    MemoryEntry,
+    save_memory,
+    delete_memory,
+    load_index,
+    search_memory,
+    get_index_content,
+    parse_frontmatter,
+)
+from memory.context import get_memory_context  # noqa: F401
--- a/nano-claude-code/memory/__init__.py
+++ b/nano-claude-code/memory/__init__.py
@@ -0,0 +1,86 @@
+"""Memory package for nano-claude-code.
+
+Provides persistent, file-based memory across conversations.
+
+Storage layout:
+  user scope    : ~/.nano_claude/memory/<slug>.md   (shared across projects)
+  project scope : .nano_claude/memory/<slug>.md     (local to cwd)
+
+The MEMORY.md index in each directory is auto-maintained and injected
+into the system prompt so Claude has an overview of available memories.
+
+Public API (backward-compatible with the old memory.py module):
+  MemoryEntry      — dataclass for a single memory
+  save_memory()    — write/update a memory file
+  delete_memory()  — remove a memory file
+  load_index()     — load all entries from one or both scopes
+  search_memory()  — keyword search across entries
+  get_memory_context() — MEMORY.md content for system prompt injection
+"""
+from .store import (  # noqa: F401
+    MemoryEntry,
+    save_memory,
+    delete_memory,
+    load_index,
+    load_entries,
+    search_memory,
+    get_index_content,
+    parse_frontmatter,
+    USER_MEMORY_DIR,
+    INDEX_FILENAME,
+    MAX_INDEX_LINES,
+    MAX_INDEX_BYTES,
+)
+from .scan import (  # noqa: F401
+    MemoryHeader,
+    scan_memory_dir,
+    scan_all_memories,
+    format_memory_manifest,
+    memory_age_days,
+    memory_age_str,
+    memory_freshness_text,
+)
+from .context import (  # noqa: F401
+    get_memory_context,
+    find_relevant_memories,
+    truncate_index_content,
+)
+from .types import (  # noqa: F401
+    MEMORY_TYPES,
+    MEMORY_TYPE_DESCRIPTIONS,
+    MEMORY_SYSTEM_PROMPT,
+    WHAT_NOT_TO_SAVE,
+)
+
+__all__ = [
+    # store
+    "MemoryEntry",
+    "save_memory",
+    "delete_memory",
+    "load_index",
+    "load_entries",
+    "search_memory",
+    "get_index_content",
+    "parse_frontmatter",
+    "USER_MEMORY_DIR",
+    "INDEX_FILENAME",
+    "MAX_INDEX_LINES",
+    "MAX_INDEX_BYTES",
+    # scan
+    "MemoryHeader",
+    "scan_memory_dir",
+    "scan_all_memories",
+    "format_memory_manifest",
+    "memory_age_days",
+    "memory_age_str",
+    "memory_freshness_text",
+    # context
+    "get_memory_context",
+    "find_relevant_memories",
+    "truncate_index_content",
+    # types
+    "MEMORY_TYPES",
+    "MEMORY_TYPE_DESCRIPTIONS",
+    "MEMORY_SYSTEM_PROMPT",
+    "WHAT_NOT_TO_SAVE",
+]
--- a/nano-claude-code/memory/context.py
+++ b/nano-claude-code/memory/context.py
@@ -0,0 +1,221 @@
+"""Memory context building for system prompt injection.
+
+Provides:
+  get_memory_context()      — full context string for system prompt
+  find_relevant_memories()  — keyword (+ optional AI) relevance filtering
+  truncate_index_content()  — line + byte truncation with warning
+"""
+from __future__ import annotations
+
+from pathlib import Path
+
+from .store import (
+    USER_MEMORY_DIR,
+    INDEX_FILENAME,
+    MAX_INDEX_LINES,
+    MAX_INDEX_BYTES,
+    get_memory_dir,
+    get_index_content,
+    load_entries,
+    search_memory,
+)
+from .scan import scan_all_memories, format_memory_manifest, memory_freshness_text
+from .types import MEMORY_SYSTEM_PROMPT
+
+
+# ── Index truncation ───────────────────────────────────────────────────────
+
+def truncate_index_content(raw: str) -> str:
+    """Truncate MEMORY.md content to line AND byte limits, appending a warning.
+
+    Matches Claude Code's truncateEntrypointContent:
+      - Line-truncates first (natural boundary)
+      - Then byte-truncates at the last newline before the cap
+      - Appends which limit fired
+    """
+    trimmed = raw.strip()
+    content_lines = trimmed.split("\n")
+    line_count = len(content_lines)
+    byte_count = len(trimmed.encode())
+
+    was_line_truncated = line_count > MAX_INDEX_LINES
+    was_byte_truncated = byte_count > MAX_INDEX_BYTES
+
+    if not was_line_truncated and not was_byte_truncated:
+        return trimmed
+
+    truncated = "\n".join(content_lines[:MAX_INDEX_LINES]) if was_line_truncated else trimmed
+
+    if len(truncated.encode()) > MAX_INDEX_BYTES:
+        # Cut at last newline before byte limit
+        raw_bytes = truncated.encode()
+        cut = raw_bytes[:MAX_INDEX_BYTES].rfind(b"\n")
+        truncated = raw_bytes[: cut if cut > 0 else MAX_INDEX_BYTES].decode(errors="replace")
+
+    if was_byte_truncated and not was_line_truncated:
+        reason = f"{byte_count:,} bytes (limit: {MAX_INDEX_BYTES:,}) — index entries are too long"
+    elif was_line_truncated and not was_byte_truncated:
+        reason = f"{line_count} lines (limit: {MAX_INDEX_LINES})"
+    else:
+        reason = f"{line_count} lines and {byte_count:,} bytes"
+
+    warning = (
+        f"\n\n> WARNING: {INDEX_FILENAME} is {reason}. "
+        "Only part of it was loaded. Keep index entries to one line under ~150 chars."
+    )
+    return truncated + warning
+
+
+# ── System prompt context ──────────────────────────────────────────────────
+
+def get_memory_context(include_guidance: bool = False) -> str:
+    """Return memory context for injection into the system prompt.
+
+    Combines user-level and project-level MEMORY.md content (if present).
+    Returns empty string when no memories exist.
+
+    Args:
+        include_guidance: if True, prepend the full memory system guidance
+                          (MEMORY_SYSTEM_PROMPT). Normally False since the
+                          system prompt template already includes brief guidance.
+    """
+    parts: list[str] = []
+
+    # User-level index
+    user_content = get_index_content("user")
+    if user_content:
+        truncated = truncate_index_content(user_content)
+        parts.append(truncated)
+
+    # Project-level index (labelled separately)
+    proj_content = get_index_content("project")
+    if proj_content:
+        truncated = truncate_index_content(proj_content)
+        parts.append(f"[Project memories]\n{truncated}")
+
+    if not parts:
+        return ""
+
+    body = "\n\n".join(parts)
+    if include_guidance:
+        return f"{MEMORY_SYSTEM_PROMPT}\n\n## MEMORY.md\n{body}"
+    return body
+
+
+# ── Relevant memory finder ─────────────────────────────────────────────────
+
+def find_relevant_memories(
+    query: str,
+    max_results: int = 5,
+    use_ai: bool = False,
+    config: dict | None = None,
+) -> list[dict]:
+    """Find memories relevant to a query.
+
+    Strategy:
+      1. Always: keyword match on name + description + content
+      2. If use_ai=True and config has a model: use a small AI call to rank
+
+    Returns:
+        List of dicts with keys: name, description, type, scope, content,
+        file_path, mtime_s, freshness_text
+    """
+    # Step 1: Keyword filter
+    keyword_results = search_memory(query)
+    if not keyword_results:
+        return []
+
+    if not use_ai or not config:
+        # Return top max_results by recency (newest first)
+        from .scan import scan_all_memories
+        headers = scan_all_memories()
+        path_to_mtime = {h.file_path: h.mtime_s for h in headers}
+
+        results = []
+        for entry in keyword_results[:max_results]:
+            mtime_s = path_to_mtime.get(entry.file_path, 0)
+            results.append({
+                "name": entry.name,
+                "description": entry.description,
+                "type": entry.type,
+                "scope": entry.scope,
+                "content": entry.content,
+                "file_path": entry.file_path,
+                "mtime_s": mtime_s,
+                "freshness_text": memory_freshness_text(mtime_s),
+            })
+        results.sort(key=lambda r: r["mtime_s"], reverse=True)
+        return results[:max_results]
+
+    # Step 2: AI-powered relevance selection (optional, lightweight)
+    return _ai_select_memories(query, keyword_results, max_results, config)
+
+
+def _ai_select_memories(
+    query: str,
+    candidates: list,
+    max_results: int,
+    config: dict,
+) -> list[dict]:
+    """Use a fast AI call to select the most relevant memories from candidates.
+
+    Falls back to keyword results on any error.
+    """
+    try:
+        from providers import stream, AssistantTurn
+        from .scan import scan_all_memories
+
+        headers = scan_all_memories()
+        path_to_mtime = {h.file_path: h.mtime_s for h in headers}
+
+        # Build manifest of candidates only
+        manifest_lines = []
+        for i, e in enumerate(candidates):
+            manifest_lines.append(f"{i}: [{e.type}] {e.name} — {e.description}")
+        manifest = "\n".join(manifest_lines)
+
+        system = (
+            "You select memories relevant to a query. "
+            "Return a JSON object with key 'indices' containing a list of integer indices "
+            f"(0-based) from the provided list. Select at most {max_results} entries. "
+            "Only include indices clearly relevant to the query. Return {\"indices\": []} if none."
+        )
+        messages = [{"role": "user", "content": f"Query: {query}\n\nMemories:\n{manifest}"}]
+
+        result_text = ""
+        for event in stream(
+            model=config.get("model", "claude-haiku-4-5-20251001"),
+            system=system,
+            messages=messages,
+            tool_schemas=[],
+            config={**config, "max_tokens": 256, "no_tools": True},
+        ):
+            if isinstance(event, AssistantTurn):
+                result_text = event.text
+                break
+
+        import json as _json
+        parsed = _json.loads(result_text)
+        selected_indices = [int(i) for i in parsed.get("indices", []) if isinstance(i, int)]
+
+    except Exception:
+        # Fall back to keyword results
+        selected_indices = list(range(min(max_results, len(candidates))))
+
+    results = []
+    for i in selected_indices[:max_results]:
+        if i < 0 or i >= len(candidates):
+            continue
+        entry = candidates[i]
+        mtime_s = path_to_mtime.get(entry.file_path, 0) if "path_to_mtime" in dir() else 0
+        results.append({
+            "name": entry.name,
+            "description": entry.description,
+            "type": entry.type,
+            "scope": entry.scope,
+            "content": entry.content,
+            "file_path": entry.file_path,
+            "mtime_s": mtime_s,
+            "freshness_text": memory_freshness_text(mtime_s),
+        })
+    return results
--- a/nano-claude-code/memory/scan.py
+++ b/nano-claude-code/memory/scan.py
@@ -0,0 +1,144 @@
+"""Memory file scanning with mtime tracking and freshness/age helpers.
+
+Mirrors the key ideas from Claude Code's memoryScan.ts and memoryAge.ts:
+  - Scan memory directories, sort newest-first
+  - Format a manifest for display or AI relevance selection
+  - Report memory age in human-readable form ("today", "3 days ago")
+  - Emit a staleness caveat for memories older than 1 day
+"""
+from __future__ import annotations
+
+import math
+import time
+from dataclasses import dataclass
+from pathlib import Path
+
+from .store import get_memory_dir, parse_frontmatter, INDEX_FILENAME
+
+MAX_MEMORY_FILES = 200
+
+
+# ── Data model ─────────────────────────────────────────────────────────────
+
+@dataclass
+class MemoryHeader:
+    """Lightweight descriptor loaded from a memory file's frontmatter.
+
+    Attributes:
+        filename:    basename of the .md file
+        file_path:   absolute path
+        mtime_s:     modification time (seconds since epoch)
+        description: value from frontmatter `description:` field
+        type:        value from frontmatter `type:` field
+        scope:       "user" or "project"
+    """
+    filename: str
+    file_path: str
+    mtime_s: float
+    description: str
+    type: str
+    scope: str
+
+
+# ── Scanning ───────────────────────────────────────────────────────────────
+
+def scan_memory_dir(mem_dir: Path, scope: str) -> list[MemoryHeader]:
+    """Scan a single memory directory and return headers sorted newest-first.
+
+    Reads only the frontmatter (first ~30 lines) for efficiency.
+    Silently skips unreadable files. Caps at MAX_MEMORY_FILES entries.
+    """
+    if not mem_dir.is_dir():
+        return []
+
+    headers: list[MemoryHeader] = []
+    for fp in mem_dir.glob("*.md"):
+        if fp.name == INDEX_FILENAME:
+            continue
+        try:
+            stat = fp.stat()
+            # Read only the first 30 lines for frontmatter
+            with fp.open(errors="replace") as fh:
+                snippet = "".join(line for _, line in zip(range(30), fh))
+            meta, _ = parse_frontmatter(snippet)
+            headers.append(MemoryHeader(
+                filename=fp.name,
+                file_path=str(fp),
+                mtime_s=stat.st_mtime,
+                description=meta.get("description", ""),
+                type=meta.get("type", ""),
+                scope=scope,
+            ))
+        except Exception:
+            continue
+
+    headers.sort(key=lambda h: h.mtime_s, reverse=True)
+    return headers[:MAX_MEMORY_FILES]
+
+
+def scan_all_memories() -> list[MemoryHeader]:
+    """Scan both user and project memory directories, merged newest-first."""
+    user_dir = get_memory_dir("user")
+    proj_dir = get_memory_dir("project")
+
+    user_headers = scan_memory_dir(user_dir, "user")
+    proj_headers = scan_memory_dir(proj_dir, "project")
+
+    combined = user_headers + proj_headers
+    combined.sort(key=lambda h: h.mtime_s, reverse=True)
+    return combined[:MAX_MEMORY_FILES]
+
+
+# ── Age / freshness ────────────────────────────────────────────────────────
+
+def memory_age_days(mtime_s: float) -> int:
+    """Days since mtime_s (floor-rounded, clamped to 0 for future times)."""
+    return max(0, math.floor((time.time() - mtime_s) / 86_400))
+
+
+def memory_age_str(mtime_s: float) -> str:
+    """Human-readable age: 'today', 'yesterday', or 'N days ago'."""
+    d = memory_age_days(mtime_s)
+    if d == 0:
+        return "today"
+    if d == 1:
+        return "yesterday"
+    return f"{d} days ago"
+
+
+def memory_freshness_text(mtime_s: float) -> str:
+    """Staleness caveat for memories two or more days old (empty string if fresh).
+
+    Motivated by user reports of stale code-state memories (file:line
+    citations to code that has since changed) being asserted as fact.
+    """
+    d = memory_age_days(mtime_s)
+    if d <= 1:
+        return ""
+    return (
+        f"This memory is {d} days old. "
+        "Memories are point-in-time observations, not live state — "
+        "claims about code behavior or file:line citations may be outdated. "
+        "Verify against current code before asserting as fact."
+    )
+
+
+# ── Manifest formatting ────────────────────────────────────────────────────
+
+def format_memory_manifest(headers: list[MemoryHeader]) -> str:
+    """Format a list of MemoryHeader as a text manifest.
+
+    Format per line:  - [type/scope] filename (age): description
+    Example:
+        - [feedback/user] feedback_testing.md (3 days ago): Don't mock DB in tests
+        - [project/project] project_freeze.md (today): Merge freeze until 2026-04-10
+    """
+    lines = []
+    for h in headers:
+        tag = f"[{h.type}/{h.scope}]" if h.type else f"[{h.scope}]"
+        age = memory_age_str(h.mtime_s)
+        if h.description:
+            lines.append(f"- {tag} {h.filename} ({age}): {h.description}")
+        else:
+            lines.append(f"- {tag} {h.filename} ({age})")
+    return "\n".join(lines)
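
The age bucketing in `memory/scan.py` can be checked with a small sketch. The functions below mirror the arithmetic in the diff but take an explicit `now` argument; that parameter, and the names `age_days`, `age_str`, and `needs_caveat`, are assumptions made here for testability (the functions in the diff call `time.time()` directly).

```python
import math

SECONDS_PER_DAY = 86_400


def age_days(mtime_s: float, now: float) -> int:
    """Whole days elapsed since mtime_s, clamped to 0 for future timestamps."""
    return max(0, math.floor((now - mtime_s) / SECONDS_PER_DAY))


def age_str(mtime_s: float, now: float) -> str:
    """'today', 'yesterday', or 'N days ago'."""
    d = age_days(mtime_s, now)
    if d == 0:
        return "today"
    if d == 1:
        return "yesterday"
    return f"{d} days ago"


def needs_caveat(mtime_s: float, now: float) -> bool:
    """True once a memory is at least two whole days old."""
    return age_days(mtime_s, now) > 1


NOW = 1_000_000_000.0  # arbitrary fixed reference time for the example
```

Note that "today" here means "less than 24 hours old", not "same calendar date", because the age is computed from elapsed seconds.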
--- a/nano-claude-code/memory/store.py
+++ b/nano-claude-code/memory/store.py
@@ -0,0 +1,223 @@
+"""File-based memory storage with user-level and project-level scopes.
+
+Storage layout:
+  user scope    : ~/.nano_claude/memory/<slug>.md
+  project scope : .nano_claude/memory/<slug>.md  (relative to cwd)
+
+MEMORY.md in each directory is the index file — rebuilt automatically after
+every save/delete. It is loaded into the system prompt to give Claude an
+overview of available memories.
+"""
+from __future__ import annotations
+
+import re
+from dataclasses import dataclass
+from pathlib import Path
+
+
+# ── Paths ──────────────────────────────────────────────────────────────────
+
+USER_MEMORY_DIR = Path.home() / ".nano_claude" / "memory"
+INDEX_FILENAME = "MEMORY.md"
+
+# Maximum lines/bytes for the index file (mirrors Claude Code limits)
+MAX_INDEX_LINES = 200
+MAX_INDEX_BYTES = 25_000
+
+
+def get_project_memory_dir() -> Path:
+    """Return the project-local memory directory (relative to cwd)."""
+    return Path.cwd() / ".nano_claude" / "memory"
+
+
+def get_memory_dir(scope: str = "user") -> Path:
+    """Return the memory directory for the given scope.
+
+    Args:
+        scope: "user" (global ~/.nano_claude/memory) or
+               "project" (.nano_claude/memory relative to cwd)
+    """
+    if scope == "project":
+        return get_project_memory_dir()
+    return USER_MEMORY_DIR
+
+
+# ── Data model ─────────────────────────────────────────────────────────────
+
+@dataclass
+class MemoryEntry:
+    """A single memory entry loaded from a .md file.
+
+    Attributes:
+        name:        human-readable name (also the display title in the index)
+        description: short one-line description (used for relevance decisions)
+        type:        "user" | "feedback" | "project" | "reference"
+        content:     body text of the memory
+        file_path:   absolute path to the .md file on disk
+        created:     date string, e.g. "2026-04-02"
+        scope:       "user" | "project" — which directory this was loaded from
+    """
+    name: str
+    description: str
+    type: str
+    content: str
+    file_path: str = ""
+    created: str = ""
+    scope: str = "user"
+
+
+# ── Helpers ────────────────────────────────────────────────────────────────
+
+def _slugify(name: str) -> str:
+    """Convert name to a filesystem-safe slug (max 60 chars)."""
+    s = name.lower().strip().replace(" ", "_")
+    s = re.sub(r"[^a-z0-9_]", "", s)
+    return s[:60]
+
+
+def parse_frontmatter(text: str) -> tuple[dict, str]:
+    """Parse ---\\nkey: value\\n---\\nbody format.
+
+    Returns:
+        (meta_dict, body_str)
+    """
+    if not text.startswith("---"):
+        return {}, text
+    parts = text.split("---", 2)
+    if len(parts) < 3:
+        return {}, text
+    meta: dict = {}
+    for line in parts[1].strip().splitlines():
+        if ":" in line:
+            key, _, val = line.partition(":")
+            meta[key.strip()] = val.strip()
+    return meta, parts[2].strip()
+
+
+def _format_entry_md(entry: MemoryEntry) -> str:
+    """Render a MemoryEntry as a markdown file with YAML frontmatter."""
+    return (
+        f"---\n"
+        f"name: {entry.name}\n"
+        f"description: {entry.description}\n"
+        f"type: {entry.type}\n"
+        f"created: {entry.created}\n"
+        f"---\n"
+        f"{entry.content}\n"
+    )
+
+
+# ── Core storage operations ────────────────────────────────────────────────
+
+def save_memory(entry: MemoryEntry, scope: str = "user") -> None:
+    """Write/update a memory file and rebuild the index for that scope.
+
+    If a memory with the same name (slug) already exists, it is overwritten.
+
+    Args:
+        entry: MemoryEntry to persist
+        scope: "user" or "project"
+    """
+    mem_dir = get_memory_dir(scope)
+    mem_dir.mkdir(parents=True, exist_ok=True)
+    slug = _slugify(entry.name)
+    fp = mem_dir / f"{slug}.md"
+    fp.write_text(_format_entry_md(entry))
+    entry.file_path = str(fp)
+    entry.scope = scope
+    _rewrite_index(scope)
+
+
+def delete_memory(name: str, scope: str = "user") -> None:
+    """Remove the memory file matching name and rebuild the index.
+
+    No error if not found.
+    """
+    mem_dir = get_memory_dir(scope)
+    slug = _slugify(name)
+    fp = mem_dir / f"{slug}.md"
+    if fp.exists():
+        fp.unlink()
+    _rewrite_index(scope)
+
+
+def load_entries(scope: str = "user") -> list[MemoryEntry]:
+    """Scan all .md files (except MEMORY.md) in a scope and return entries.
+
+    Returns:
+        List of MemoryEntry sorted by filename (the slug), not by display name.
+    """
+    mem_dir = get_memory_dir(scope)
+    if not mem_dir.exists():
+        return []
+    entries: list[MemoryEntry] = []
+    for fp in sorted(mem_dir.glob("*.md")):
+        if fp.name == INDEX_FILENAME:
+            continue
+        try:
+            text = fp.read_text()
+        except Exception:
+            continue
+        meta, body = parse_frontmatter(text)
+        entries.append(MemoryEntry(
+            name=meta.get("name", fp.stem),
+            description=meta.get("description", ""),
+            type=meta.get("type", "user"),
+            content=body,
+            file_path=str(fp),
+            created=meta.get("created", ""),
+            scope=scope,
+        ))
+    return entries
+
+
+def load_index(scope: str = "all") -> list[MemoryEntry]:
+    """Load memory entries from one or both scopes.
+
+    Args:
+        scope: "user", "project", or "all" (both combined)
+
+    Returns:
+        List of MemoryEntry (user entries first, then project).
+    """
+    if scope == "all":
+        return load_entries("user") + load_entries("project")
+    return load_entries(scope)
+
+
+def search_memory(query: str, scope: str = "all") -> list[MemoryEntry]:
+    """Case-insensitive keyword match on name + description + content.
+
+    Returns:
+        List of matching MemoryEntry objects.
+    """
+    q = query.lower()
+    results = []
+    for entry in load_index(scope):
+        haystack = f"{entry.name} {entry.description} {entry.content}".lower()
+        if q in haystack:
+            results.append(entry)
+    return results
+
+
+def _rewrite_index(scope: str) -> None:
+    """Rebuild MEMORY.md for the given scope from all .md files in that dir."""
+    mem_dir = get_memory_dir(scope)
+    if not mem_dir.exists():
+        return
+    index_path = mem_dir / INDEX_FILENAME
+    entries = load_entries(scope)
+    lines = [
+        f"- [{e.name}]({Path(e.file_path).name}) — {e.description}"
+        for e in entries
+    ]
+    index_path.write_text("\n".join(lines) + ("\n" if lines else ""))
+
+
+def get_index_content(scope: str = "user") -> str:
+    """Return raw MEMORY.md content for the given scope, or '' if absent."""
+    mem_dir = get_memory_dir(scope)
+    index_path = mem_dir / INDEX_FILENAME
+    if not index_path.exists():
+        return ""
+    return index_path.read_text().strip()
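
To illustrate how the frontmatter parser in `memory/store.py` behaves on edge cases, here is a self-contained sketch. The function body is copied from the diff; the sample text is made up for the example.

```python
def parse_frontmatter(text: str) -> tuple:
    """Copy of the parser in memory/store.py: returns (meta_dict, body_str)."""
    if not text.startswith("---"):
        return {}, text
    # maxsplit=2: only the first two '---' markers delimit the frontmatter,
    # so a later '---' in the body is left untouched.
    parts = text.split("---", 2)
    if len(parts) < 3:
        return {}, text
    meta = {}
    for line in parts[1].strip().splitlines():
        if ":" in line:
            # partition splits at the first colon, so values may contain colons.
            key, _, val = line.partition(":")
            meta[key.strip()] = val.strip()
    return meta, parts[2].strip()


sample = (
    "---\n"
    "name: DB tests\n"
    "description: Note: never mock the DB\n"
    "type: feedback\n"
    "---\n"
    "Use a real database.\n"
    "---\n"
    "Second paragraph.\n"
)
meta, body = parse_frontmatter(sample)
plain_meta, plain_body = parse_frontmatter("no frontmatter here")
```

One limitation follows from the same logic: a frontmatter value that itself contains `---` would be treated as the closing delimiter, so descriptions should avoid that sequence.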
--- a/nano-claude-code/memory/tools.py
+++ b/nano-claude-code/memory/tools.py
@@ -0,0 +1,216 @@
+"""Memory tool registrations: MemorySave, MemoryDelete, MemorySearch, MemoryList.
+
+Importing this module registers the four tools into the central registry.
+"""
+from __future__ import annotations
+
+from datetime import datetime
+
+from tool_registry import ToolDef, register_tool
+from .store import MemoryEntry, save_memory, delete_memory, load_index
+from .context import find_relevant_memories
+from .scan import scan_all_memories, format_memory_manifest
+
+
+# ── Tool implementations ───────────────────────────────────────────────────
+
+def _memory_save(params: dict, config: dict) -> str:
+    """Save or update a persistent memory entry."""
+    entry = MemoryEntry(
+        name=params["name"],
+        description=params["description"],
+        type=params["type"],
+        content=params["content"],
+        created=datetime.now().strftime("%Y-%m-%d"),
+    )
+    scope = params.get("scope", "user")
+    save_memory(entry, scope=scope)
+
+    scope_label = "project" if scope == "project" else "user"
+    return f"Memory saved: '{entry.name}' [{entry.type}/{scope_label}]"
+
+
+def _memory_delete(params: dict, config: dict) -> str:
+    """Delete a persistent memory entry by name."""
+    name = params["name"]
+    scope = params.get("scope", "user")
+    delete_memory(name, scope=scope)
+    return f"Memory deleted: '{name}' (scope: {scope})"
+
+
+def _memory_search(params: dict, config: dict) -> str:
+    """Search memories by keyword query with optional AI relevance filtering."""
+    query = params["query"]
+    use_ai = params.get("use_ai", False)
+    max_results = params.get("max_results", 5)
+
+    results = find_relevant_memories(
+        query, max_results=max_results, use_ai=use_ai, config=config
+    )
+
+    if not results:
+        return f"No memories found matching '{query}'."
+
+    lines = [f"Found {len(results)} relevant memory/memories for '{query}':", ""]
+    for r in results:
+        freshness = f"  ⚠ {r['freshness_text']}" if r["freshness_text"] else ""
+        lines.append(
+            f"[{r['type']}/{r['scope']}] {r['name']}\n"
+            f"  {r['description']}\n"
+            f"  {r['content'][:200]}{'...' if len(r['content']) > 200 else ''}"
+            f"{freshness}"
+        )
+    return "\n\n".join(lines)
+
+
+def _memory_list(params: dict, config: dict) -> str:
+    """List all memory entries with their manifest (type, scope, age, description)."""
+    headers = scan_all_memories()
+    if not headers:
+        return "No memories stored."
+
+    scope_filter = params.get("scope", "all")
+    if scope_filter != "all":
+        headers = [h for h in headers if h.scope == scope_filter]
+        if not headers:
+            return f"No {scope_filter} memories stored."
+
+    manifest = format_memory_manifest(headers)
+    return f"{len(headers)} memory/memories:\n\n{manifest}"
+
+
+# ── Tool registrations ─────────────────────────────────────────────────────
+
+register_tool(ToolDef(
+    name="MemorySave",
+    schema={
+        "name": "MemorySave",
+        "description": (
+            "Save a persistent memory entry as a markdown file with frontmatter. "
+            "Use for information that should persist across conversations: "
+            "user preferences, feedback/corrections, project context, or external references. "
+            "Do NOT save: code patterns, architecture, git history, or task state.\n\n"
+            "For feedback/project memories, structure content as: "
+            "rule/fact, then **Why:** and **How to apply:** lines."
+        ),
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "name": {
+                    "type": "string",
+                    "description": "Human-readable name (becomes the filename slug)",
+                },
+                "type": {
+                    "type": "string",
+                    "enum": ["user", "feedback", "project", "reference"],
+                    "description": (
+                        "user=preferences/role, feedback=guidance on how to work, "
+                        "project=ongoing work/decisions, reference=external system pointers"
+                    ),
+                },
+                "description": {
+                    "type": "string",
+                    "description": "Short one-line description (used for relevance decisions — be specific)",
+                },
+                "content": {
+                    "type": "string",
+                    "description": "Body text. For feedback/project: rule/fact + **Why:** + **How to apply:**",
+                },
+                "scope": {
+                    "type": "string",
+                    "enum": ["user", "project"],
+                    "description": (
+                        "'user' (default) = ~/.nano_claude/memory/ shared across projects; "
+                        "'project' = .nano_claude/memory/ local to this project"
+                    ),
+                },
+            },
+            "required": ["name", "type", "description", "content"],
+        },
+    },
+    func=_memory_save,
+    read_only=False,
+    concurrent_safe=False,
+))
+
+register_tool(ToolDef(
+    name="MemoryDelete",
+    schema={
+        "name": "MemoryDelete",
+        "description": "Delete a persistent memory entry by name.",
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "name": {"type": "string", "description": "Name of the memory to delete"},
+                "scope": {
+                    "type": "string",
+                    "enum": ["user", "project"],
+                    "description": "Scope to delete from (default: 'user')",
+                },
+            },
+            "required": ["name"],
+        },
+    },
+    func=_memory_delete,
+    read_only=False,
+    concurrent_safe=False,
+))
+
+register_tool(ToolDef(
+    name="MemorySearch",
+    schema={
+        "name": "MemorySearch",
+        "description": (
+            "Search persistent memories by keyword. Returns matching entries with "
+            "content preview and staleness warning for old memories. "
+            "Set use_ai=true to use AI-powered relevance ranking (costs a small API call)."
+        ),
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "query": {"type": "string", "description": "Search query"},
+                "max_results": {
+                    "type": "integer",
+                    "description": "Maximum results to return (default: 5)",
+                },
+                "use_ai": {
+                    "type": "boolean",
+                    "description": "Use AI relevance ranking (default: false = keyword only)",
+                },
+                "scope": {
+                    "type": "string",
+                    "enum": ["user", "project", "all"],
+                    "description": "Which scope to search (default: 'all')",
+                },
+            },
+            "required": ["query"],
+        },
+    },
+    func=_memory_search,
+    read_only=True,
+    concurrent_safe=True,
+))
+
+register_tool(ToolDef(
+    name="MemoryList",
+    schema={
+        "name": "MemoryList",
+        "description": (
+            "List all memory entries with type, scope, age, and description. "
+            "Useful for reviewing what's been remembered before deciding to save or delete."
+        ),
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "scope": {
+                    "type": "string",
+                    "enum": ["user", "project", "all"],
+                    "description": "Which scope to list (default: 'all')",
+                },
+            },
+        },
+    },
+    func=_memory_list,
+    read_only=True,
+    concurrent_safe=True,
+))
--- a/nano-claude-code/memory/types.py
+++ b/nano-claude-code/memory/types.py
@@ -0,0 +1,86 @@
+"""Memory type taxonomy and system-prompt guidance text.
+
+Four types capture context NOT derivable from the current project state.
+Code patterns, architecture, git history, and file structure are derivable
+(via grep/git/CLAUDE.md) and should NOT be saved as memories.
+"""
+
+MEMORY_TYPES = ["user", "feedback", "project", "reference"]
+
+# Condensed per-type guidance (used in system prompt injection)
+MEMORY_TYPE_DESCRIPTIONS: dict[str, str] = {
+    "user": (
+        "Information about the user's role, goals, responsibilities, and knowledge. "
+        "Helps tailor future behavior to the user's preferences."
+    ),
+    "feedback": (
+        "Guidance the user has given about how to approach work — both what to avoid "
+        "and what to keep doing. Lead with the rule, then **Why:** and **How to apply:**."
+    ),
+    "project": (
+        "Ongoing work, goals, bugs, or incidents not derivable from code or git history. "
+        "Lead with the fact/decision, then **Why:** and **How to apply:**. "
+        "Always convert relative dates to absolute dates."
+    ),
+    "reference": (
+        "Pointers to external systems (issue trackers, dashboards, Slack channels, docs)."
+    ),
+}
+
+# What NOT to save (mirrors Claude Code source)
+WHAT_NOT_TO_SAVE = """\
+## What NOT to save in memory
+- Code patterns, conventions, architecture, file paths, or project structure — derivable from the codebase.
+- Git history, recent changes, who-changed-what — use `git log` / `git blame`.
+- Debugging solutions or fix recipes — the fix is in the code; the commit message has context.
+- Anything already documented in CLAUDE.md files.
+- Ephemeral task details: in-progress work, temporary state, current conversation context.
+
+These exclusions apply even when explicitly asked. If asked to save a PR list or activity summary,
+ask what was *surprising* or *non-obvious* — that is the part worth keeping."""
+
+# Memory format example (frontmatter)
+MEMORY_FORMAT_EXAMPLE = """\
+```markdown
+---
+name: {{memory name}}
+description: {{one-line description — used to decide relevance, so be specific}}
+type: {{user | feedback | project | reference}}
+---
+
+{{memory content — for feedback/project types: rule/fact, then **Why:** and **How to apply:** lines}}
+```"""
+
+# Full guidance injected into the system prompt
+MEMORY_SYSTEM_PROMPT = """\
+## Memory system
+
+You have a persistent, file-based memory system. Memories are stored as markdown files with
+YAML frontmatter. Build this up over time so future conversations have context about the user,
+their preferences, and the work you're doing together.
+
+**Types** (save only what cannot be derived from the codebase):
+- **user** — role, goals, knowledge, preferences
+- **feedback** — guidance on how to work (corrections AND confirmations of non-obvious approaches)
+- **project** — ongoing work, decisions, deadlines not in git history
+- **reference** — pointers to external systems (Linear, Grafana, Slack, etc.)
+
+**When to save**: If the user corrects you, confirms an approach, or shares context that should
+persist beyond this conversation. For feedback: save corrections AND quiet confirmations.
+
+**Body structure for feedback/project**: Lead with the rule/fact, then:
+  **Why:** (reason given) | **How to apply:** (when this guidance kicks in)
+
+**Format**:
+{format_example}
+
+**Saving**:
+1. Write the memory to its own file (e.g. `feedback_testing.md`) using MemorySave.
+2. The index (MEMORY.md) is updated automatically; no separate step is needed.
+
+**What NOT to save**: code patterns, architecture, git history, debugging fixes,
+anything already in CLAUDE.md, or ephemeral task state.
+
+**Before recommending from memory**: A memory naming a file, function, or flag may be stale.
+Verify it still exists before acting on it. For current state, prefer `git log` or reading code.
+""".format(format_example=MEMORY_FORMAT_EXAMPLE)
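
A short sketch of how the brace escaping in `memory/types.py` plays out. `str.format` processes braces in the template only; text substituted into a field is inserted verbatim. The shortened strings below are stand-ins for `MEMORY_FORMAT_EXAMPLE` and `MEMORY_SYSTEM_PROMPT`, not the real constants.

```python
# Stand-in for MEMORY_FORMAT_EXAMPLE: this string is never passed through
# .format() itself, so its doubled braces stay doubled.
FORMAT_EXAMPLE = "name: {{memory name}}"

# Stand-in for MEMORY_SYSTEM_PROMPT: only the template's own braces are
# interpreted; the substituted value is not processed again.
PROMPT = "**Format**:\n{format_example}\n".format(format_example=FORMAT_EXAMPLE)

# By contrast, doubled braces written directly in a template collapse to
# single braces when .format() runs on it.
INLINE = "name: {{memory name}}".format()
```

The practical consequence is that the rendered system prompt shows placeholders as `{{memory name}}`, and any literal brace added to the `MEMORY_SYSTEM_PROMPT` template itself would need to be doubled.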
--- a/nano-claude-code/multi_agent/__init__.py
+++ b/nano-claude-code/multi_agent/__init__.py
@@ -0,0 +1,23 @@
+"""Multi-agent package for nano-claude-code.
+
+Provides:
+  - AgentDefinition  — typed agent definition (name, system_prompt, model, tools)
+  - SubAgentTask     — lifecycle-tracked task
+  - SubAgentManager  — thread-pool manager for spawning agents
+  - load_agent_definitions / get_agent_definition — agent registry
+"""
+from .subagent import (
+    AgentDefinition,
+    SubAgentTask,
+    SubAgentManager,
+    load_agent_definitions,
+    get_agent_definition,
+)
+
+__all__ = [
+    "AgentDefinition",
+    "SubAgentTask",
+    "SubAgentManager",
+    "load_agent_definitions",
+    "get_agent_definition",
+]
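
The agent registry exported here is documented in `subagent.py` as layering built-in, user-level, and project-level definitions, with later layers overriding earlier ones by name. The sketch below illustrates that precedence and the lenient `tools` parsing with plain dicts. `normalize_tools` and `layer_definitions` are hypothetical names for this example, not functions from the package.

```python
def normalize_tools(raw) -> list:
    """Accept a list, or a string such as "[Read, Write]" or "Read, Write"."""
    if isinstance(raw, list):
        return [str(t) for t in raw]
    if isinstance(raw, str):
        # Strip optional surrounding brackets, then split on commas.
        return [t.strip() for t in raw.strip("[]").split(",") if t.strip()]
    return []


def layer_definitions(builtin: dict, user: dict, project: dict) -> dict:
    """Later layers win: project overrides user, which overrides built-in."""
    merged = dict(builtin)
    merged.update(user)
    merged.update(project)
    return merged


builtin = {"reviewer": {"source": "built-in", "tools": ["Read", "Glob", "Grep"]}}
user = {"reviewer": {"source": "user", "tools": normalize_tools("[Read, Grep]")}}
project = {"docs": {"source": "project", "tools": normalize_tools("Read, Write")}}

defs = layer_definitions(builtin, user, project)
```

An override replaces the whole definition rather than merging fields, so a user-level `reviewer` without a `tools` entry would fall back to the "empty list = all tools" default instead of inheriting the built-in restriction.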
--- a/nano-claude-code/multi_agent/subagent.py
+++ b/nano-claude-code/multi_agent/subagent.py
@@ -0,0 +1,480 @@
+"""Threaded sub-agent system for spawning nested agent loops."""
+from __future__ import annotations
+
+import os
+import uuid
+import queue
+import subprocess
+import tempfile
+from concurrent.futures import ThreadPoolExecutor, Future
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Dict, List, Optional, Any
+
+
+# ── Agent definition ───────────────────────────────────────────────────────
+
+@dataclass
+class AgentDefinition:
+    """Definition for a specialized agent type."""
+    name: str
+    description: str = ""
+    system_prompt: str = ""   # extra instructions prepended to the base system prompt
+    model: str = ""            # model override; "" = inherit from parent
+    tools: list = field(default_factory=list)   # empty list = all tools
+    source: str = "user"       # "built-in" | "user" | "project"
+
+
+# ── Built-in agent definitions ─────────────────────────────────────────────
+
+_BUILTIN_AGENTS: Dict[str, AgentDefinition] = {
+    "general-purpose": AgentDefinition(
+        name="general-purpose",
+        description=(
+            "General-purpose agent for researching complex questions, "
+            "searching for code, and executing multi-step tasks."
+        ),
+        system_prompt="",
+        source="built-in",
+    ),
+    "coder": AgentDefinition(
+        name="coder",
+        description="Specialized coding agent for writing, reading, and modifying code.",
+        system_prompt=(
+            "You are a specialized coding assistant. Focus on:\n"
+            "- Writing clean, idiomatic code\n"
+            "- Reading and understanding existing code before modifying\n"
+            "- Making minimal targeted changes\n"
+            "- Never adding unnecessary features, comments, or error handling\n"
+        ),
+        source="built-in",
+    ),
+    "reviewer": AgentDefinition(
+        name="reviewer",
+        description="Code review agent analyzing quality, security, and correctness.",
+        system_prompt=(
+            "You are a code reviewer. Analyze code for:\n"
+            "- Correctness and logic errors\n"
+            "- Security vulnerabilities (injection, XSS, auth bypass, etc.)\n"
+            "- Performance issues\n"
+            "- Code quality and maintainability\n"
+            "Be concise and specific. Categorize findings as: Critical | Warning | Suggestion.\n"
+        ),
+        tools=["Read", "Glob", "Grep"],
+        source="built-in",
+    ),
+    "researcher": AgentDefinition(
+        name="researcher",
+        description="Research agent for exploring codebases and answering questions.",
+        system_prompt=(
+            "You are a research assistant focused on understanding codebases.\n"
+            "- Read and analyze code thoroughly before answering\n"
+            "- Provide factual, evidence-based answers\n"
+            "- Cite specific file paths and line numbers\n"
+            "- Be concise and focused\n"
+        ),
+        tools=["Read", "Glob", "Grep", "WebFetch", "WebSearch"],
+        source="built-in",
+    ),
+    "tester": AgentDefinition(
+        name="tester",
+        description="Testing agent that writes and runs tests.",
+        system_prompt=(
+            "You are a testing specialist. Your job:\n"
+            "- Write comprehensive tests for the given code\n"
+            "- Run existing tests and diagnose failures\n"
+            "- Focus on edge cases and error conditions\n"
+            "- Keep tests simple, readable, and fast\n"
+        ),
+        source="built-in",
+    ),
+}
+
+
+# ── Loading agent definitions from .md files ──────────────────────────────
+
+def _parse_agent_md(path: Path, source: str = "user") -> AgentDefinition:
+    """Parse a .md file with optional YAML frontmatter into an AgentDefinition.
+
+    File format:
+        ---
+        description: "Short description"
+        model: claude-haiku-4-5-20251001
+        tools: [Read, Write, Edit, Bash]
+        ---
+
+        System prompt body goes here...
+    """
+    content = path.read_text()
+    name = path.stem
+    description = ""
+    model = ""
+    tools: list = []
+    system_prompt_body = content
+
+    if content.startswith("---"):
+        end = content.find("---", 3)
+        if end != -1:
+            fm_text = content[3:end].strip()
+            system_prompt_body = content[end + 3:].strip()
+            try:
+                import yaml as _yaml
+                fm = _yaml.safe_load(fm_text) or {}
+            except ImportError:
+                # Manual key: value parse (no yaml dependency required)
+                fm: dict = {}
+                for line in fm_text.splitlines():
+                    if ":" in line:
+                        k, _, v = line.partition(":")
+                        fm[k.strip()] = v.strip()
+            description = str(fm.get("description", ""))
+            model = str(fm.get("model", ""))
+            raw_tools = fm.get("tools", [])
+            if isinstance(raw_tools, list):
+                tools = [str(t) for t in raw_tools]
+            elif isinstance(raw_tools, str):
+                # Handle "[Read, Write]" or "Read, Write" format
+                s = raw_tools.strip("[]")
+                tools = [t.strip() for t in s.split(",") if t.strip()]
+
+    return AgentDefinition(
+        name=name,
+        description=description,
+        system_prompt=system_prompt_body,
+        model=model,
+        tools=tools,
+        source=source,
+    )
+
+
+def load_agent_definitions() -> Dict[str, AgentDefinition]:
+    """Load all agent definitions: built-ins → user-level → project-level.
+
+    Search paths:
+      ~/.nano-claude/agents/*.md   (user-level)
+      .nano-claude/agents/*.md     (project-level, overrides user)
+    """
+    defs: Dict[str, AgentDefinition] = dict(_BUILTIN_AGENTS)
+
+    # User-level
+    user_dir = Path.home() / ".nano-claude" / "agents"
+    if user_dir.is_dir():
+        for p in sorted(user_dir.glob("*.md")):
+            try:
+                d = _parse_agent_md(p, source="user")
+                defs[d.name] = d
+            except Exception:
+                pass
+
+    # Project-level (overrides user)
+    proj_dir = Path.cwd() / ".nano-claude" / "agents"
+    if proj_dir.is_dir():
+        for p in sorted(proj_dir.glob("*.md")):
+            try:
+                d = _parse_agent_md(p, source="project")
+                defs[d.name] = d
+            except Exception:
+                pass
+
+    return defs
+
+
+def get_agent_definition(name: str) -> Optional[AgentDefinition]:
+    """Look up an agent definition by name. Returns None if not found."""
+    return load_agent_definitions().get(name)
+
+
+# ── SubAgentTask ───────────────────────────────────────────────────────────
+
+@dataclass
+class SubAgentTask:
+    """Represents a sub-agent task with lifecycle tracking."""
+    id: str
+    prompt: str
+    status: str = "pending"       # pending | running | completed | failed | cancelled
+    result: Optional[str] = None
+    depth: int = 0
+    name: str = ""                # optional human-readable name (addressable by SendMessage)
+    worktree_path: str = ""       # set if isolation="worktree"
+    worktree_branch: str = ""     # set if isolation="worktree"
+    _cancel_flag: bool = False
+    _future: Optional[Future] = field(default=None, repr=False)
+    _inbox: Any = field(default_factory=queue.Queue, repr=False)  # for send_message
+
+
+# ── Worktree helpers ───────────────────────────────────────────────────────
+
+def _git_root(cwd: str) -> Optional[str]:
+    """Return the git root directory for cwd, or None if not in a git repo."""
+    try:
+        r = subprocess.run(
+            ["git", "rev-parse", "--show-toplevel"],
+            cwd=cwd, capture_output=True, text=True, check=True,
+        )
+        return r.stdout.strip()
+    except Exception:
+        return None
+
+
+def _create_worktree(base_dir: str) -> tuple:
+    """Create a temporary git worktree.
+
+    Returns:
+        (worktree_path, branch_name)
+    Raises:
+        subprocess.CalledProcessError or OSError on failure.
+    """
+    branch = f"nano-agent-{uuid.uuid4().hex[:8]}"
+    # mkdtemp gives us a path; remove the empty dir so git can create it
+    wt_path = tempfile.mkdtemp(prefix="nano-agent-wt-")
+    os.rmdir(wt_path)
+    subprocess.run(
+        ["git", "worktree", "add", "-b", branch, wt_path],
+        cwd=base_dir, check=True, capture_output=True, text=True,
+    )
+    return wt_path, branch
+
+
+def _remove_worktree(wt_path: str, branch: str, base_dir: str) -> None:
+    """Remove a git worktree and delete its branch (best-effort)."""
+    try:
+        subprocess.run(
+            ["git", "worktree", "remove", "--force", wt_path],
+            cwd=base_dir, capture_output=True,
+        )
+    except Exception:
+        pass
+    try:
+        subprocess.run(
+            ["git", "branch", "-D", branch],
+            cwd=base_dir, capture_output=True,
+        )
+    except Exception:
+        pass
+
+
+# ── Internal helpers ───────────────────────────────────────────────────────
+
+def _agent_run(prompt, state, config, system_prompt, depth=0, cancel_check=None):
+    """Lazy-import wrapper to avoid circular dependency with agent module.
+
+    Uses absolute import so this works whether called from inside or outside
+    the multi_agent package (sys.path includes the project root).
+    """
+    import agent as _agent_mod
+    return _agent_mod.run(prompt, state, config, system_prompt, depth=depth, cancel_check=cancel_check)
+
+
+def _extract_final_text(messages):
+    """Walk backwards through messages and return the most recent non-empty assistant content."""
+    for msg in reversed(messages):
+        if msg.get("role") == "assistant" and msg.get("content"):
+            return msg["content"]
+    return None
+
+
+# ── SubAgentManager ────────────────────────────────────────────────────────
+
+class SubAgentManager:
+    """Manages concurrent sub-agent tasks using a thread pool."""
+
+    def __init__(self, max_concurrent: int = 5, max_depth: int = 5):
+        self.tasks: Dict[str, SubAgentTask] = {}
+        self._by_name: Dict[str, str] = {}   # name → task_id
+        self.max_concurrent = max_concurrent
+        self.max_depth = max_depth
+        self._pool = ThreadPoolExecutor(max_workers=max_concurrent)
+
+    def spawn(
+        self,
+        prompt: str,
+        config: dict,
+        system_prompt: str,
+        depth: int = 0,
+        agent_def: Optional[AgentDefinition] = None,
+        isolation: str = "",     # "" | "worktree"
+        name: str = "",
+    ) -> SubAgentTask:
+        """Spawn a new sub-agent task.
+
+        Args:
+            prompt:       user message for the sub-agent
+            config:       agent configuration dict (copied before modification)
+            system_prompt: base system prompt
+            depth:        current nesting depth (prevents infinite recursion)
+            agent_def:    optional AgentDefinition with model/system_prompt/tools overrides
+            isolation:    "" for normal, "worktree" for isolated git worktree
+            name:         optional human-readable name (addressable via SendMessage)
+
+        Returns:
+            SubAgentTask tracking the spawned work.
+        """
+        task_id = uuid.uuid4().hex[:12]
+        short_name = name or task_id[:8]
+        task = SubAgentTask(id=task_id, prompt=prompt, depth=depth, name=short_name)
+        self.tasks[task_id] = task
+        if name:
+            self._by_name[name] = task_id
+
+        if depth >= self.max_depth:
+            task.status = "failed"
+            task.result = f"Max depth ({self.max_depth}) exceeded"
+            return task
+
+        # Build effective config and system prompt for this sub-agent
+        eff_config = dict(config)
+        eff_system = system_prompt
+
+        if agent_def:
+            if agent_def.model:
+                eff_config["model"] = agent_def.model
+            if agent_def.system_prompt:
+                eff_system = agent_def.system_prompt.rstrip() + "\n\n" + system_prompt
+
+        # Handle worktree isolation
+        worktree_path = ""
+        worktree_branch = ""
+        base_dir = os.getcwd()
+
+        if isolation == "worktree":
+            git_root = _git_root(base_dir)
+            if not git_root:
+                task.status = "failed"
+                task.result = "isolation='worktree' requires a git repository"
+                return task
+            try:
+                worktree_path, worktree_branch = _create_worktree(git_root)
+                task.worktree_path = worktree_path
+                task.worktree_branch = worktree_branch
+                notice = (
+                    f"\n\n[Note: You are working in an isolated git worktree at "
+                    f"{worktree_path} (branch: {worktree_branch}). "
+                    f"Your changes are isolated from the main workspace at {git_root}. "
+                    f"Commit your changes before finishing so they can be reviewed/merged.]"
+                )
+                prompt = prompt + notice
+            except Exception as e:
+                task.status = "failed"
+                task.result = f"Failed to create worktree: {e}"
+                return task
+
+        def _run():
+            from agent import AgentState  # lazy import, avoids a circular dependency
+            task.status = "running"
+            old_cwd = os.getcwd()
+            try:
+                if worktree_path:
+                    os.chdir(worktree_path)
+
+                state = AgentState()
+                gen = _agent_run(
+                    prompt, state, eff_config, eff_system,
+                    depth=depth + 1,
+                    cancel_check=lambda: task._cancel_flag,
+                )
+                for _event in gen:
+                    if task._cancel_flag:
+                        break
+
+                if task._cancel_flag:
+                    task.status = "cancelled"
+                    task.result = None
+                else:
+                    task.result = _extract_final_text(state.messages)
+                    task.status = "completed"
+
+                # Drain inbox: process any messages sent via SendMessage
+                while not task._inbox.empty() and not task._cancel_flag:
+                    inbox_msg = task._inbox.get_nowait()
+                    task.status = "running"
+                    gen2 = _agent_run(
+                        inbox_msg, state, eff_config, eff_system,
+                        depth=depth + 1,
+                        cancel_check=lambda: task._cancel_flag,
+                    )
+                    for _ev in gen2:
+                        if task._cancel_flag:
+                            break
+                    if not task._cancel_flag:
+                        task.result = _extract_final_text(state.messages)
+                    task.status = "cancelled" if task._cancel_flag else "completed"
+
+            except Exception as e:
+                task.status = "failed"
+                task.result = f"Error: {e}"
+            finally:
+                if worktree_path:
+                    os.chdir(old_cwd)
+                    _remove_worktree(worktree_path, worktree_branch, old_cwd)
+
+        task._future = self._pool.submit(_run)
+        return task
+
+    def wait(self, task_id: str, timeout: Optional[float] = None) -> Optional[SubAgentTask]:
+        """Block until a task completes or timeout expires.
+
+        Returns:
+            The task, or None if task_id is unknown.
+        """
+        task = self.tasks.get(task_id)
+        if task is None:
+            return None
+        if task._future is not None:
+            try:
+                task._future.result(timeout=timeout)
+            except Exception:
+                pass
+        return task
+
+    def get_result(self, task_id: str) -> Optional[str]:
+        """Return the result string for a completed task, or None."""
+        task = self.tasks.get(task_id)
+        return task.result if task else None
+
+    def list_tasks(self) -> List[SubAgentTask]:
+        """Return all tracked tasks."""
+        return list(self.tasks.values())
+
+    def send_message(self, task_id_or_name: str, message: str) -> bool:
+        """Send a message to a running background agent.
+
+        The message is queued and the agent will process it after completing
+        its current work.
+
+        Args:
+            task_id_or_name: task ID or the human-readable name passed to spawn()
+            message:         message text to send
+
+        Returns:
+            True if the message was queued, False if task not found or already done.
+        """
+        # Resolve name → task_id
+        task_id = self._by_name.get(task_id_or_name, task_id_or_name)
+        task = self.tasks.get(task_id)
+        if task is None:
+            return False
+        if task.status not in ("running", "pending"):
+            return False
+        task._inbox.put(message)
+        return True
+
+    def cancel(self, task_id: str) -> bool:
+        """Request cancellation of a running task.
+
+        Returns:
+            True if the cancel flag was set, False if task not found or not running.
+        """
+        task = self.tasks.get(task_id)
+        if task is None:
+            return False
+        if task.status == "running":
+            task._cancel_flag = True
+            return True
+        return False
+
+    def shutdown(self) -> None:
+        """Cancel all pending and running tasks and shut down the thread pool."""
+        for task in self.tasks.values():
+            if task.status in ("pending", "running"):
+                task._cancel_flag = True
+        self._pool.shutdown(wait=True)
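The cooperative cancellation used by `SubAgentManager` (a flag that the worker polls between agent events) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's code: `DemoTask`, `fake_agent_events`, and `run_demo` are hypothetical names, and a `threading.Event` is swapped in for the plain `_cancel_flag` boolean.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class DemoTask:
    """Hypothetical stand-in for SubAgentTask (illustration only)."""
    status: str = "pending"
    result: Optional[str] = None
    cancel_event: threading.Event = field(default_factory=threading.Event)


def fake_agent_events(n: int) -> Iterator[int]:
    """Stands in for the event generator produced by the agent loop."""
    for i in range(n):
        yield i


def run_demo(cancel: bool) -> DemoTask:
    """Run one task on a pool thread, optionally cancelling it before it iterates."""
    task = DemoTask()
    started = threading.Event()
    proceed = threading.Event()

    def worker() -> None:
        task.status = "running"
        started.set()
        proceed.wait(timeout=5)  # hold until the caller has decided whether to cancel
        for _ in fake_agent_events(100):
            if task.cancel_event.is_set():
                break  # cooperative: the worker stops at the next event boundary
        if task.cancel_event.is_set():
            task.status = "cancelled"
        else:
            task.status, task.result = "completed", "done"

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(worker)
        started.wait(timeout=5)
        if cancel:
            task.cancel_event.set()
        proceed.set()
        future.result(timeout=5)
    return task
```

Cancellation here only takes effect between events, so a long-running single step is not interrupted; the same limitation applies to any flag-polling design.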
--- a/nano-claude-code/multi_agent/tools.py
+++ b/nano-claude-code/multi_agent/tools.py
@@ -0,0 +1,295 @@
+"""Multi-agent tool registrations.
+
+Registers the following tools into the central tool_registry:
+  Agent            — spawn a sub-agent (sync or background)
+  SendMessage      — send a message to a named background agent
+  CheckAgentResult — check status/result of a background agent
+  ListAgentTasks   — list all active/finished agent tasks
+  ListAgentTypes   — list available agent type definitions
+"""
+from __future__ import annotations
+
+from tool_registry import ToolDef, register_tool
+from .subagent import SubAgentManager, get_agent_definition, load_agent_definitions
+
+
+# ── Singleton manager ──────────────────────────────────────────────────────
+
+_agent_manager: SubAgentManager | None = None
+
+
+def get_agent_manager() -> SubAgentManager:
+    """Return (and lazily create) the process-wide SubAgentManager."""
+    global _agent_manager
+    if _agent_manager is None:
+        _agent_manager = SubAgentManager()
+    return _agent_manager
+
+
+# ── Tool implementations ───────────────────────────────────────────────────
+
+def _agent_tool(params: dict, config: dict) -> str:
+    """Spawn a sub-agent.
+
+    Reads from config:
+      _system_prompt  — injected by agent.py run(), used as base system prompt
+      _depth          — current nesting depth (prevents infinite recursion)
+    """
+    mgr = get_agent_manager()
+
+    prompt = params["prompt"]
+    wait = params.get("wait", True)
+    isolation = params.get("isolation", "")
+    name = params.get("name", "")
+    model_override = params.get("model", "")
+    subagent_type = params.get("subagent_type", "")
+
+    system_prompt = config.get("_system_prompt", "You are a helpful assistant.")
+    depth = config.get("_depth", 0)
+
+    # Strip private keys before passing to sub-agent
+    eff_config = {k: v for k, v in config.items() if not k.startswith("_")}
+    if model_override:
+        eff_config["model"] = model_override
+
+    # Resolve agent definition
+    agent_def = None
+    if subagent_type:
+        agent_def = get_agent_definition(subagent_type)
+        if agent_def is None:
+            return (
+                f"Error: unknown subagent_type '{subagent_type}'. "
+                "Use ListAgentTypes to see available types."
+            )
+
+    task = mgr.spawn(
+        prompt, eff_config, system_prompt,
+        depth=depth,
+        agent_def=agent_def,
+        isolation=isolation,
+        name=name,
+    )
+
+    if task.status == "failed":
+        return f"Error spawning agent: {task.result}"
+
+    if wait:
+        mgr.wait(task.id, timeout=300)
+        result = task.result or f"(no output — status: {task.status})"
+        header = f"[Agent: {task.name}"
+        if subagent_type:
+            header += f" ({subagent_type})"
+        if task.worktree_branch:
+            header += f", branch: {task.worktree_branch}"
+        header += "]"
+        return f"{header}\n\n{result}"
+    else:
+        info_parts = [f"Task ID: {task.id}", f"Name: {task.name}", f"Status: {task.status}"]
+        if subagent_type:
+            info_parts.append(f"Type: {subagent_type}")
+        if task.worktree_branch:
+            info_parts.append(f"Worktree branch: {task.worktree_branch}")
+        info_parts.append("Use CheckAgentResult or SendMessage to interact with this agent.")
+        return "\n".join(info_parts)
+
+
+def _send_message(params: dict, config: dict) -> str:
+    mgr = get_agent_manager()
+    target = params["to"]
+    message = params["message"]
+    ok = mgr.send_message(target, message)
+    if ok:
+        return f"Message queued for agent '{target}'. It will be processed after current work completes."
+    task_id = mgr._by_name.get(target, target)
+    task = mgr.tasks.get(task_id)
+    if task is None:
+        return f"Error: no agent found with id or name '{target}'"
+    return f"Error: agent '{target}' is not running (status: {task.status}). Cannot send message."
+
+
+def _check_agent_result(params: dict, config: dict) -> str:
+    mgr = get_agent_manager()
+    task_id = params["task_id"]
+    task = mgr.tasks.get(task_id)
+    if task is None:
+        return f"Error: no task with id '{task_id}'"
+    lines = [f"Status: {task.status}", f"Name: {task.name}"]
+    if task.worktree_branch:
+        lines.append(f"Worktree branch: {task.worktree_branch}")
+    if task.result:
+        lines.append(f"\nResult:\n{task.result}")
+    return "\n".join(lines)
+
+
+def _list_agent_tasks(params: dict, config: dict) -> str:
+    mgr = get_agent_manager()
+    tasks = mgr.list_tasks()
+    if not tasks:
+        return "No sub-agent tasks."
+    lines = ["ID           | Name     | Status    | Worktree branch | Prompt"]
+    lines.append("-------------|----------|-----------|-----------------|------")
+    for t in tasks:
+        prompt_short = t.prompt[:50] + ("..." if len(t.prompt) > 50 else "")
+        wt = t.worktree_branch[:15] if t.worktree_branch else "-"
+        lines.append(f"{t.id} | {t.name[:8]:8s} | {t.status:9s} | {wt:15s} | {prompt_short}")
+    return "\n".join(lines)
+
+
+def _list_agent_types(params: dict, config: dict) -> str:
+    defs = load_agent_definitions()
+    if not defs:
+        return "No agent types available."
+    lines = ["Available agent types:", ""]
+    for aname, d in sorted(defs.items()):
+        model_info = f"  model: {d.model}" if d.model else ""
+        tools_info = f"  tools: {', '.join(d.tools)}" if d.tools else ""
+        lines.append(f"  {aname:20s}  [{d.source:8s}]  {d.description}")
+        if model_info:
+            lines.append(f"                           {model_info}")
+        if tools_info:
+            lines.append(f"                           {tools_info}")
+    lines.append("")
+    lines.append(
+        "Create custom agents: place .md files in ~/.nano-claude/agents/ or .nano-claude/agents/"
+    )
+    return "\n".join(lines)
+
+
+# ── Tool registrations ─────────────────────────────────────────────────────
+
+register_tool(ToolDef(
+    name="Agent",
+    schema={
+        "name": "Agent",
+        "description": (
+            "Spawn a sub-agent to handle a task autonomously. The sub-agent runs in a "
+            "separate thread with its own conversation history. Supports specialized agent "
+            "types (coder, reviewer, researcher, tester, or custom from .nano-claude/agents/), "
+            "isolated git worktrees for parallel work, and background execution.\n\n"
+            "When using isolation='worktree', the agent gets its own git branch and "
+            "working copy — ideal for parallel coding tasks that shouldn't interfere."
+        ),
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "prompt": {
+                    "type": "string",
+                    "description": "Task description for the sub-agent",
+                },
+                "subagent_type": {
+                    "type": "string",
+                    "description": (
+                        "Specialized agent type: 'general-purpose', 'coder', 'reviewer', "
+                        "'researcher', 'tester', or any custom type. "
+                        "Use ListAgentTypes to see all available types."
+                    ),
+                },
+                "name": {
+                    "type": "string",
+                    "description": (
+                        "Human-readable name for this agent instance. "
+                        "Makes it addressable via SendMessage while running in background."
+                    ),
+                },
+                "model": {
+                    "type": "string",
+                    "description": "Model override for this specific agent (optional)",
+                },
+                "wait": {
+                    "type": "boolean",
+                    "description": (
+                        "Block until complete (default: true). "
+                        "Set false to run in background."
+                    ),
+                },
+                "isolation": {
+                    "type": "string",
+                    "enum": ["worktree"],
+                    "description": (
+                        "'worktree' creates a temporary git worktree so the agent works "
+                        "on an isolated copy of the repo. Changes stay on a separate branch "
+                        "and can be reviewed/merged after completion."
+                    ),
+                },
+            },
+            "required": ["prompt"],
+        },
+    },
+    func=_agent_tool,
+    read_only=False,
+    concurrent_safe=False,
+))
+
+register_tool(ToolDef(
+    name="SendMessage",
+    schema={
+        "name": "SendMessage",
+        "description": (
+            "Send a follow-up message to a running background agent. "
+            "The message is queued and processed after the agent finishes its current work. "
+            "Reference agents by the name set via Agent(name=...) or by task ID."
+        ),
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "to":      {"type": "string", "description": "Agent name or task ID"},
+                "message": {"type": "string", "description": "Message to send to the agent"},
+            },
+            "required": ["to", "message"],
+        },
+    },
+    func=_send_message,
+    read_only=False,
+    concurrent_safe=True,
+))
+
+register_tool(ToolDef(
+    name="CheckAgentResult",
+    schema={
+        "name": "CheckAgentResult",
+        "description": "Check the status and result of a spawned sub-agent task.",
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "task_id": {"type": "string", "description": "Task ID returned by Agent tool"},
+            },
+            "required": ["task_id"],
+        },
+    },
+    func=_check_agent_result,
+    read_only=True,
+    concurrent_safe=True,
+))
+
+register_tool(ToolDef(
+    name="ListAgentTasks",
+    schema={
+        "name": "ListAgentTasks",
+        "description": "List all sub-agent tasks and their statuses.",
+        "input_schema": {
+            "type": "object",
+            "properties": {},
+        },
+    },
+    func=_list_agent_tasks,
+    read_only=True,
+    concurrent_safe=True,
+))
+
+register_tool(ToolDef(
+    name="ListAgentTypes",
+    schema={
+        "name": "ListAgentTypes",
+        "description": (
+            "List all available agent types (built-in and custom). "
+            "Use the type names as subagent_type when calling Agent."
+        ),
+        "input_schema": {
+            "type": "object",
+            "properties": {},
+        },
+    },
+    func=_list_agent_types,
+    read_only=True,
+    concurrent_safe=True,
+))
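The `Agent` tool strips underscore-prefixed keys from `config` before handing it to the sub-agent, then applies any per-call model override. A minimal sketch of that step as a standalone function follows; `build_sub_config` is a hypothetical helper name and the sample keys and values are illustrative only.

```python
def build_sub_config(config: dict, model_override: str = "") -> dict:
    """Copy config without private (underscore-prefixed) keys, then apply the override."""
    eff_config = {k: v for k, v in config.items() if not k.startswith("_")}
    if model_override:
        eff_config["model"] = model_override
    return eff_config


# Illustrative parent config: the private keys are dropped for the child.
parent = {"model": "parent-model", "_depth": 2, "_system_prompt": "..."}
child = build_sub_config(parent, model_override="child-model")
print(child)
```

Because the comprehension builds a new dict, the parent's config is left untouched, which matters when several sub-agents are spawned from the same parent.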
--- a/nano-claude-code/nano_claude.py
+++ b/nano-claude-code/nano_claude.py
@@ -26,6 +26,9 @@ Slash commands in REPL:
  /thinking   Toggle extended thinking
  /permissions [mode]  Set permission mode
  /cwd [path] Show or change working directory
+  /memory [query]   Show/search persistent memories
+  /skills           List available skills
+  /agents           Show sub-agent tasks
  /exit /quit Exit
 """
 from __future__ import annotations
@@ -33,13 +36,16 @@ from __future__ import annotations
 import os
 import sys
 import json
-import readline
+try:
+    import readline
+except ImportError:
+    readline = None  # Windows compatibility
 import atexit
 import argparse
 import textwrap
 from pathlib import Path
 from datetime import datetime
-from typing import Optional
+from typing import Optional, Union

 # ── Optional rich for markdown rendering ──────────────────────────────────
 try:
@@ -78,6 +84,25 @@ def warn(msg: str):   print(clr(f"Warning: {msg}", "yellow"))
 def err(msg: str):    print(clr(f"Error: {msg}", "red"), file=sys.stderr)


+def render_diff(text: str):
+    """Print diff text with ANSI colors: red for removals, green for additions."""
+    for line in text.splitlines():
+        if line.startswith("+++") or line.startswith("---"):
+            print(C["bold"] + line + C["reset"])
+        elif line.startswith("+"):
+            print(C["green"] + line + C["reset"])
+        elif line.startswith("-"):
+            print(C["red"] + line + C["reset"])
+        elif line.startswith("@@"):
+            print(C["cyan"] + line + C["reset"])
+        else:
+            print(line)
+
+def _has_diff(text: str) -> bool:
+    """Check if text contains a unified diff."""
+    return "--- a/" in text and "+++ b/" in text
+
+
 # ── Conversation rendering ─────────────────────────────────────────────────

 _accumulated_text: list[str] = []   # buffer text during streaming
@@ -118,6 +143,12 @@ def print_tool_end(name: str, result: str, verbose: bool):
    summary = f"→ {lines} lines ({size} chars)"
    if not result.startswith("Error") and not result.startswith("Denied"):
        print(clr(f"  ✓ {summary}", "dim", "green"), flush=True)
+        # Render diff for Edit/Write results
+        if name in ("Edit", "Write") and _has_diff(result):
+            parts = result.split("\n\n", 1)
+            if len(parts) == 2:
+                print(clr(f"  {parts[0]}", "dim"))
+                render_diff(parts[1])
    else:
        print(clr(f"  ✗ {result[:120]}", "dim", "red"), flush=True)
    if verbose and not result.startswith("Denied"):
@@ -131,8 +162,26 @@ def _tool_desc(name: str, inputs: dict) -> str:
    if name == "Bash":   return f"Bash({inputs.get('command','')[:80]})"
    if name == "Glob":   return f"Glob({inputs.get('pattern','')})"
    if name == "Grep":   return f"Grep({inputs.get('pattern','')})"
-    if name == "WebFetch":  return f"WebFetch({inputs.get('url','')[:60]})"
-    if name == "WebSearch": return f"WebSearch({inputs.get('query','')})"
+    if name == "WebFetch":    return f"WebFetch({inputs.get('url','')[:60]})"
+    if name == "WebSearch":   return f"WebSearch({inputs.get('query','')})"
+    if name == "Agent":
+        atype = inputs.get("subagent_type", "")
+        aname = inputs.get("name", "")
+        iso   = inputs.get("isolation", "")
+        bg    = not inputs.get("wait", True)
+        parts = []
+        if atype:  parts.append(atype)
+        if aname:  parts.append(f"name={aname}")
+        if iso:    parts.append(f"isolation={iso}")
+        if bg:     parts.append("background")
+        suffix = f"({', '.join(parts)})" if parts else ""
+        prompt_short = inputs.get("prompt", "")[:60]
+        return f"Agent{suffix}: {prompt_short}"
+    if name == "SendMessage":
+        return f"SendMessage(to={inputs.get('to','')}: {inputs.get('message','')[:50]})"
+    if name == "CheckAgentResult": return f"CheckAgentResult({inputs.get('task_id','')})"
+    if name == "ListAgentTasks":   return "ListAgentTasks()"
+    if name == "ListAgentTypes":   return "ListAgentTypes()"
    return f"{name}({list(inputs.values())[:1]})"


@@ -351,6 +400,101 @@ def cmd_exit(_args: str, _state, _config) -> bool:
    ok("Goodbye!")
    sys.exit(0)

+def cmd_memory(args: str, _state, _config) -> bool:
+    from memory import search_memory, load_index
+    from memory.scan import scan_all_memories, format_memory_manifest, memory_freshness_text
+
+    if args.strip():
+        results = search_memory(args.strip())
+        if not results:
+            info(f"No memories matching '{args.strip()}'")
+            return True
+        info(f"  {len(results)} result(s) for '{args.strip()}':")
+        for m in results:
+            info(f"  [{m.type:9s}|{m.scope:7s}] {m.name}: {m.description}")
+            info(f"    {m.content[:120]}{'...' if len(m.content) > 120 else ''}")
+        return True
+
+    # Show manifest with age/freshness
+    headers = scan_all_memories()
+    if not headers:
+        info("No memories stored. The model saves memories via MemorySave.")
+        return True
+    info(f"  {len(headers)} memor{'y' if len(headers) == 1 else 'ies'} (newest first):")
+    for h in headers:
+        fresh_warn = "  ⚠ stale" if memory_freshness_text(h.mtime_s) else ""
+        tag = f"[{h.type or '?':9s}|{h.scope:7s}]"
+        info(f"  {tag} {h.filename}{fresh_warn}")
+        if h.description:
+            info(f"    {h.description}")
+    return True
+
+def cmd_agents(_args: str, _state, _config) -> bool:
+    try:
+        from multi_agent.tools import get_agent_manager
+        mgr = get_agent_manager()
+        tasks = mgr.list_tasks()
+        if not tasks:
+            info("No sub-agent tasks.")
+            return True
+        info(f"  {len(tasks)} sub-agent task(s):")
+        for t in tasks:
+            preview = t.prompt[:50] + ("..." if len(t.prompt) > 50 else "")
+            wt_info = f"  branch:{t.worktree_branch}" if t.worktree_branch else ""
+            info(f"  {t.id} [{t.status:9s}] name={t.name}{wt_info}  {preview}")
+    except Exception:
+        info("Sub-agent system not initialized.")
+    return True
+
+
+def _print_background_notifications():
+    """Print notifications for newly completed background agent tasks.
+
+    Called before each user prompt so the user sees results without polling.
+    """
+    try:
+        from multi_agent.tools import get_agent_manager
+        mgr = get_agent_manager()
+    except Exception:
+        return
+
+    # Task IDs already announced are remembered on the function object across calls.
+    if not hasattr(_print_background_notifications, "_seen"):
+        _print_background_notifications._seen = set()
+
+    for task in mgr.list_tasks():
+        if task.id in _print_background_notifications._seen:
+            continue
+        if task.status in ("completed", "failed", "cancelled"):
+            _print_background_notifications._seen.add(task.id)
+            icon = "✓" if task.status == "completed" else "✗"
+            color = "green" if task.status == "completed" else "red"
+            branch_info = f" [branch: {task.worktree_branch}]" if task.worktree_branch else ""
+            print(clr(
+                f"\n  {icon} Background agent '{task.name}' {task.status}{branch_info}",
+                color, "bold"
+            ))
+            if task.result:
+                preview = task.result[:200] + ("..." if len(task.result) > 200 else "")
+                print(clr(f"    {preview}", "dim"))
+            print()
+
+def cmd_skills(_args: str, _state, _config) -> bool:
+    from skill import load_skills
+    skills = load_skills()
+    if not skills:
+        info("No skills found.")
+        return True
+    info(f"Available skills ({len(skills)}):")
+    for s in skills:
+        triggers = ", ".join(s.triggers)
+        source_label = f"[{s.source}]" if s.source != "builtin" else ""
+        hint = f"  args: {s.argument_hint}" if s.argument_hint else ""
+        print(f"  {clr(s.name, 'cyan'):24s} {s.description}  {clr(triggers, 'dim')}{hint} {clr(source_label, 'yellow')}")
+        if s.when_to_use:
+            print(f"    {clr(s.when_to_use[:80], 'dim')}")
+    return True
+
 COMMANDS = {
    "help":        cmd_help,
    "clear":       cmd_clear,
@@ -365,13 +509,16 @@ COMMANDS = {
    "thinking":    cmd_thinking,
    "permissions": cmd_permissions,
    "cwd":         cmd_cwd,
+    "skills":      cmd_skills,
+    "memory":      cmd_memory,
+    "agents":      cmd_agents,
    "exit":        cmd_exit,
    "quit":        cmd_exit,
 }


-def handle_slash(line: str, state, config) -> bool:
-    """Handle /command [args]. Returns True if handled."""
+def handle_slash(line: str, state, config) -> Union[bool, tuple]:
+    """Handle /command [args]. Returns False if not a slash command, True if handled, or a (skill, args) tuple on a skill match."""
    if not line.startswith("/"):
        return False
    parts = line[1:].split(None, 1)
@@ -383,6 +530,15 @@ def handle_slash(line: str, state, config) -> bool:
    if handler:
        handler(args, state, config)
        return True
+
+    # Fall through to skill lookup
+    from skill import find_skill
+    skill = find_skill(line)
+    if skill:
+        cmd_parts = line.strip().split(maxsplit=1)
+        skill_args = cmd_parts[1] if len(cmd_parts) > 1 else ""
+        return (skill, skill_args)
+
    err(f"Unknown command: /{cmd}  (type /help for commands)")
    return True

@@ -390,6 +546,8 @@ def handle_slash(line: str, state, config) -> bool:
 # ── Input history setup ────────────────────────────────────────────────────

 def setup_readline(history_file: Path):
+    if readline is None:
+        return
    try:
        readline.read_history_file(str(history_file))
    except FileNotFoundError:
@@ -487,6 +645,8 @@ def repl(config: dict, initial_prompt: str = None):
        return

    while True:
+        # Show notifications for background agents that finished
+        _print_background_notifications()
        try:
            cwd_short = Path.cwd().name
            prompt = clr(f"\n[{cwd_short}] ", "dim") + clr("❯ ", "cyan", "bold")
@@ -498,7 +658,19 @@ def repl(config: dict, initial_prompt: str = None):

        if not user_input:
            continue
-        if handle_slash(user_input, state, config):
+
+        result = handle_slash(user_input, state, config)
+        if isinstance(result, tuple):
+            skill, skill_args = result
+            info(f"Running skill: {skill.name}" + (f" [{skill.context}]" if skill.context == "fork" else ""))
+            try:
+                from skill import substitute_arguments
+                rendered = substitute_arguments(skill.prompt, skill_args, skill.arguments)
+                run_query(f"[Skill: {skill.name}]\n\n{rendered}")
+            except KeyboardInterrupt:
+                print(clr("\n  (interrupted)", "yellow"))
+            continue
+        if result:
            continue

        try:
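The three-way return value of `handle_slash` (not a command, handled, or a skill match) can be sketched as a small dispatcher. This is a simplified illustration, not the code above: `dispatch_slash` and `classify` are hypothetical names, and skills are assumed to be a plain dict keyed by command name rather than resolved through `find_skill`.

```python
from typing import Callable, Dict, Tuple, Union


def dispatch_slash(
    line: str,
    commands: Dict[str, Callable[[str], None]],
    skills: Dict[str, str],
) -> Union[bool, Tuple[str, str]]:
    """Return False (not a command), True (consumed), or (skill, args)."""
    if not line.startswith("/"):
        return False
    parts = line[1:].split(None, 1)
    cmd = parts[0].lower() if parts else ""
    args = parts[1] if len(parts) > 1 else ""
    handler = commands.get(cmd)
    if handler:
        handler(args)
        return True
    if cmd in skills:
        return (skills[cmd], args)  # the caller runs the skill
    print(f"Unknown command: /{cmd}")
    return True  # reported to the user, so the input is still consumed


def classify(result: Union[bool, Tuple[str, str]]) -> str:
    """Check for a tuple first: a non-empty tuple is truthy, like True."""
    if isinstance(result, tuple):
        return "skill"
    return "handled" if result else "query"
```

The ordering in `classify` mirrors the REPL change above: testing `isinstance(result, tuple)` before the plain truthiness check keeps a skill match from being mistaken for an ordinary handled command.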
--- a/nano-claude-code/providers.py
+++ b/nano-claude-code/providers.py
@@ -29,6 +29,7 @@ PROVIDERS: dict[str, dict] = {
    "anthropic": {
        "type":       "anthropic",
        "api_key_env": "ANTHROPIC_API_KEY",
+        "context_limit": 200000,
        "models": [
            "claude-opus-4-6", "claude-sonnet-4-6", "claude-haiku-4-5-20251001",
            "claude-opus-4-5", "claude-sonnet-4-5",
@@ -39,6 +40,7 @@ PROVIDERS: dict[str, dict] = {
        "type":       "openai",
        "api_key_env": "OPENAI_API_KEY",
        "base_url":   "https://api.openai.com/v1",
+        "context_limit": 128000,
        "models": [
            "gpt-4o", "gpt-4o-mini", "gpt-4-turbo",
            "o3-mini", "o1", "o1-mini",
@@ -48,6 +50,7 @@ PROVIDERS: dict[str, dict] = {
        "type":       "openai",
        "api_key_env": "GEMINI_API_KEY",
        "base_url":   "https://generativelanguage.googleapis.com/v1beta/openai/",
+        "context_limit": 1000000,
        "models": [
            "gemini-2.5-pro-preview-03-25",
            "gemini-2.0-flash", "gemini-2.0-flash-lite",
@@ -58,6 +61,7 @@ PROVIDERS: dict[str, dict] = {
        "type":       "openai",
        "api_key_env": "MOONSHOT_API_KEY",
        "base_url":   "https://api.moonshot.cn/v1",
+        "context_limit": 128000,
        "models": [
            "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
            "kimi-latest",
@@ -67,6 +71,7 @@ PROVIDERS: dict[str, dict] = {
        "type":       "openai",
        "api_key_env": "DASHSCOPE_API_KEY",
        "base_url":   "https://dashscope.aliyuncs.com/compatible-mode/v1",
+        "context_limit": 1000000,
        "models": [
            "qwen-max", "qwen-plus", "qwen-turbo", "qwen-long",
            "qwen2.5-72b-instruct", "qwen2.5-coder-32b-instruct",
@@ -77,6 +82,7 @@ PROVIDERS: dict[str, dict] = {
        "type":       "openai",
        "api_key_env": "ZHIPU_API_KEY",
        "base_url":   "https://open.bigmodel.cn/api/paas/v4/",
+        "context_limit": 128000,
        "models": [
            "glm-4-plus", "glm-4", "glm-4-flash", "glm-4-air",
            "glm-z1-flash",
@@ -86,6 +92,7 @@ PROVIDERS: dict[str, dict] = {
        "type":       "openai",
        "api_key_env": "DEEPSEEK_API_KEY",
        "base_url":   "https://api.deepseek.com/v1",
+        "context_limit": 64000,
        "models": [
            "deepseek-chat", "deepseek-coder", "deepseek-reasoner",
        ],
@@ -95,6 +102,7 @@ PROVIDERS: dict[str, dict] = {
        "api_key_env": None,
        "base_url":   "http://localhost:11434/v1",
        "api_key":    "ollama",
+        "context_limit": 128000,
        "models": [
            "llama3.3", "llama3.2", "phi4", "mistral", "mixtral",
            "qwen2.5-coder", "deepseek-r1", "gemma3",
@@ -105,12 +113,14 @@ PROVIDERS: dict[str, dict] = {
        "api_key_env": None,
        "base_url":   "http://localhost:1234/v1",
        "api_key":    "lm-studio",
+        "context_limit": 128000,
        "models": [],   # dynamic, depends on loaded model
    },
    "custom": {
        "type":       "openai",
        "api_key_env": "CUSTOM_API_KEY",
        "base_url":   None,   # read from config["custom_base_url"]
+        "context_limit": 128000,
        "models": [],
    },
 }
@@ -277,8 +287,9 @@ def messages_to_openai(messages: list) -> list:
            msg: dict = {"role": "assistant", "content": m.get("content") or None}
            tcs = m.get("tool_calls", [])
            if tcs:
-                msg["tool_calls"] = [
-                    {
+                msg["tool_calls"] = []
+                for tc in tcs:
+                    tc_msg = {
                        "id":   tc["id"],
                        "type": "function",
                        "function": {
@@ -286,8 +297,10 @@ def messages_to_openai(messages: list) -> list:
                            "arguments": json.dumps(tc["input"], ensure_ascii=False),
                        },
                    }
-                    for tc in tcs
-                ]
+                    # Pass through provider-specific fields (e.g. Gemini thought_signature)
+                    if tc.get("extra_content"):
+                        tc_msg["extra_content"] = tc["extra_content"]
+                    msg["tool_calls"].append(tc_msg)
            result.append(msg)

        elif role == "tool":
@@ -425,7 +438,7 @@ def stream_openai_compat(
            for tc in delta.tool_calls:
                idx = tc.index
                if idx not in tool_buf:
-                    tool_buf[idx] = {"id": "", "name": "", "args": ""}
+                    tool_buf[idx] = {"id": "", "name": "", "args": "", "extra_content": None}
                if tc.id:
                    tool_buf[idx]["id"] = tc.id
                if tc.function:
@@ -433,6 +446,10 @@ def stream_openai_compat(
                        tool_buf[idx]["name"] += tc.function.name
                    if tc.function.arguments:
                        tool_buf[idx]["args"] += tc.function.arguments
+                # Capture extra_content (e.g. Gemini thought_signature)
+                extra = getattr(tc, "extra_content", None)
+                if extra:
+                    tool_buf[idx]["extra_content"] = extra

        # Some providers include usage in the last chunk
        if hasattr(chunk, "usage") and chunk.usage:
@@ -446,7 +463,10 @@ def stream_openai_compat(
            inp = json.loads(v["args"]) if v["args"] else {}
        except json.JSONDecodeError:
            inp = {"_raw": v["args"]}
-        tool_calls.append({"id": v["id"] or f"call_{idx}", "name": v["name"], "input": inp})
+        tc_entry = {"id": v["id"] or f"call_{idx}", "name": v["name"], "input": inp}
+        if v.get("extra_content"):
+            tc_entry["extra_content"] = v["extra_content"]
+        tool_calls.append(tc_entry)

    yield AssistantTurn(text, tool_calls, in_tok, out_tok)
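
Note on the `extra_content` pass-through in the two hunks above: the following is a minimal, self-contained sketch of the same accumulate-then-finalize pattern, not the actual `stream_openai_compat` body. The function name `accumulate_tool_calls` is hypothetical, and the shape of `extra_content` is treated as an opaque provider payload.

```python
import json


def accumulate_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete tool calls.

    `deltas` is any iterable of objects with `.index`, `.id`, `.function`
    (with `.name` / `.arguments`) and, optionally, `.extra_content`.
    """
    tool_buf = {}
    for tc in deltas:
        buf = tool_buf.setdefault(
            tc.index, {"id": "", "name": "", "args": "", "extra_content": None}
        )
        if getattr(tc, "id", None):
            buf["id"] = tc.id
        fn = getattr(tc, "function", None)
        if fn:
            if fn.name:
                buf["name"] += fn.name
            if fn.arguments:
                buf["args"] += fn.arguments
        # Provider-specific payload: keep the last non-empty value as-is.
        extra = getattr(tc, "extra_content", None)
        if extra:
            buf["extra_content"] = extra

    calls = []
    for idx in sorted(tool_buf):
        v = tool_buf[idx]
        try:
            inp = json.loads(v["args"]) if v["args"] else {}
        except json.JSONDecodeError:
            inp = {"_raw": v["args"]}  # keep unparseable arguments for debugging
        entry = {"id": v["id"] or f"call_{idx}", "name": v["name"], "input": inp}
        if v["extra_content"]:
            entry["extra_content"] = v["extra_content"]
        calls.append(entry)
    return calls
```

The key is only attached when a provider actually sent it, so providers that never send `extra_content` produce the same tool-call dicts as before this change.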

--- a/nano-claude-code/skill/__init__.py
+++ b/nano-claude-code/skill/__init__.py
@@ -0,0 +1,14 @@
+"""skill package — reusable prompt templates (skills)."""
+from .loader import (  # noqa: F401
+    SkillDef,
+    load_skills,
+    find_skill,
+    substitute_arguments,
+    register_builtin_skill,
+    _parse_skill_file,
+    _parse_list_field,
+)
+from .executor import execute_skill  # noqa: F401
+
+# Importing builtin registers the built-in skills
+from . import builtin as _builtin  # noqa: F401
--- a/nano-claude-code/skill/builtin.py
+++ b/nano-claude-code/skill/builtin.py
@@ -0,0 +1,100 @@
+"""Built-in skills that ship with nano-claude-code."""
+from __future__ import annotations
+
+from .loader import SkillDef, register_builtin_skill
+
+# ── /commit ────────────────────────────────────────────────────────────────
+
+_COMMIT_PROMPT = """\
+Review the current git state and create a well-structured commit.
+
+## Steps
+
+1. Run `git status` and `git diff --staged` to see what is staged.
+   - If nothing is staged, run `git diff` to see unstaged changes, then stage relevant files.
+2. Analyze the changes:
+   - Summarize the nature of the change (feature, bug fix, refactor, docs, etc.)
+   - Write a concise commit title (≤72 chars) focusing on *why*, not just *what*.
+   - If multiple logical changes exist, ask the user whether to split them.
+3. Create the commit:
+   ```
+   git commit -m "<title>"
+   ```
+   If additional context is needed, add a body separated by a blank line.
+4. Print the commit hash and summary when done.
+
+**Rules:**
+- Never use `--no-verify`.
+- Never commit files that likely contain secrets (.env, credentials, keys).
+- Prefer imperative mood in the title: "Add X", "Fix Y", "Refactor Z".
+
+User context: $ARGUMENTS
+"""
+
+_REVIEW_PROMPT = """\
+Review the code or pull request and provide structured feedback.
+
+## Steps
+
+1. Understand the scope:
+   - If a PR number or URL is given in $ARGUMENTS, use `gh pr diff $ARGUMENTS` to get the diff.
+   - Otherwise, use `git diff main...HEAD` (or `git diff HEAD~1`) for local changes.
+2. Analyze the diff:
+   - Correctness: Are there bugs, edge cases, or logic errors?
+   - Security: Injection, auth issues, exposed secrets, unsafe operations?
+   - Performance: N+1 queries, unnecessary allocations, blocking calls?
+   - Style: Does it follow existing conventions in the codebase?
+   - Tests: Are new behaviors tested? Do existing tests cover the change?
+3. Write a structured review:
+   ```
+   ## Summary
+   One-line overview of what the change does.
+
+   ## Issues
+   - [CRITICAL/MAJOR/MINOR] Description and location
+
+   ## Suggestions
+   - Nice-to-have improvements
+
+   ## Verdict
+   APPROVE / REQUEST CHANGES / COMMENT
+   ```
+4. If changes are needed, list specific file:line references.
+
+User context: $ARGUMENTS
+"""
+
+
+def _register_builtins() -> None:
+    register_builtin_skill(SkillDef(
+        name="commit",
+        description="Review staged changes and create a well-structured git commit",
+        triggers=["/commit"],
+        tools=["Bash", "Read"],
+        prompt=_COMMIT_PROMPT,
+        file_path="<builtin>",
+        when_to_use="Use when the user wants to commit changes. Triggers: '/commit', 'commit changes', 'make a commit'.",
+        argument_hint="[optional context]",
+        arguments=[],
+        user_invocable=True,
+        context="inline",
+        source="builtin",
+    ))
+
+    register_builtin_skill(SkillDef(
+        name="review",
+        description="Review code changes or a pull request and provide structured feedback",
+        triggers=["/review", "/review-pr"],
+        tools=["Bash", "Read", "Grep"],
+        prompt=_REVIEW_PROMPT,
+        file_path="<builtin>",
+        when_to_use="Use when the user wants a code review. Triggers: '/review', '/review-pr', 'review this PR'.",
+        argument_hint="[PR number or URL]",
+        arguments=["pr"],
+        user_invocable=True,
+        context="inline",
+        source="builtin",
+    ))
+
+
+_register_builtins()
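
Note: the built-ins above are registered in code, while `skill/loader.py` in this commit reads the same fields from markdown files with `---` frontmatter. The sketch below mirrors that parsing logic in a standalone form to show what such a file could look like. The `changelog` skill and the helper names `parse_frontmatter_sketch` and `parse_list_sketch` are made up for illustration.

```python
# A hypothetical skill file, as it might sit in .nano_claude/skills/changelog.md
SKILL_MD = """\
---
name: changelog
description: Draft a changelog entry
triggers: [/changelog, /cl]
allowed-tools: [Bash, Read]
context: fork
---
Summarize commits since the last tag. Extra context: $ARGUMENTS
"""


def parse_frontmatter_sketch(text):
    """Split '---' frontmatter from the prompt body; return (fields, body) or None."""
    if not text.startswith("---"):
        return None
    parts = text.split("---", 2)  # ['', frontmatter, body]
    if len(parts) < 3:
        return None
    fields = {}
    for line in parts[1].strip().splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, _, val = line.partition(":")  # only the first ':' separates key and value
        fields[key.strip().lower()] = val.strip()
    return fields, parts[2].strip()


def parse_list_sketch(value):
    """Parse '[a, b]' or 'a, b' into a list of stripped strings."""
    value = value.strip()
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]
    return [item.strip().strip('"').strip("'") for item in value.split(",") if item.strip()]


fields, body = parse_frontmatter_sketch(SKILL_MD)
triggers = parse_list_sketch(fields["triggers"])
```

Because the split uses `maxsplit=2`, a `---` rule inside the prompt body is preserved. A literal `---` inside a frontmatter value would end the frontmatter early, so values should avoid it.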
--- a/nano-claude-code/skill/executor.py
+++ b/nano-claude-code/skill/executor.py
@@ -0,0 +1,66 @@
+"""Skill execution: inline (current conversation) or forked (sub-agent)."""
+from __future__ import annotations
+
+from typing import Generator
+
+from .loader import SkillDef, substitute_arguments
+
+
+def execute_skill(
+    skill: SkillDef,
+    args: str,
+    state,
+    config: dict,
+    system_prompt: str,
+) -> Generator:
+    """Execute a skill.
+
+    If skill.context == "fork", runs as an isolated sub-agent and yields its events.
+    Otherwise (inline), injects the rendered prompt into the current agent loop.
+
+    Args:
+        skill: SkillDef to execute
+        args: raw argument string from user (after the trigger word)
+        state: AgentState
+        config: config dict (may contain _depth, model, etc.)
+        system_prompt: current system prompt string
+    Yields:
+        agent events (TextChunk, ToolStart, ToolEnd, TurnDone, …)
+    """
+    rendered = substitute_arguments(skill.prompt, args, skill.arguments)
+    message = f"[Skill: {skill.name}]\n\n{rendered}"
+
+    if skill.context == "fork":
+        yield from _execute_forked(skill, message, config, system_prompt)
+    else:
+        yield from _execute_inline(message, state, config, system_prompt)
+
+
+def _execute_inline(message: str, state, config: dict, system_prompt: str) -> Generator:
+    """Run skill prompt inline in the current conversation."""
+    import agent as _agent
+    yield from _agent.run(message, state, config, system_prompt)
+
+
+def _execute_forked(
+    skill: SkillDef,
+    message: str,
+    config: dict,
+    system_prompt: str,
+) -> Generator:
+    """Run skill as an isolated sub-agent (separate conversation context)."""
+    import agent as _agent
+
+    # Build a sub-agent config with depth tracking
+    depth = config.get("_depth", 0) + 1
+    sub_config = {**config, "_depth": depth, "_system_prompt": system_prompt}
+    if skill.model:
+        sub_config["model"] = skill.model
+
+    # Restrict tools if skill specifies allowed-tools
+    if skill.tools:
+        sub_config["_allowed_tools"] = skill.tools
+
+    # Run in fresh state (no shared history)
+    sub_state = _agent.AgentState()
+    yield from _agent.run(message, sub_state, sub_config, system_prompt)
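
Note: `_execute_forked` increments `_depth` but this file shows no upper bound on nesting. The sketch below isolates the config derivation as a pure function and adds a depth cap. `build_fork_config` and `MAX_DEPTH` are hypothetical names, and the cap is an assumption rather than behavior taken from this commit; the agent loop may already enforce a limit elsewhere.

```python
MAX_DEPTH = 3  # hypothetical limit, not taken from the diff


def build_fork_config(config, system_prompt, model="", tools=None):
    """Derive a sub-agent config without mutating the parent config."""
    depth = config.get("_depth", 0) + 1
    if depth > MAX_DEPTH:
        raise RecursionError(f"skill nesting depth {depth} exceeds {MAX_DEPTH}")
    sub_config = {**config, "_depth": depth, "_system_prompt": system_prompt}
    if model:
        sub_config["model"] = model  # per-skill model override
    if tools:
        sub_config["_allowed_tools"] = list(tools)  # restrict the sub-agent's tools
    return sub_config
```

Keeping the derivation separate from the generator makes the fork path testable without starting an agent run.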
--- a/nano-claude-code/skill/loader.py
+++ b/nano-claude-code/skill/loader.py
@@ -0,0 +1,184 @@
+"""Skill loading: parse markdown files with YAML frontmatter into SkillDef objects."""
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Optional
+
+
+@dataclass
+class SkillDef:
+    name: str
+    description: str
+    triggers: list[str]          # ["/commit", "commit changes"]
+    tools: list[str]             # ["Bash", "Read"]  (allowed-tools)
+    prompt: str                  # full prompt body after frontmatter
+    file_path: str
+    # Enhanced fields
+    when_to_use: str = ""        # when the model should auto-invoke this skill
+    argument_hint: str = ""      # e.g. "[branch] [description]"
+    arguments: list[str] = field(default_factory=list)  # named arg names
+    model: str = ""              # model override
+    user_invocable: bool = True  # appears in /skills list
+    context: str = "inline"      # "inline" or "fork" (fork = sub-agent)
+    source: str = "user"         # "user", "project", "builtin"
+
+
+# ── Directory paths ────────────────────────────────────────────────────────
+
+def _get_skill_paths() -> list[Path]:
+    return [
+        Path.cwd() / ".nano_claude" / "skills",   # project-level (priority)
+        Path.home() / ".nano_claude" / "skills",   # user-level
+    ]
+
+
+# ── List field parser ──────────────────────────────────────────────────────
+
+def _parse_list_field(value: str) -> list[str]:
+    """Parse YAML-like list: ``[a, b, c]`` or ``"a, b, c"``."""
+    value = value.strip()
+    if value.startswith("[") and value.endswith("]"):
+        value = value[1:-1]
+    return [item.strip().strip('"').strip("'") for item in value.split(",") if item.strip()]
+
+
+# ── Single-file parser ─────────────────────────────────────────────────────
+
+def _parse_skill_file(path: Path, source: str = "user") -> Optional[SkillDef]:
+    """Parse a markdown file with ``---`` frontmatter into a SkillDef.
+
+    Frontmatter fields:
+        name, description, triggers, tools / allowed-tools,
+        when_to_use, argument-hint, arguments, model,
+        user-invocable, context
+    """
+    try:
+        text = path.read_text(encoding="utf-8")
+    except Exception:
+        return None
+
+    if not text.startswith("---"):
+        return None
+
+    parts = text.split("---", 2)
+    if len(parts) < 3:
+        return None
+
+    frontmatter_raw = parts[1].strip()
+    prompt = parts[2].strip()
+
+    fields: dict[str, str] = {}
+    for line in frontmatter_raw.splitlines():
+        line = line.strip()
+        if not line or ":" not in line:
+            continue
+        key, _, val = line.partition(":")
+        fields[key.strip().lower()] = val.strip()
+
+    name = fields.get("name", "")
+    if not name:
+        return None
+
+    # allowed-tools wins over tools if present
+    tools_raw = fields.get("allowed-tools", fields.get("tools", ""))
+    tools = _parse_list_field(tools_raw) if tools_raw else []
+
+    triggers_raw = fields.get("triggers", "")
+    triggers = _parse_list_field(triggers_raw) if triggers_raw else [f"/{name}"]
+
+    arguments_raw = fields.get("arguments", "")
+    arguments = _parse_list_field(arguments_raw) if arguments_raw else []
+
+    user_invocable_raw = fields.get("user-invocable", "true")
+    user_invocable = user_invocable_raw.lower() not in ("false", "0", "no")
+
+    context = fields.get("context", "inline").strip().lower()
+    if context not in ("inline", "fork"):
+        context = "inline"
+
+    return SkillDef(
+        name=name,
+        description=fields.get("description", ""),
+        triggers=triggers,
+        tools=tools,
+        prompt=prompt,
+        file_path=str(path),
+        when_to_use=fields.get("when_to_use", ""),
+        argument_hint=fields.get("argument-hint", ""),
+        arguments=arguments,
+        model=fields.get("model", ""),
+        user_invocable=user_invocable,
+        context=context,
+        source=source,
+    )
+
+
+# ── Registry of built-in skills (registered by builtin.py) ────────────────
+
+_BUILTIN_SKILLS: list[SkillDef] = []
+
+
+def register_builtin_skill(skill: SkillDef) -> None:
+    _BUILTIN_SKILLS.append(skill)
+
+
+# ── Load all skills ────────────────────────────────────────────────────────
+
+def load_skills(include_builtins: bool = True) -> list[SkillDef]:
+    """Return skills from disk + builtins, deduplicated (project > user > builtin)."""
+    seen: dict[str, SkillDef] = {}
+
+    # Builtins go in first (lowest priority)
+    if include_builtins:
+        for sk in _BUILTIN_SKILLS:
+            seen[sk.name] = sk
+
+    # User-level next, project-level last (highest priority)
+    skill_paths = _get_skill_paths()
+    for i, skill_dir in enumerate(reversed(skill_paths)):
+        src = "user" if i == 0 else "project"
+        if not skill_dir.is_dir():
+            continue
+        for md_file in sorted(skill_dir.glob("*.md")):
+            skill = _parse_skill_file(md_file, source=src)
+            if skill:
+                seen[skill.name] = skill
+
+    return list(seen.values())
+
+
+def find_skill(query: str) -> Optional[SkillDef]:
+    """Find a skill with a trigger equal to the first word of query, or a multi-word trigger that begins with it."""
+    query = query.strip()
+    if not query:
+        return None
+
+    first_word = query.split()[0]
+    for skill in load_skills():
+        for trigger in skill.triggers:
+            if first_word == trigger:
+                return skill
+            if trigger.startswith(first_word + " "):
+                return skill
+    return None
+
+
+# ── Argument substitution ─────────────────────────────────────────────────
+
+def substitute_arguments(prompt: str, args: str, arg_names: list[str]) -> str:
+    """Replace $ARGUMENTS (whole args string) and $ARG_NAME placeholders.
+
+    Named args are positional: first word → first name, etc.
+    """
+    # Always substitute $ARGUMENTS
+    result = prompt.replace("$ARGUMENTS", args)
+
+    # Named args: split by whitespace
+    arg_values = args.split()
+    for i, arg_name in enumerate(arg_names):
+        placeholder = f"${arg_name.upper()}"
+        value = arg_values[i] if i < len(arg_values) else ""
+        result = result.replace(placeholder, value)
+
+    return result
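
Note: `substitute_arguments` uses plain `str.replace`, so a named placeholder that is a prefix of another (for example `$PR` and `$PROJECT`) can be replaced inside the longer one, and text substituted for `$ARGUMENTS` is scanned again for named placeholders. A possible boundary-aware variant is sketched below. `substitute_arguments_safe` is a hypothetical name, not part of this commit.

```python
import re

_PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute_arguments_safe(prompt: str, args: str, arg_names: list[str]) -> str:
    """Replace $ARGUMENTS and $ARG_NAME placeholders in a single regex pass.

    Named args are positional, as in the original: first word -> first name.
    Unknown placeholders are left untouched.
    """
    known = {name.upper() for name in arg_names}
    # zip() stops at the shorter sequence, so missing values simply have no entry.
    values = dict(zip((name.upper() for name in arg_names), args.split()))

    def repl(match: re.Match) -> str:
        key = match.group(1)
        if key == "ARGUMENTS":
            return args
        if key in known:
            return values.get(key, "")  # missing positional value -> empty string
        return match.group(0)

    return _PLACEHOLDER.sub(repl, prompt)
```

Each placeholder is matched as a whole identifier and the output is never rescanned, so user-supplied text containing `$` cannot trigger further substitutions.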
--- a/nano-claude-code/skill/tools.py
+++ b/nano-claude-code/skill/tools.py
@@ -0,0 +1,110 @@
+"""Skill tool: lets the model invoke skills by name via tool call."""
+from __future__ import annotations
+
+from tool_registry import ToolDef, register_tool
+from .loader import find_skill, load_skills, substitute_arguments
+
+
+_SKILL_SCHEMA = {
+    "name": "Skill",
+    "description": (
+        "Invoke a named skill (reusable prompt template). "
+        "Use SkillList to see available skills and their triggers."
+    ),
+    "input_schema": {
+        "type": "object",
+        "properties": {
+            "name": {
+                "type": "string",
+                "description": "Skill name (e.g. 'commit', 'review')",
+            },
+            "args": {
+                "type": "string",
+                "description": "Arguments to pass to the skill (replaces $ARGUMENTS)",
+                "default": "",
+            },
+        },
+        "required": ["name"],
+    },
+}
+
+_SKILL_LIST_SCHEMA = {
+    "name": "SkillList",
+    "description": "List all available skills with their names, triggers, and descriptions.",
+    "input_schema": {
+        "type": "object",
+        "properties": {},
+        "required": [],
+    },
+}
+
+
+def _skill_tool(params: dict, config: dict) -> str:
+    """Execute a skill by name and return its output."""
+    skill_name = params.get("name", "").strip()
+    args = params.get("args", "")
+
+    # Look up by name first, then by trigger
+    skill = None
+    for s in load_skills():
+        if s.name == skill_name:
+            skill = s
+            break
+    if skill is None:
+        skill = find_skill(skill_name)
+    if skill is None:
+        names = [s.name for s in load_skills()]
+        return f"Error: skill '{skill_name}' not found. Available: {', '.join(names)}"
+
+    rendered = substitute_arguments(skill.prompt, args, skill.arguments)
+    message = f"[Skill: {skill.name}]\n\n{rendered}"
+
+    # Run in a fresh AgentState (isolated from the caller's history) and collect text output
+    import agent as _agent
+    system_prompt = config.get("_system_prompt", "")
+
+    # Collect output text
+    output_parts: list[str] = []
+    sub_state = _agent.AgentState()
+    sub_config = {**config, "_depth": config.get("_depth", 0) + 1}
+    try:
+        for event in _agent.run(message, sub_state, sub_config, system_prompt):
+            if hasattr(event, "text"):
+                output_parts.append(event.text)
+    except Exception as e:
+        return f"Skill execution error: {e}"
+
+    return "".join(output_parts) or "(skill completed with no text output)"
+
+
+def _skill_list_tool(params: dict, config: dict) -> str:
+    skills = load_skills()
+    if not skills:
+        return "No skills available."
+    lines = ["Available skills:\n"]
+    for s in skills:
+        triggers = ", ".join(s.triggers)
+        hint = f"  args: {s.argument_hint}" if s.argument_hint else ""
+        when = f"\n    when: {s.when_to_use}" if s.when_to_use else ""
+        lines.append(f"- **{s.name}** [{triggers}]{hint}\n  {s.description}{when}")
+    return "\n".join(lines)
+
+
+def _register() -> None:
+    register_tool(ToolDef(
+        name="Skill",
+        schema=_SKILL_SCHEMA,
+        func=_skill_tool,
+        read_only=False,
+        concurrent_safe=False,
+    ))
+    register_tool(ToolDef(
+        name="SkillList",
+        schema=_SKILL_LIST_SCHEMA,
+        func=_skill_list_tool,
+        read_only=True,
+        concurrent_safe=True,
+    ))
+
+
+_register()
--- a/nano-claude-code/skills.py
+++ b/nano-claude-code/skills.py
@@ -0,0 +1,14 @@
+"""Backward-compatibility shim — real implementation is in skill/ package."""
+from skill.loader import (  # noqa: F401
+    SkillDef,
+    load_skills,
+    find_skill,
+    substitute_arguments,
+    _parse_skill_file,
+    _parse_list_field,
+)
+from skill.executor import execute_skill  # noqa: F401
+
+# Legacy constant — kept for tests that patch it
+from skill.loader import _get_skill_paths as _gsp
+SKILL_PATHS = _gsp()
--- a/nano-claude-code/subagent.py
+++ b/nano-claude-code/subagent.py
@@ -0,0 +1,11 @@
+"""Backward-compatibility shim — real implementation is in multi_agent/subagent.py."""
+from multi_agent.subagent import (  # noqa: F401
+    AgentDefinition,
+    SubAgentTask,
+    SubAgentManager,
+    load_agent_definitions,
+    get_agent_definition,
+    _extract_final_text,
+    _agent_run,
+    _BUILTIN_AGENTS,
+)
--- a/nano-claude-code/tests/__init__.py
+++ b/nano-claude-code/tests/__init__.py
--- a/nano-claude-code/tests/test_compaction.py
+++ b/nano-claude-code/tests/test_compaction.py
@@ -0,0 +1,187 @@
+"""Tests for compaction.py — token estimation, context limits, snipping, split point."""
+from __future__ import annotations
+
+import sys
+import os
+
+# Ensure project root is on sys.path
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
+
+from compaction import estimate_tokens, get_context_limit, snip_old_tool_results, find_split_point
+
+
+# ── estimate_tokens ───────────────────────────────────────────────────────
+
+class TestEstimateTokens:
+    def test_simple_messages(self):
+        msgs = [
+            {"role": "user", "content": "Hello world"},          # 11 chars
+            {"role": "assistant", "content": "Hi there!"},       # 9 chars
+        ]
+        result = estimate_tokens(msgs)
+        # (11 + 9) / 3.5 = 5.71 -> 5
+        assert result == int(20 / 3.5)
+
+    def test_empty_messages(self):
+        assert estimate_tokens([]) == 0
+
+    def test_empty_content(self):
+        msgs = [{"role": "user", "content": ""}]
+        assert estimate_tokens(msgs) == 0
+
+    def test_tool_result_messages(self):
+        msgs = [
+            {"role": "tool", "tool_call_id": "abc", "name": "Read", "content": "x" * 350},
+        ]
+        result = estimate_tokens(msgs)
+        assert result == int(350 / 3.5)
+
+    def test_structured_content(self):
+        """Content that is a list of dicts (e.g. Anthropic tool_result blocks)."""
+        msgs = [
+            {
+                "role": "user",
+                "content": [
+                    {"type": "tool_result", "tool_use_id": "id1", "content": "A" * 70},
+                ],
+            },
+        ]
+        result = estimate_tokens(msgs)
+        # "tool_result" (11) + "id1" (3) + "A"*70 (70) = 84  -> 84/3.5 = 24
+        assert result == int(84 / 3.5)
+
+    def test_with_tool_calls(self):
+        msgs = [
+            {
+                "role": "assistant",
+                "content": "ok",
+                "tool_calls": [
+                    {"id": "c1", "name": "Bash", "input": {"command": "ls"}},
+                ],
+            },
+        ]
+        result = estimate_tokens(msgs)
+        # content "ok" (2) + tool_calls string values: "c1" (2) + "Bash" (4) = 8
+        assert result == int(8 / 3.5)
+
+
+# ── get_context_limit ─────────────────────────────────────────────────────
+
+class TestGetContextLimit:
+    def test_anthropic(self):
+        assert get_context_limit("claude-opus-4-6") == 200000
+
+    def test_gemini(self):
+        assert get_context_limit("gemini-2.0-flash") == 1000000
+
+    def test_deepseek(self):
+        assert get_context_limit("deepseek-chat") == 64000
+
+    def test_openai(self):
+        assert get_context_limit("gpt-4o") == 128000
+
+    def test_qwen(self):
+        assert get_context_limit("qwen-max") == 1000000
+
+    def test_unknown_model_fallback(self):
+        # Unknown models fall back to openai provider which has 128000
+        assert get_context_limit("some-random-model-xyz") == 128000
+
+    def test_explicit_provider_prefix(self):
+        assert get_context_limit("ollama/llama3.3") == 128000
+
+
+# ── snip_old_tool_results ─────────────────────────────────────────────────
+
+class TestSnipOldToolResults:
+    def test_old_tool_results_get_truncated(self):
+        long_content = "A" * 5000
+        msgs = [
+            {"role": "user", "content": "hello"},
+            {"role": "assistant", "content": "let me check", "tool_calls": []},
+            {"role": "tool", "tool_call_id": "t1", "name": "Read", "content": long_content},
+            {"role": "user", "content": "thanks"},
+            {"role": "assistant", "content": "you're welcome"},
+            {"role": "user", "content": "bye"},
+            {"role": "assistant", "content": "goodbye"},
+            {"role": "user", "content": "wait"},
+            {"role": "assistant", "content": "yes?"},
+            {"role": "user", "content": "never mind"},
+        ]
+        result = snip_old_tool_results(msgs, max_chars=2000, preserve_last_n_turns=6)
+        assert result is msgs  # mutated in place
+        tool_msg = msgs[2]
+        assert len(tool_msg["content"]) < 5000
+        assert "snipped" in tool_msg["content"]
+
+    def test_recent_tool_results_preserved(self):
+        long_content = "B" * 5000
+        msgs = [
+            {"role": "user", "content": "hello"},
+            {"role": "assistant", "content": "ok", "tool_calls": []},
+            {"role": "tool", "tool_call_id": "t1", "name": "Read", "content": long_content},
+        ]
+        # All 3 messages are within preserve_last_n_turns=6
+        result = snip_old_tool_results(msgs, max_chars=2000, preserve_last_n_turns=6)
+        assert msgs[2]["content"] == long_content  # not truncated
+
+    def test_short_tool_results_not_touched(self):
+        msgs = [
+            {"role": "tool", "tool_call_id": "t1", "name": "Bash", "content": "short"},
+            {"role": "user", "content": "a"},
+            {"role": "user", "content": "b"},
+            {"role": "user", "content": "c"},
+            {"role": "user", "content": "d"},
+            {"role": "user", "content": "e"},
+            {"role": "user", "content": "f"},
+        ]
+        snip_old_tool_results(msgs, max_chars=2000, preserve_last_n_turns=6)
+        assert msgs[0]["content"] == "short"
+
+    def test_non_tool_messages_untouched(self):
+        msgs = [
+            {"role": "user", "content": "X" * 5000},
+            {"role": "user", "content": "a"},
+            {"role": "user", "content": "b"},
+            {"role": "user", "content": "c"},
+            {"role": "user", "content": "d"},
+            {"role": "user", "content": "e"},
+            {"role": "user", "content": "f"},
+        ]
+        snip_old_tool_results(msgs, max_chars=2000, preserve_last_n_turns=6)
+        assert msgs[0]["content"] == "X" * 5000
+
+
+# ── find_split_point ──────────────────────────────────────────────────────
+
+class TestFindSplitPoint:
+    def test_returns_reasonable_index(self):
+        msgs = [
+            {"role": "user", "content": "A" * 1000},
+            {"role": "assistant", "content": "B" * 1000},
+            {"role": "user", "content": "C" * 1000},
+            {"role": "assistant", "content": "D" * 1000},
+            {"role": "user", "content": "E" * 1000},
+        ]
+        idx = find_split_point(msgs, keep_ratio=0.3)
+        # With equal-size messages and keep_ratio=0.3, split should be around index 3-4
+        assert 2 <= idx <= 4
+
+    def test_single_message(self):
+        msgs = [{"role": "user", "content": "hello"}]
+        idx = find_split_point(msgs, keep_ratio=0.3)
+        assert idx == 0
+
+    def test_empty_messages(self):
+        idx = find_split_point([], keep_ratio=0.3)
+        assert idx == 0
+
+    def test_split_preserves_recent(self):
+        # Recent portion should contain ~30% of tokens
+        msgs = [{"role": "user", "content": "X" * 100} for _ in range(10)]
+        idx = find_split_point(msgs, keep_ratio=0.3)
+        total = estimate_tokens(msgs)
+        recent = estimate_tokens(msgs[idx:])
+        # Recent should be roughly 30% of total (allow some tolerance)
+        assert recent >= total * 0.2
+        assert recent <= total * 0.5
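
Note: `compaction.py` itself is outside this excerpt. The sketch below is one estimator that would satisfy the `TestEstimateTokens` assertions above, inferred only from those tests: roughly 3.5 characters per token, counting message content plus top-level string values of structured blocks and tool calls. It is an assumption about the behavior, not the actual implementation, and `estimate_tokens_sketch` is a hypothetical name.

```python
CHARS_PER_TOKEN = 3.5  # ratio implied by the test expectations


def _string_chars(mapping: dict) -> int:
    """Total length of the top-level string values in a dict (nested values ignored)."""
    return sum(len(v) for v in mapping.values() if isinstance(v, str))


def estimate_tokens_sketch(messages: list) -> int:
    total_chars = 0
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, str):
            total_chars += len(content)
        elif isinstance(content, list):
            # Structured content, e.g. Anthropic-style tool_result blocks
            total_chars += sum(_string_chars(b) for b in content if isinstance(b, dict))
        for tc in msg.get("tool_calls", []):
            total_chars += _string_chars(tc)  # counts id and name, not the nested input
    return int(total_chars / CHARS_PER_TOKEN)
```

A character-ratio estimate is cheap and provider-agnostic, at the cost of drifting for code-heavy or non-Latin text, so any compaction threshold built on it should leave headroom below the real context limit.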
--- a/nano-claude-code/tests/test_diff_view.py
+++ b/nano-claude-code/tests/test_diff_view.py
@@ -0,0 +1,50 @@
+import sys, os, tempfile
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+import pytest
+
+def test_generate_unified_diff():
+    from tools import generate_unified_diff
+    old = "line1\nline2\nline3\n"
+    new = "line1\nline2_modified\nline3\n"
+    diff = generate_unified_diff(old, new, "test.py")
+    assert "--- a/test.py" in diff
+    assert "+++ b/test.py" in diff
+    assert "-line2" in diff
+    assert "+line2_modified" in diff
+
+def test_generate_unified_diff_empty_old():
+    from tools import generate_unified_diff
+    diff = generate_unified_diff("", "new content\n", "test.py")
+    assert "+new content" in diff
+
+def test_edit_returns_diff(tmp_path):
+    from tools import _edit
+    f = tmp_path / "test.txt"
+    f.write_text("hello world\n")
+    result = _edit(str(f), "hello", "goodbye")
+    assert "-hello world" in result
+    assert "+goodbye world" in result
+
+def test_write_existing_returns_diff(tmp_path):
+    from tools import _write
+    f = tmp_path / "test.txt"
+    f.write_text("old content\n")
+    result = _write(str(f), "new content\n")
+    assert "-old content" in result
+    assert "+new content" in result
+
+def test_write_new_file_no_diff(tmp_path):
+    from tools import _write
+    f = tmp_path / "new.txt"
+    result = _write(str(f), "content\n")
+    assert "Created" in result
+    assert "---" not in result
+
+def test_diff_truncation():
+    from tools import generate_unified_diff, maybe_truncate_diff
+    old = "\n".join(f"line{i}" for i in range(200))
+    new = "\n".join(f"CHANGED{i}" for i in range(200))
+    diff = generate_unified_diff(old, new, "big.py")
+    truncated = maybe_truncate_diff(diff, max_lines=50)
+    assert "more lines" in truncated
+    assert truncated.count("\n") < 60
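The diff-view tests above pin down two behaviours: the unified diff carries `a/` and `b/` filename prefixes, and long diffs are cut off with a "more lines" marker. A minimal standalone sketch of both, using only the standard library's `difflib`, might look like this. The names `make_diff` and `truncate_lines` are hypothetical and chosen for illustration; they are not the helpers in `tools.py`.

```python
import difflib


def make_diff(old: str, new: str, filename: str, context_lines: int = 3) -> str:
    """Return a unified diff of two strings (illustrative helper)."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
        n=context_lines,
    )
    return "".join(diff)


def truncate_lines(text: str, max_lines: int = 80) -> str:
    """Keep the first max_lines lines and note how many were dropped."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    remaining = len(lines) - max_lines
    return "\n".join(lines[:max_lines]) + f"\n\n[... {remaining} more lines ...]"


if __name__ == "__main__":
    print(make_diff("line1\nline2\nline3\n", "line1\nline2_modified\nline3\n", "test.py"))
```

Identical inputs produce an empty string, which is what lets a caller report "no changes" instead of printing an empty diff.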
--- a/nano-claude-code/tests/test_memory.py
+++ b/nano-claude-code/tests/test_memory.py
@@ -0,0 +1,275 @@
+"""Tests for the memory package (memory/)."""
+import pytest
+from pathlib import Path
+
+import memory.store as _store
+from memory.store import (
+    MemoryEntry,
+    save_memory,
+    load_index,
+    load_entries,
+    delete_memory,
+    search_memory,
+    _slugify,
+    parse_frontmatter,
+    get_index_content,
+)
+from memory.context import get_memory_context, truncate_index_content
+from memory.scan import (
+    scan_memory_dir,
+    format_memory_manifest,
+    memory_age_days,
+    memory_age_str,
+    memory_freshness_text,
+    MemoryHeader,
+)
+from memory.types import MEMORY_TYPES
+
+
+# ── Fixtures ─────────────────────────────────────────────────────────────
+
+@pytest.fixture(autouse=True)
+def redirect_memory_dirs(tmp_path, monkeypatch):
+    """Redirect user and project memory dirs to tmp_path for all tests."""
+    user_mem = tmp_path / "user_memory"
+    user_mem.mkdir()
+    proj_mem = tmp_path / "project_memory"
+    proj_mem.mkdir()
+
+    monkeypatch.setattr(_store, "USER_MEMORY_DIR", user_mem)
+
+    # Patch get_project_memory_dir to return our tmp project dir
+    monkeypatch.setattr(_store, "get_project_memory_dir", lambda: proj_mem)
+
+
+def _make_entry(name="test note", description="a test", type_="user",
+                content="hello world", scope="user"):
+    return MemoryEntry(
+        name=name, description=description, type=type_,
+        content=content, created="2026-04-02", scope=scope,
+    )
+
+
+# ── Save and Load ─────────────────────────────────────────────────────────
+
+class TestSaveAndLoad:
+    def test_roundtrip(self):
+        entry = _make_entry()
+        save_memory(entry, scope="user")
+        loaded = load_entries("user")
+        assert len(loaded) == 1
+        assert loaded[0].name == "test note"
+        assert loaded[0].description == "a test"
+        assert loaded[0].type == "user"
+        assert loaded[0].content == "hello world"
+
+    def test_creates_file_on_disk(self):
+        entry = _make_entry()
+        save_memory(entry, scope="user")
+        assert Path(entry.file_path).exists()
+        text = Path(entry.file_path).read_text()
+        assert "hello world" in text
+
+    def test_update_existing(self):
+        """Save same name twice → only 1 entry with updated content."""
+        save_memory(_make_entry(content="version 1"), scope="user")
+        save_memory(_make_entry(content="version 2"), scope="user")
+        loaded = load_entries("user")
+        assert len(loaded) == 1
+        assert loaded[0].content == "version 2"
+
+    def test_project_scope_stored_separately(self):
+        save_memory(_make_entry(name="user note"), scope="user")
+        save_memory(_make_entry(name="proj note"), scope="project")
+        user_entries = load_entries("user")
+        proj_entries = load_entries("project")
+        assert len(user_entries) == 1
+        assert len(proj_entries) == 1
+        assert user_entries[0].name == "user note"
+        assert proj_entries[0].name == "proj note"
+
+    def test_load_index_all_combines_scopes(self):
+        save_memory(_make_entry(name="user note"), scope="user")
+        save_memory(_make_entry(name="proj note"), scope="project")
+        all_entries = load_index("all")
+        names = {e.name for e in all_entries}
+        assert "user note" in names
+        assert "proj note" in names
+
+
+# ── Delete ────────────────────────────────────────────────────────────────
+
+class TestDelete:
+    def test_delete_removes_file_and_index(self):
+        entry = _make_entry()
+        save_memory(entry, scope="user")
+        delete_memory("test note", scope="user")
+        assert load_entries("user") == []
+        assert not Path(entry.file_path).exists()
+
+    def test_delete_nonexistent_no_error(self):
+        delete_memory("nonexistent", scope="user")
+
+    def test_delete_from_project_scope(self):
+        save_memory(_make_entry(name="proj note"), scope="project")
+        delete_memory("proj note", scope="project")
+        assert load_entries("project") == []
+
+
+# ── Search ────────────────────────────────────────────────────────────────
+
+class TestSearch:
+    def test_search_by_keyword(self):
+        save_memory(_make_entry(name="python tips", content="use list comprehension"), scope="user")
+        save_memory(_make_entry(name="rust tips", content="use iterators"), scope="user")
+        results = search_memory("python")
+        assert len(results) == 1
+        assert results[0].name == "python tips"
+
+    def test_search_case_insensitive(self):
+        save_memory(_make_entry(name="Important Note", content="something"), scope="user")
+        results = search_memory("important")
+        assert len(results) == 1
+
+    def test_search_in_content(self):
+        save_memory(_make_entry(name="misc", content="the quick brown fox"), scope="user")
+        results = search_memory("brown fox")
+        assert len(results) == 1
+
+    def test_search_across_scopes(self):
+        save_memory(_make_entry(name="user note", content="alpha"), scope="user")
+        save_memory(_make_entry(name="proj note", content="alpha"), scope="project")
+        results = search_memory("alpha", scope="all")
+        assert len(results) == 2
+
+
+# ── Memory context ────────────────────────────────────────────────────────
+
+class TestGetMemoryContext:
+    def test_returns_index_text(self):
+        save_memory(_make_entry(name="my note", description="desc here"), scope="user")
+        ctx = get_memory_context()
+        assert "my note" in ctx
+        assert "desc here" in ctx
+
+    def test_empty_when_no_memories(self):
+        ctx = get_memory_context()
+        assert ctx == ""
+
+    def test_project_memories_labelled(self):
+        save_memory(_make_entry(name="proj note", description="project context"), scope="project")
+        ctx = get_memory_context()
+        assert "Project memories" in ctx
+        assert "proj note" in ctx
+
+
+# ── Truncation ────────────────────────────────────────────────────────────
+
+class TestTruncation:
+    def test_no_truncation_within_limits(self):
+        text = "- line\n" * 10
+        result = truncate_index_content(text)
+        assert "WARNING" not in result
+
+    def test_line_truncation(self):
+        text = "\n".join(f"- line {i}" for i in range(300))
+        result = truncate_index_content(text)
+        assert "WARNING" in result
+        assert "lines" in result
+
+    def test_byte_truncation(self):
+        # 25001 bytes of content
+        text = "x" * 25001
+        result = truncate_index_content(text)
+        assert "WARNING" in result
+
+
+# ── Slugify ───────────────────────────────────────────────────────────────
+
+class TestSlugify:
+    def test_basic(self):
+        assert _slugify("Hello World") == "hello_world"
+
+    def test_special_chars(self):
+        assert _slugify("foo@bar!baz") == "foobarbaz"
+
+    def test_max_length(self):
+        assert len(_slugify("a" * 100)) == 60
+
+
+# ── parse_frontmatter ─────────────────────────────────────────────────────
+
+class TestParseFrontmatter:
+    def test_parse(self):
+        text = "---\nname: foo\ntype: user\n---\nbody text"
+        meta, body = parse_frontmatter(text)
+        assert meta["name"] == "foo"
+        assert meta["type"] == "user"
+        assert body == "body text"
+
+    def test_no_frontmatter(self):
+        meta, body = parse_frontmatter("just plain text")
+        assert meta == {}
+        assert body == "just plain text"
+
+
+# ── scan / age / freshness ────────────────────────────────────────────────
+
+class TestScanAndAge:
+    def test_scan_memory_dir(self):
+        save_memory(_make_entry(name="note a"), scope="user")
+        save_memory(_make_entry(name="note b"), scope="user")
+        user_dir = _store.USER_MEMORY_DIR
+        headers = scan_memory_dir(user_dir, "user")
+        assert len(headers) == 2
+        assert all(isinstance(h, MemoryHeader) for h in headers)
+
+    def test_format_manifest(self):
+        import time
+        headers = [
+            MemoryHeader(
+                filename="foo.md",
+                file_path="/tmp/foo.md",
+                mtime_s=time.time(),
+                description="test desc",
+                type="user",
+                scope="user",
+            )
+        ]
+        manifest = format_memory_manifest(headers)
+        assert "foo.md" in manifest
+        assert "test desc" in manifest
+        assert "today" in manifest
+
+    def test_memory_age_days_today(self):
+        import time
+        assert memory_age_days(time.time()) == 0
+
+    def test_memory_age_days_old(self):
+        import time
+        old = time.time() - 5 * 86400  # 5 days ago
+        assert memory_age_days(old) == 5
+
+    def test_memory_age_str(self):
+        import time
+        assert memory_age_str(time.time()) == "today"
+        assert memory_age_str(time.time() - 86400) == "yesterday"
+        assert memory_age_str(time.time() - 3 * 86400) == "3 days ago"
+
+    def test_freshness_text_fresh(self):
+        import time
+        assert memory_freshness_text(time.time()) == ""
+
+    def test_freshness_text_stale(self):
+        import time
+        old = time.time() - 10 * 86400
+        text = memory_freshness_text(old)
+        assert "10 days old" in text
+        assert "stale" in text.lower() or "outdated" in text.lower()
+
+
+# ── Memory types ──────────────────────────────────────────────────────────
+
+class TestMemoryTypes:
+    def test_types_list(self):
+        assert set(MEMORY_TYPES) == {"user", "feedback", "project", "reference"}
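The `TestParseFrontmatter` cases above expect a `---`-delimited block of `key: value` lines to be split from the body, and plain text to come back with an empty metadata dict. The implementation in `memory/store.py` is not shown in this part of the diff, so the following is only a sketch of one way to satisfy those two cases; `split_frontmatter` is a hypothetical name.

```python
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split simple 'key: value' frontmatter from the body (illustrative)."""
    lines = text.split("\n")
    if lines[0].strip() != "---":
        return {}, text  # no opening delimiter: everything is body
    try:
        end = lines.index("---", 1)  # position of the closing delimiter
    except ValueError:
        return {}, text  # unterminated frontmatter: treat as plain text
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return meta, body


if __name__ == "__main__":
    print(split_frontmatter("---\nname: foo\ntype: user\n---\nbody text"))
```

This sketch deliberately handles only flat string values; list-valued fields such as `triggers: [/commit, commit changes]` would still need a second parsing step.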
--- a/nano-claude-code/tests/test_skills.py
+++ b/nano-claude-code/tests/test_skills.py
@@ -0,0 +1,234 @@
+from __future__ import annotations
+
+import pytest
+from pathlib import Path
+
+import skill.loader as _loader
+from skill.loader import _parse_skill_file, _parse_list_field, find_skill, SkillDef
+from skill import load_skills, substitute_arguments
+
+
+COMMIT_MD = """\
+---
+name: commit
+description: Create a git commit
+triggers: [/commit, commit changes]
+tools: [Bash, Read]
+---
+Review staged changes and create a commit with a descriptive message.
+"""
+
+REVIEW_MD = """\
+---
+name: review
+description: Review a pull request
+triggers: [/review, /review-pr]
+tools: [Bash, Read, Grep]
+---
+Analyze the PR diff and provide constructive feedback.
+"""
+
+ARGS_MD = """\
+---
+name: deploy
+description: Deploy to an environment
+triggers: [/deploy]
+tools: [Bash]
+argument-hint: [env] [version]
+arguments: [env, version]
+---
+Deploy $VERSION to $ENV environment. Full args: $ARGUMENTS
+"""
+
+
+@pytest.fixture()
+def skill_dir(tmp_path, monkeypatch):
+    """Create a temp skill directory with sample skills and patch _get_skill_paths."""
+    skills_dir = tmp_path / "skills"
+    skills_dir.mkdir()
+    (skills_dir / "commit.md").write_text(COMMIT_MD, encoding="utf-8")
+    (skills_dir / "review.md").write_text(REVIEW_MD, encoding="utf-8")
+
+    monkeypatch.setattr(_loader, "_get_skill_paths", lambda: [skills_dir])
+    # Also patch the builtin list to be empty so tests are predictable
+    monkeypatch.setattr(_loader, "_BUILTIN_SKILLS", [])
+    return skills_dir
+
+
+# ------------------------------------------------------------------
+# _parse_list_field
+# ------------------------------------------------------------------
+
+def test_parse_list_field_bracket():
+    assert _parse_list_field("[a, b, c]") == ["a", "b", "c"]
+
+
+def test_parse_list_field_plain():
+    assert _parse_list_field("a, b, c") == ["a", "b", "c"]
+
+
+def test_parse_list_field_single():
+    assert _parse_list_field("solo") == ["solo"]
+
+
+# ------------------------------------------------------------------
+# _parse_skill_file
+# ------------------------------------------------------------------
+
+def test_parse_skill_file(skill_dir):
+    path = skill_dir / "commit.md"
+    skill = _parse_skill_file(path)
+    assert skill is not None
+    assert skill.name == "commit"
+    assert skill.description == "Create a git commit"
+    assert "/commit" in skill.triggers
+    assert "commit changes" in skill.triggers
+    assert "Bash" in skill.tools
+    assert "Read" in skill.tools
+    assert "commit" in skill.prompt.lower()
+    assert skill.file_path == str(path)
+
+
+def test_parse_skill_file_review(skill_dir):
+    path = skill_dir / "review.md"
+    skill = _parse_skill_file(path)
+    assert skill is not None
+    assert skill.name == "review"
+    assert "/review" in skill.triggers
+    assert "/review-pr" in skill.triggers
+
+
+def test_parse_skill_file_invalid(tmp_path):
+    bad = tmp_path / "bad.md"
+    bad.write_text("no frontmatter here", encoding="utf-8")
+    assert _parse_skill_file(bad) is None
+
+
+def test_parse_skill_file_no_name(tmp_path):
+    no_name = tmp_path / "noname.md"
+    no_name.write_text("---\ndescription: test\n---\nbody\n", encoding="utf-8")
+    assert _parse_skill_file(no_name) is None
+
+
+def test_parse_skill_file_context_fork(tmp_path):
+    fork_md = tmp_path / "fork.md"
+    fork_md.write_text("---\nname: fork-task\ndescription: test\ncontext: fork\n---\nbody\n")
+    skill = _parse_skill_file(fork_md)
+    assert skill is not None
+    assert skill.context == "fork"
+
+
+def test_parse_skill_file_allowed_tools(tmp_path):
+    md = tmp_path / "t.md"
+    md.write_text("---\nname: myskill\ndescription: d\nallowed-tools: [Bash, Read]\n---\nbody\n")
+    skill = _parse_skill_file(md)
+    assert skill is not None
+    assert "Bash" in skill.tools
+    assert "Read" in skill.tools
+
+
+# ------------------------------------------------------------------
+# load_skills
+# ------------------------------------------------------------------
+
+def test_load_skills(skill_dir):
+    skills = load_skills()
+    assert len(skills) == 2
+    names = {s.name for s in skills}
+    assert names == {"commit", "review"}
+
+
+def test_load_skills_empty_dir(tmp_path, monkeypatch):
+    empty = tmp_path / "empty_skills"
+    empty.mkdir()
+    monkeypatch.setattr(_loader, "_get_skill_paths", lambda: [empty])
+    monkeypatch.setattr(_loader, "_BUILTIN_SKILLS", [])
+    assert load_skills() == []
+
+
+def test_load_skills_nonexistent_dir(tmp_path, monkeypatch):
+    monkeypatch.setattr(_loader, "_get_skill_paths", lambda: [tmp_path / "does_not_exist"])
+    monkeypatch.setattr(_loader, "_BUILTIN_SKILLS", [])
+    assert load_skills() == []
+
+
+def test_load_skills_builtins_present(monkeypatch):
+    """Without patching, builtins (commit, review) should be present."""
+    monkeypatch.setattr(_loader, "_get_skill_paths", lambda: [])
+    skills = load_skills()
+    names = {s.name for s in skills}
+    assert "commit" in names
+    assert "review" in names
+
+
+def test_load_skills_project_overrides_builtin(tmp_path, monkeypatch):
+    """A project skill with the same name overrides the builtin."""
+    skills_dir = tmp_path / "skills"
+    skills_dir.mkdir()
+    # project-level "commit" with different description
+    (skills_dir / "commit.md").write_text(
+        "---\nname: commit\ndescription: OVERRIDDEN\n---\ncustom commit prompt\n"
+    )
+    monkeypatch.setattr(_loader, "_get_skill_paths", lambda: [skills_dir])
+    skills = load_skills()
+    commit = next(s for s in skills if s.name == "commit")
+    assert commit.description == "OVERRIDDEN"
+
+
+# ------------------------------------------------------------------
+# find_skill
+# ------------------------------------------------------------------
+
+def test_find_skill_commit(skill_dir):
+    skill = find_skill("/commit")
+    assert skill is not None
+    assert skill.name == "commit"
+
+
+def test_find_skill_review(skill_dir):
+    skill = find_skill("/review")
+    assert skill is not None
+    assert skill.name == "review"
+
+
+def test_find_skill_review_pr(skill_dir):
+    skill = find_skill("/review-pr some-pr-url")
+    assert skill is not None
+    assert skill.name == "review"
+
+
+def test_find_skill_nonexistent(skill_dir):
+    result = find_skill("/nonexistent")
+    assert result is None
+
+
+# ------------------------------------------------------------------
+# substitute_arguments
+# ------------------------------------------------------------------
+
+def test_substitute_arguments_placeholder():
+    result = substitute_arguments("Deploy $ARGUMENTS please", "v1.2 prod", [])
+    assert result == "Deploy v1.2 prod please"
+
+
+def test_substitute_named_args():
+    result = substitute_arguments(
+        "Deploy $VERSION to $ENV. Full args: $ARGUMENTS",
+        "1.0 staging",
+        ["env", "version"],
+    )
+    # arg_names are positional: env=1.0, version=staging
+    assert "$VERSION" not in result
+    assert "$ENV" not in result
+    assert "$ARGUMENTS" not in result
+
+
+def test_substitute_missing_arg():
+    # If user provides fewer args than named slots, missing ones become ""
+    result = substitute_arguments("Hello $NAME!", "", ["name"])
+    assert result == "Hello !"
+
+
+def test_substitute_no_placeholders():
+    result = substitute_arguments("just a plain prompt", "some args", [])
+    assert result == "just a plain prompt"
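The `substitute_arguments` tests above describe three rules: `$ARGUMENTS` expands to the full argument string, named arguments are bound positionally and referenced in upper case, and a missing argument becomes an empty string. The real function lives in the `skill` package and is not part of this excerpt. The sketch below is one assumed way to get the same results, under the hypothetical name `fill_placeholders`.

```python
import re
from typing import Dict, List


def fill_placeholders(prompt: str, args: str, arg_names: List[str]) -> str:
    """Expand $ARGUMENTS and positional $NAME placeholders (illustrative)."""
    values = args.split()
    mapping: Dict[str, str] = {"ARGUMENTS": args}
    for i, name in enumerate(arg_names):
        # Missing positional arguments fall back to the empty string.
        mapping[name.upper()] = values[i] if i < len(values) else ""

    def repl(match: "re.Match[str]") -> str:
        # Unknown placeholders are left untouched.
        return mapping.get(match.group(1), match.group(0))

    # A single regex pass avoids re-expanding text that came from the arguments.
    return re.sub(r"\$([A-Z_][A-Z0-9_]*)", repl, prompt)


if __name__ == "__main__":
    print(fill_placeholders("Deploy $VERSION to $ENV.", "1.0 staging", ["env", "version"]))
```

Matching whole identifiers with a regex, rather than chaining `str.replace` calls, also keeps `$ENV` from clobbering the start of a longer placeholder such as `$ENVIRONMENT`.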
--- a/nano-claude-code/tests/test_subagent.py
+++ b/nano-claude-code/tests/test_subagent.py
@@ -0,0 +1,136 @@
+"""Tests for the sub-agent system (multi_agent/subagent.py)."""
+import time
+import threading
+
+import pytest
+
+from multi_agent.subagent import SubAgentManager, SubAgentTask, _extract_final_text
+
+
+# ── Mock for _agent_run ──────────────────────────────────────────────────
+
+def _make_mock_agent_run(sleep_per_iter=0.05, iters=3):
+    """Return a mock _agent_run that simulates work and checks cancellation."""
+
+    def mock_agent_run(prompt, state, config, system_prompt, depth=0, cancel_check=None):
+        for i in range(iters):
+            if cancel_check and cancel_check():
+                return
+            time.sleep(sleep_per_iter)
+        # Append an assistant message to state
+        state.messages.append({
+            "role": "assistant",
+            "content": f"Result for: {prompt}",
+            "tool_calls": [],
+        })
+        # Yield a TurnDone-like event (generator protocol)
+        yield None
+
+    return mock_agent_run
+
+
+def _make_slow_mock(sleep_per_iter=0.2, iters=10):
+    """Return a slow mock for cancellation testing."""
+    return _make_mock_agent_run(sleep_per_iter=sleep_per_iter, iters=iters)
+
+
+@pytest.fixture
+def manager(monkeypatch):
+    """Create a SubAgentManager with mocked _agent_run."""
+    mock = _make_mock_agent_run()
+    monkeypatch.setattr("multi_agent.subagent._agent_run", mock)
+    mgr = SubAgentManager(max_concurrent=3, max_depth=3)
+    yield mgr
+    mgr.shutdown()
+
+
+@pytest.fixture
+def slow_manager(monkeypatch):
+    """Create a SubAgentManager with a slow mock for cancel testing."""
+    mock = _make_slow_mock()
+    monkeypatch.setattr("multi_agent.subagent._agent_run", mock)
+    mgr = SubAgentManager(max_concurrent=3, max_depth=3)
+    yield mgr
+    mgr.shutdown()
+
+
+# ── Tests ────────────────────────────────────────────────────────────────
+
+class TestSpawnAndWait:
+    def test_spawn_and_wait_completes(self, manager):
+        task = manager.spawn("hello", {}, "system")
+        result_task = manager.wait(task.id, timeout=5)
+        assert result_task is not None
+        assert result_task.status == "completed"
+        assert result_task.result == "Result for: hello"
+
+    def test_spawn_returns_immediately(self, manager):
+        task = manager.spawn("hello", {}, "system")
+        # Task should be pending or running, not yet completed
+        assert task.status in ("pending", "running")
+
+
+class TestListTasks:
+    def test_list_tasks(self, manager):
+        t1 = manager.spawn("task1", {}, "system")
+        t2 = manager.spawn("task2", {}, "system")
+        tasks = manager.list_tasks()
+        task_ids = [t.id for t in tasks]
+        assert t1.id in task_ids
+        assert t2.id in task_ids
+        assert len(tasks) == 2
+
+
+class TestCancel:
+    def test_cancel_running_task(self, slow_manager):
+        task = slow_manager.spawn("slow task", {}, "system")
+        # Wait briefly to ensure the task starts running
+        time.sleep(0.1)
+        assert task.status == "running"
+        success = slow_manager.cancel(task.id)
+        assert success is True
+        # Wait for the task to actually finish
+        slow_manager.wait(task.id, timeout=5)
+        assert task.status == "cancelled"
+
+
+class TestDepthLimit:
+    def test_spawn_at_max_depth_fails(self, manager):
+        task = manager.spawn("deep", {}, "system", depth=3)
+        assert task.status == "failed"
+        assert "Max depth" in task.result
+
+
+class TestGetResult:
+    def test_get_result_completed(self, manager):
+        task = manager.spawn("hello", {}, "system")
+        manager.wait(task.id, timeout=5)
+        result = manager.get_result(task.id)
+        assert result == "Result for: hello"
+
+    def test_get_result_unknown_id(self, manager):
+        result = manager.get_result("nonexistent_id")
+        assert result is None
+
+
+class TestExtractFinalText:
+    def test_extracts_last_assistant(self):
+        messages = [
+            {"role": "user", "content": "hi"},
+            {"role": "assistant", "content": "first"},
+            {"role": "user", "content": "more"},
+            {"role": "assistant", "content": "second"},
+        ]
+        assert _extract_final_text(messages) == "second"
+
+    def test_returns_none_for_empty(self):
+        assert _extract_final_text([]) is None
+
+    def test_returns_none_no_assistant(self):
+        messages = [{"role": "user", "content": "hi"}]
+        assert _extract_final_text(messages) is None
+
+
+class TestWaitUnknown:
+    def test_wait_unknown_returns_none(self, manager):
+        assert manager.wait("nonexistent") is None
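The mock `_agent_run` above relies on cooperative cancellation: the worker polls a `cancel_check` callable between units of work and returns early once it reports true. A minimal sketch of that pattern with a `threading.Event` follows. `run_steps` and `run_in_thread` are hypothetical helpers; how `SubAgentManager` actually wires up its cancel flag is not shown in this excerpt.

```python
import threading
from typing import Callable, List, Optional


def run_steps(steps: int, cancel_check: Optional[Callable[[], bool]] = None) -> str:
    """Do `steps` units of work, polling cancel_check before each unit."""
    for _ in range(steps):
        if cancel_check and cancel_check():
            return "cancelled"
        # ... one unit of real work would go here ...
    return "completed"


def run_in_thread(steps: int, cancel_event: threading.Event) -> str:
    """Run run_steps on a worker thread that shares cancel_event with the caller."""
    result: List[str] = []
    worker = threading.Thread(
        target=lambda: result.append(run_steps(steps, cancel_event.is_set))
    )
    worker.start()
    worker.join()
    return result[0]


if __name__ == "__main__":
    event = threading.Event()
    print(run_in_thread(3, event))
    event.set()
    print(run_in_thread(3, event))
```

Because the worker only notices the flag at its next poll, cancellation latency is bounded by the length of one unit of work. That is also why a test that cancels a running task has to wait for the task to finish before checking its status.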
--- a/nano-claude-code/tests/test_tool_registry.py
+++ b/nano-claude-code/tests/test_tool_registry.py
@@ -0,0 +1,160 @@
+from __future__ import annotations
+
+import pytest
+
+from tool_registry import (
+    ToolDef,
+    clear_registry,
+    execute_tool,
+    get_all_tools,
+    get_tool,
+    get_tool_schemas,
+    register_tool,
+)
+
+
+@pytest.fixture(autouse=True)
+def _clean_registry():
+    """Reset registry before each test."""
+    clear_registry()
+    yield
+    clear_registry()
+
+
+def _make_echo_tool(name: str = "echo", read_only: bool = False) -> ToolDef:
+    """Helper to build a simple echo tool."""
+    schema = {
+        "name": name,
+        "description": f"Echo tool ({name})",
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "text": {"type": "string", "description": "text to echo"},
+            },
+            "required": ["text"],
+        },
+    }
+
+    def func(params: dict, config: dict) -> str:
+        return params["text"]
+
+    return ToolDef(
+        name=name,
+        schema=schema,
+        func=func,
+        read_only=read_only,
+        concurrent_safe=True,
+    )
+
+
+# ------------------------------------------------------------------
+# register and get
+# ------------------------------------------------------------------
+
+def test_register_and_get():
+    tool = _make_echo_tool()
+    register_tool(tool)
+    result = get_tool("echo")
+    assert result is not None
+    assert result.name == "echo"
+
+
+def test_get_unknown_returns_none():
+    assert get_tool("no_such_tool") is None
+
+
+# ------------------------------------------------------------------
+# get_all_tools
+# ------------------------------------------------------------------
+
+def test_get_all_tools_empty():
+    assert get_all_tools() == []
+
+
+def test_get_all_tools():
+    register_tool(_make_echo_tool("a"))
+    register_tool(_make_echo_tool("b"))
+    names = [t.name for t in get_all_tools()]
+    assert sorted(names) == ["a", "b"]
+
+
+# ------------------------------------------------------------------
+# get_tool_schemas
+# ------------------------------------------------------------------
+
+def test_get_tool_schemas():
+    register_tool(_make_echo_tool("echo"))
+    schemas = get_tool_schemas()
+    assert len(schemas) == 1
+    assert schemas[0]["name"] == "echo"
+
+
+# ------------------------------------------------------------------
+# execute_tool
+# ------------------------------------------------------------------
+
+def test_execute_tool():
+    register_tool(_make_echo_tool())
+    result = execute_tool("echo", {"text": "hello"}, config={})
+    assert result == "hello"
+
+
+def test_execute_unknown_tool():
+    result = execute_tool("missing", {}, config={})
+    assert "unknown" in result.lower() or "not found" in result.lower()
+
+
+# ------------------------------------------------------------------
+# output truncation
+# ------------------------------------------------------------------
+
+def test_output_truncation():
+    def big_func(params: dict, config: dict) -> str:
+        return "x" * 100
+
+    tool = ToolDef(
+        name="big",
+        schema={"name": "big", "description": "big", "input_schema": {"type": "object", "properties": {}}},
+        func=big_func,
+        read_only=True,
+        concurrent_safe=True,
+    )
+    register_tool(tool)
+
+    result = execute_tool("big", {}, config={}, max_output=40)
+    # first half = 20 chars, last quarter = 10 chars, marker in between
+    assert len(result) < 100
+    assert "truncated" in result
+    # The kept portion: first 20 + last 10 should be present
+    assert result.startswith("x" * 20)
+    assert result.endswith("x" * 10)
+
+
+def test_no_truncation_when_within_limit():
+    register_tool(_make_echo_tool())
+    result = execute_tool("echo", {"text": "short"}, config={})
+    assert result == "short"
+
+
+# ------------------------------------------------------------------
+# duplicate register overwrites
+# ------------------------------------------------------------------
+
+def test_duplicate_register_overwrites():
+    register_tool(_make_echo_tool("dup"))
+
+    def new_func(params: dict, config: dict) -> str:
+        return "new"
+
+    replacement = ToolDef(
+        name="dup",
+        schema={"name": "dup", "description": "new", "input_schema": {"type": "object", "properties": {}}},
+        func=new_func,
+        read_only=False,
+        concurrent_safe=False,
+    )
+    register_tool(replacement)
+
+    assert len(get_all_tools()) == 1
+    result = execute_tool("dup", {}, config={})
+    assert result == "new"
--- a/nano-claude-code/tool_registry.py
+++ b/nano-claude-code/tool_registry.py
@@ -0,0 +1,98 @@
+"""Tool plugin registry for nano-claude-code.
+
+Provides a central registry for tool definitions, lookup, schema export,
+and dispatch with output truncation.
+"""
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any, Callable, Dict, List, Optional
+
+
+@dataclass
+class ToolDef:
+    """Definition of a single tool plugin.
+
+    Attributes:
+        name: unique tool identifier
+        schema: JSON-schema dict sent to the API (name, description, input_schema)
+        func: callable(params: dict, config: dict) -> str
+        read_only: True if the tool never mutates state
+        concurrent_safe: True if safe to run in parallel with other tools
+    """
+    name: str
+    schema: Dict[str, Any]
+    func: Callable[[Dict[str, Any], Dict[str, Any]], str]
+    read_only: bool = False
+    concurrent_safe: bool = False
+
+
+# --------------- internal state ---------------
+
+_registry: Dict[str, ToolDef] = {}
+
+
+# --------------- public API ---------------
+
+def register_tool(tool_def: ToolDef) -> None:
+    """Register a tool, overwriting any existing tool with the same name."""
+    _registry[tool_def.name] = tool_def
+
+
+def get_tool(name: str) -> Optional[ToolDef]:
+    """Look up a tool by name. Returns None if not found."""
+    return _registry.get(name)
+
+
+def get_all_tools() -> List[ToolDef]:
+    """Return all registered tools (insertion order)."""
+    return list(_registry.values())
+
+
+def get_tool_schemas() -> List[Dict[str, Any]]:
+    """Return the schemas of all registered tools (for API tool parameter)."""
+    return [t.schema for t in _registry.values()]
+
+
+def execute_tool(
+    name: str,
+    params: Dict[str, Any],
+    config: Dict[str, Any],
+    max_output: int = 32000,
+) -> str:
+    """Dispatch a tool call by name.
+
+    Args:
+        name: tool name
+        params: tool input parameters dict
+        config: runtime configuration dict
+        max_output: maximum allowed output length in characters
+
+    Returns:
+        Tool result string, possibly truncated.
+    """
+    tool = get_tool(name)
+    if tool is None:
+        return f"Error: tool '{name}' not found."
+
+    try:
+        result = tool.func(params, config)
+    except Exception as e:
+        return f"Error executing {name}: {e}"
+
+    if len(result) > max_output:
+        first_half = max_output // 2
+        last_quarter = max_output // 4
+        truncated = len(result) - first_half - last_quarter
+        result = (
+            result[:first_half]
+            + f"\n[... {truncated} chars truncated ...]\n"
+            + result[len(result) - last_quarter:]
+        )
+
+    return result
+
+
+def clear_registry() -> None:
+    """Remove all registered tools. Intended for testing."""
+    _registry.clear()
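The truncation step in `execute_tool` keeps the first half and the last quarter of the character budget and replaces the middle with a marker. The arithmetic can be checked in isolation with a small standalone sketch. `truncate_middle` is a hypothetical name, and the explicit end index is a choice made here so that a zero-length tail really is empty.

```python
def truncate_middle(text: str, max_output: int) -> str:
    """Keep the first half and last quarter of the budget (illustrative)."""
    if len(text) <= max_output:
        return text
    head = max_output // 2
    tail = max_output // 4
    dropped = len(text) - head - tail
    # Slice from an explicit index: text[-0:] would return the whole string.
    return (
        text[:head]
        + f"\n[... {dropped} chars truncated ...]\n"
        + text[len(text) - tail:]
    )


if __name__ == "__main__":
    print(truncate_middle("x" * 100, 40))
```

With 100 characters and a budget of 40, the head is 20 characters, the tail is 10, and the marker reports 70 dropped characters. The marker itself is not counted against the budget, so the returned string can be somewhat longer than `max_output`.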
--- a/nano-claude-code/tools.py
+++ b/nano-claude-code/tools.py
@@ -2,10 +2,14 @@
 import os
 import re
 import glob as _glob
+import difflib
 import subprocess
 from pathlib import Path
 from typing import Callable, Optional

+from tool_registry import ToolDef, register_tool
+from tool_registry import execute_tool as _registry_execute
+
 # ── Tool JSON schemas (sent to Claude API) ─────────────────────────────────

 TOOL_SCHEMAS = [
@@ -142,6 +146,24 @@ def _is_safe_bash(cmd: str) -> bool:
    return any(c.startswith(p) for p in _SAFE_PREFIXES)


+# ── Diff helpers ──────────────────────────────────────────────────────────
+
+def generate_unified_diff(old: str, new: str, filename: str, context_lines: int = 3) -> str:
+    old_lines = old.splitlines(keepends=True)
+    new_lines = new.splitlines(keepends=True)
+    diff = difflib.unified_diff(old_lines, new_lines,
+        fromfile=f"a/{filename}", tofile=f"b/{filename}", n=context_lines)
+    return "".join(diff)
+
+def maybe_truncate_diff(diff_text: str, max_lines: int = 80) -> str:
+    lines = diff_text.splitlines()
+    if len(lines) <= max_lines:
+        return diff_text
+    shown = lines[:max_lines]
+    remaining = len(lines) - max_lines
+    return "\n".join(shown) + f"\n\n[... {remaining} more lines ...]"
+
+
 # ── Tool implementations ───────────────────────────────────────────────────

 def _read(file_path: str, limit: int = None, offset: int = None) -> str:
@@ -164,10 +186,19 @@ def _read(file_path: str, limit: int = None, offset: int = None) -> str:
 def _write(file_path: str, content: str) -> str:
    p = Path(file_path)
    try:
+        is_new = not p.exists()
+        old_content = "" if is_new else p.read_text(errors="replace")
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content)
-        lc = content.count("\n") + (1 if content and not content.endswith("\n") else 0)
-        return f"Wrote {lc} lines to {file_path}"
+        if is_new:
+            lc = content.count("\n") + (1 if content and not content.endswith("\n") else 0)
+            return f"Created {file_path} ({lc} lines)"
+        filename = p.name
+        diff = generate_unified_diff(old_content, content, filename)
+        if not diff:
+            return f"No changes in {file_path}"
+        truncated = maybe_truncate_diff(diff)
+        return f"File updated — {file_path}:\n\n{truncated}"
    except Exception as e:
        return f"Error: {e}"

@@ -184,10 +215,13 @@ def _edit(file_path: str, old_string: str, new_string: str, replace_all: bool =
        if count > 1 and not replace_all:
            return (f"Error: old_string appears {count} times. "
                    "Provide more context to make it unique, or use replace_all=true.")
+        old_content = content
        new_content = content.replace(old_string, new_string) if replace_all else \
                      content.replace(old_string, new_string, 1)
        p.write_text(new_content)
-        return f"Replaced {'all ' + str(count) if replace_all else '1'} occurrence(s) in {file_path}"
+        filename = p.name
+        diff = generate_unified_diff(old_content, new_content, filename)
+        return f"Changes applied to {filename}:\n\n{maybe_truncate_diff(diff)}"
    except Exception as e:
        return f"Error: {e}"

@@ -299,15 +333,22 @@ def _websearch(query: str) -> str:
        return f"Error: {e}"


-# ── Dispatcher ─────────────────────────────────────────────────────────────
+# ── Dispatcher (backward-compatible wrapper) ──────────────────────────────

 def execute_tool(
    name: str,
    inputs: dict,
    permission_mode: str = "auto",
    ask_permission: Optional[Callable[[str], bool]] = None,
+    config: Optional[dict] = None,
 ) -> str:
-    """Dispatch tool execution; ask permission for write/destructive ops."""
+    """Dispatch tool execution; ask permission for write/destructive ops.
+
+    Write, Edit, and Bash are permission-gated here (Bash only when the
+    command is not on the safe-prefix list); every allowed call is then
+    delegated to the registry. The config dict is forwarded to tool functions
+    so they can access runtime context such as _depth, _system_prompt, and model.
+    cfg = config or {}

    def _check(desc: str) -> bool:
        """Return True if action is allowed."""
@@ -317,43 +358,110 @@ def execute_tool(
            return ask_permission(desc)
        return True  # headless: allow everything

-    if name == "Read":
-        return _read(inputs["file_path"], inputs.get("limit"), inputs.get("offset"))
-
-    elif name == "Write":
+    # --- permission gate ---
+    if name == "Write":
        if not _check(f"Write to {inputs['file_path']}"):
            return "Denied: user rejected write operation"
-        return _write(inputs["file_path"], inputs["content"])
-
    elif name == "Edit":
        if not _check(f"Edit {inputs['file_path']}"):
            return "Denied: user rejected edit operation"
-        return _edit(inputs["file_path"], inputs["old_string"],
-                     inputs["new_string"], inputs.get("replace_all", False))
-
    elif name == "Bash":
        cmd = inputs["command"]
        if permission_mode != "accept-all" and not _is_safe_bash(cmd):
            if not _check(f"Bash: {cmd}"):
                return "Denied: user rejected bash command"
-        return _bash(cmd, inputs.get("timeout", 30))

-    elif name == "Glob":
-        return _glob(inputs["pattern"], inputs.get("path"))
+    return _registry_execute(name, inputs, cfg)

-    elif name == "Grep":
-        return _grep(
-            inputs["pattern"], inputs.get("path"), inputs.get("glob"),
-            inputs.get("output_mode", "files_with_matches"),
-            inputs.get("case_insensitive", False),
-            inputs.get("context", 0),
-        )

-    elif name == "WebFetch":
-        return _webfetch(inputs["url"], inputs.get("prompt"))
+# ── Register built-in tools with the plugin registry ─────────────────────

-    elif name == "WebSearch":
-        return _websearch(inputs["query"])
+def _register_builtins() -> None:
+    """Register all 8 built-in tools into the central registry."""
+    _tool_defs = [
+        ToolDef(
+            name="Read",
+            schema=TOOL_SCHEMAS[0],
+            func=lambda p, c: _read(**p),
+            read_only=True,
+            concurrent_safe=True,
+        ),
+        ToolDef(
+            name="Write",
+            schema=TOOL_SCHEMAS[1],
+            func=lambda p, c: _write(**p),
+            read_only=False,
+            concurrent_safe=False,
+        ),
+        ToolDef(
+            name="Edit",
+            schema=TOOL_SCHEMAS[2],
+            func=lambda p, c: _edit(**p),
+            read_only=False,
+            concurrent_safe=False,
+        ),
+        ToolDef(
+            name="Bash",
+            schema=TOOL_SCHEMAS[3],
+            func=lambda p, c: _bash(p["command"], p.get("timeout", 30)),
+            read_only=False,
+            concurrent_safe=False,
+        ),
+        ToolDef(
+            name="Glob",
+            schema=TOOL_SCHEMAS[4],
+            func=lambda p, c: _glob(p["pattern"], p.get("path")),
+            read_only=True,
+            concurrent_safe=True,
+        ),
+        ToolDef(
+            name="Grep",
+            schema=TOOL_SCHEMAS[5],
+            func=lambda p, c: _grep(
+                p["pattern"], p.get("path"), p.get("glob"),
+                p.get("output_mode", "files_with_matches"),
+                p.get("case_insensitive", False),
+                p.get("context", 0),
+            ),
+            read_only=True,
+            concurrent_safe=True,
+        ),
+        ToolDef(
+            name="WebFetch",
+            schema=TOOL_SCHEMAS[6],
+            func=lambda p, c: _webfetch(p["url"], p.get("prompt")),
+            read_only=True,
+            concurrent_safe=True,
+        ),
+        ToolDef(
+            name="WebSearch",
+            schema=TOOL_SCHEMAS[7],
+            func=lambda p, c: _websearch(p["query"]),
+            read_only=True,
+            concurrent_safe=True,
+        ),
+    ]
+    for td in _tool_defs:
+        register_tool(td)

-    else:
-        return f"Unknown tool: {name}"
+
+_register_builtins()
+
+
+# ── Memory tools (MemorySave, MemoryDelete, MemorySearch, MemoryList) ────────
+# Defined in memory/tools.py; importing registers them automatically.
+import memory.tools as _memory_tools  # noqa: F401
+
+
+
+# ── Multi-agent tools (Agent, SendMessage, CheckAgentResult, ListAgentTasks, ListAgentTypes) ──
+# Defined in multi_agent/tools.py; importing registers them automatically.
+import multi_agent.tools as _multiagent_tools  # noqa: F401
+
+# Re-export get_agent_manager as _get_agent_manager for backward compatibility
+from multi_agent.tools import get_agent_manager as _get_agent_manager  # noqa: F401
+
+
+# ── Skill tools (Skill, SkillList) ────────────────────────────────────────
+# Defined in skill/tools.py; importing registers them automatically.
+import skill.tools as _skill_tools  # noqa: F401