Update README.MD and add nano-claude-code v3.0 + original-source-code/src

- README.MD: add original-source-code and nano-claude-code sections, update overview table (4 subprojects), add v3.0 news entry, expand comparison table with memory/multi-agent/skills dimensions
- nano-claude-code v3.0: multi-agent package (multi_agent/), memory package (memory/), skill package (skill/) with built-in /commit and /review skills, context compression (compaction.py), tool registry plugin system, diff view, 17 slash commands, 18 built-in tools, 101 tests (~5000 lines total)
- original-source-code/src: add raw TypeScript source tree (1884 files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
nano-claude-code/docs/architecture.md
@@ -0,0 +1,374 @@
|
||||
# Architecture Guide
|
||||
|
||||
This document is for developers who want to understand, modify, or extend nano-claude-code.
|
||||
For user-facing docs, see [README.md](../README.md).
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Nano-claude-code is a ~3.4K-line Python CLI that lets LLMs (GPT, Gemini, etc.) operate as
coding agents with tool use, memory, sub-agents, and skills. The architecture is a flat
module layout designed for readability and future migration to a package structure.
|
||||
|
||||
```
User Input
    │
    ▼
nano_claude.py ── REPL, slash commands, rendering
    │
    ├──► agent.py ── multi-turn loop, permission gates
    │       │
    │       ├──► providers.py ── API streaming (Anthropic / OpenAI-compat)
    │       ├──► tool_registry.py ──► tools.py ── 13 tools
    │       ├──► compaction.py ── context window management
    │       └──► subagent.py ── threaded sub-agent lifecycle
    │
    ├──► context.py ── system prompt (git, CLAUDE.md, memory)
    │       └──► memory.py ── persistent file-based memory
    │
    ├──► skills.py ── markdown skill loading + execution
    └──► config.py ── configuration persistence
```
|
||||
|
||||
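**Key invariant:** Dependencies flow downward, and there are no circular imports at the
module level. The one upward call, `subagent.py` invoking `agent.py`, goes through a lazy
(deferred) import so that loading either module never triggers a cycle.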
|
||||
|
||||
---
|
||||
|
||||
## Module Reference
|
||||
|
||||
### `tool_registry.py` — Tool Plugin System
|
||||
|
||||
The central registry that all tools register into. This is the foundation for extensibility.
|
||||
|
||||
**Data model:**
|
||||
|
||||
```python
@dataclass
class ToolDef:
    name: str              # unique identifier (e.g. "Read", "MemorySave")
    schema: dict           # JSON schema sent to the LLM API
    func: Callable         # (params: dict, config: dict) -> str
    read_only: bool        # True = auto-approve in 'auto' permission mode
    concurrent_safe: bool  # True = safe to run in parallel (for sub-agents)
```
|
||||
|
||||
**Public API:**
|
||||
|
||||
| Function | Description |
|---|---|
| `register_tool(tool_def)` | Add a tool to the registry (overwrites by name) |
| `get_tool(name)` | Look up by name, returns `None` if not found |
| `get_all_tools()` | List all registered tools |
| `get_tool_schemas()` | Return schemas for API calls |
| `execute_tool(name, params, config, max_output=32000)` | Execute with output truncation |
| `clear_registry()` | Reset (for testing only) |
|
||||
|
||||
**Output truncation:** If a tool returns more than `max_output` characters, the result is
cut down to its first `max_output // 2` characters and its last `max_output // 4` characters,
joined by a `[... N chars truncated ...]` marker. This prevents a single tool call
(e.g. reading a huge file) from blowing up the context window.
|
||||
|
||||
**Registering a custom tool:**
|
||||
|
||||
```python
from tool_registry import ToolDef, register_tool

def my_tool(params, config):
    return f"Hello, {params['name']}!"

register_tool(ToolDef(
    name="MyTool",
    schema={
        "name": "MyTool",
        "description": "A greeting tool",
        "input_schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
    },
    func=my_tool,
    read_only=True,
    concurrent_safe=True,
))
```
|
||||
|
||||
### `tools.py` — Built-in Tool Implementations
|
||||
|
||||
Contains the 8 core tools (Read, Write, Edit, Bash, Glob, Grep, WebFetch, WebSearch)
|
||||
plus memory tools (MemorySave, MemoryDelete) and sub-agent tools (Agent, CheckAgentResult,
|
||||
ListAgentTasks). All register themselves via `tool_registry` at import time.
|
||||
|
||||
**Key internals:**
|
||||
|
||||
- `_is_safe_bash(cmd)` — whitelist of safe shell commands for auto-approval
|
||||
- `generate_unified_diff(old, new, filename)` — diff generation for Edit/Write
|
||||
- `maybe_truncate_diff(diff_text, max_lines=80)` — truncate large diffs for display
|
||||
- `_get_agent_manager()` — lazy singleton for SubAgentManager
|
||||
- Backward-compatible `execute_tool(name, inputs, permission_mode, ask_permission)` wrapper
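
To make the auto-approval idea concrete, here is a minimal sketch of a whitelist check in the spirit of `_is_safe_bash`. The `SAFE_COMMANDS` set, the `UNSAFE_CHARS` set, and the function name `is_safe_bash` are assumptions made for illustration; the actual whitelist in `tools.py` may be larger and may use different rules.

```python
import shlex

# Hypothetical whitelist; the real set in tools.py may differ.
SAFE_COMMANDS = {"ls", "pwd", "cat", "head", "tail", "wc", "echo"}
# Characters that enable chaining, redirection, or substitution.
UNSAFE_CHARS = set(";&|><`$\n")


def is_safe_bash(cmd: str) -> bool:
    """Return True only for one whitelisted command with no shell operators."""
    if any(ch in UNSAFE_CHARS for ch in cmd):
        return False
    try:
        tokens = shlex.split(cmd)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(tokens) and tokens[0] in SAFE_COMMANDS
```

Rejecting every command that contains a shell operator is deliberately conservative: a false negative only costs the user one extra confirmation prompt, while a false positive would run an unreviewed command.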
|
||||
|
||||
### `agent.py` — Core Agent Loop
|
||||
|
||||
The heart of the system. `run()` is a generator that yields events as they happen.
|
||||
|
||||
```python
|
||||
def run(user_message, state, config, system_prompt,
|
||||
depth=0, cancel_check=None) -> Generator:
|
||||
```
|
||||
|
||||
**Loop logic:**
|
||||
|
||||
```
1. Append user message
2. Inject depth into config (for sub-agent depth tracking)
3. While True:
   a. Check cancel_check(): cooperative cancellation for sub-agents
   b. maybe_compact(state, config): compress if near context limit
   c. Stream from provider → yield TextChunk / ThinkingChunk
   d. Record assistant message
   e. If no tool_calls → break
   f. For each tool_call:
      - Permission check (_check_permission)
      - If not auto-approved → yield PermissionRequest → user decides
      - Execute tool → yield ToolStart / ToolEnd
      - Append tool result
   g. Loop (model sees tool results and responds)
```
|
||||
|
||||
**Event types:**
|
||||
|
||||
| Event | Fields | When |
|---|---|---|
| `TextChunk` | `text` | Streaming text delta |
| `ThinkingChunk` | `text` | Extended thinking block |
| `ToolStart` | `name, inputs` | Before tool execution |
| `ToolEnd` | `name, result, permitted` | After tool execution |
| `PermissionRequest` | `description, granted` | Needs user approval |
| `TurnDone` | `input_tokens, output_tokens` | End of one API turn |
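
As an illustration of how a caller might consume these events, the sketch below defines a subset of them as dataclasses, using the field names from the table, and folds a stream into display text. The `render` helper and its output format are hypothetical and are not taken from `nano_claude.py`.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass
class TextChunk:
    text: str


@dataclass
class ToolStart:
    name: str
    inputs: dict


@dataclass
class ToolEnd:
    name: str
    result: str
    permitted: bool


@dataclass
class TurnDone:
    input_tokens: int
    output_tokens: int


Event = Union[TextChunk, ToolStart, ToolEnd, TurnDone]


def render(events: Iterable[Event]) -> str:
    """Concatenate streamed text and summarize tool activity (hypothetical format)."""
    out = []
    for ev in events:
        if isinstance(ev, TextChunk):
            out.append(ev.text)
        elif isinstance(ev, ToolStart):
            out.append(f"\n[tool {ev.name} started]\n")
        elif isinstance(ev, ToolEnd):
            status = "ok" if ev.permitted else "denied"
            out.append(f"[tool {ev.name} {status}]\n")
        elif isinstance(ev, TurnDone):
            out.append(f"\n({ev.input_tokens} in / {ev.output_tokens} out)")
    return "".join(out)
```

Because `run()` is a generator, a consumer like this can print each chunk as it arrives rather than waiting for the whole turn.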
|
||||
|
||||
### `compaction.py` — Context Window Management
|
||||
|
||||
Keeps conversations within model context limits using two layers.
|
||||
|
||||
**Layer 1: Snip** (`snip_old_tool_results`)
|
||||
- Rule-based, no API cost
|
||||
- Truncates tool-role messages older than `preserve_last_n_turns` (default 6)
|
||||
- Keeps first half + last quarter of the content
|
||||
|
||||
**Layer 2: Auto-Compact** (`compact_messages`)
|
||||
- Model-driven: calls the current model to summarize old messages
|
||||
- Splits messages into [old | recent] at ~70/30 ratio
|
||||
- Replaces old messages with a summary + acknowledgment
|
||||
|
||||
**Trigger:** `maybe_compact()` checks `estimate_tokens(messages) > context_limit * 0.7`.
When the estimate is over the threshold, it runs snip first (cheap) and falls back to
auto-compact only if the conversation is still over.

**Token estimation:** `len(content) / 3.5`, a simple characters-per-token heuristic. It is an
approximation rather than a real tokenizer count, but it is cheap and model-agnostic.
`get_context_limit(model)` reads the limit from the provider registry.
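
The heuristic and the trigger can be sketched in a few lines. This is a simplified illustration, assuming every message stores its text under a `content` key; `should_compact` is a hypothetical helper name, and the real `estimate_tokens` may also count tool-call payloads.

```python
def estimate_tokens(messages: list[dict]) -> int:
    """Approximate the token count as total content characters / 3.5."""
    total_chars = sum(len(str(m.get("content", ""))) for m in messages)
    return int(total_chars / 3.5)


def should_compact(messages: list[dict], context_limit: int,
                   ratio: float = 0.7) -> bool:
    """True when the estimate exceeds `ratio` of the model's context limit."""
    return estimate_tokens(messages) > context_limit * ratio
```

For example, a single 700-character message is estimated at 200 tokens, which would trigger compaction for a 250-token limit (threshold 175) but not for a 300-token limit (threshold 210).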
|
||||
|
||||
### `memory.py` — Persistent Memory
|
||||
|
||||
File-based memory system stored in `~/.nano_claude/memory/`.
|
||||
|
||||
**Storage format:**
|
||||
|
||||
```
|
||||
~/.nano_claude/memory/
|
||||
├── MEMORY.md # Index: one line per memory
|
||||
├── user_preferences.md # Individual memory file
|
||||
└── project_auth.md
|
||||
```
|
||||
|
||||
Each memory file uses markdown with YAML frontmatter:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: user preferences
|
||||
description: coding style preferences
|
||||
type: feedback
|
||||
created: 2026-04-02
|
||||
---
|
||||
|
||||
User prefers 4-space indentation and type hints.
|
||||
```
|
||||
|
||||
**How it integrates:**
|
||||
- `get_memory_context()` returns the MEMORY.md index text
|
||||
- `context.py` injects this into the system prompt
|
||||
- The model reads the index, then uses `Read` tool to access full memory content
|
||||
- The model uses `MemorySave` / `MemoryDelete` tools to manage memories
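
The save-then-index flow above can be sketched as follows. This is a minimal illustration, not the actual `memory.py`: the `root` parameter, the slug rule for file names, and the `- [name](file): description` index line format are all assumptions.

```python
from pathlib import Path


def save_memory(root: Path, name: str, description: str, mem_type: str,
                content: str, created: str) -> Path:
    """Write one memory file with frontmatter and refresh its MEMORY.md line."""
    root.mkdir(parents=True, exist_ok=True)
    slug = name.strip().lower().replace(" ", "_")  # assumed naming rule
    path = root / f"{slug}.md"
    path.write_text(
        f"---\nname: {name}\ndescription: {description}\n"
        f"type: {mem_type}\ncreated: {created}\n---\n\n{content}\n",
        encoding="utf-8",
    )
    index = root / "MEMORY.md"
    lines = index.read_text(encoding="utf-8").splitlines() if index.exists() else []
    entry = f"- [{name}]({path.name}): {description}"  # assumed index format
    # Drop any previous line for the same file, then append the new one.
    lines = [ln for ln in lines if f"({path.name})" not in ln] + [entry]
    index.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def get_memory_context(root: Path) -> str:
    """Return the index text for system prompt injection ('' if none exists)."""
    index = root / "MEMORY.md"
    return index.read_text(encoding="utf-8") if index.exists() else ""
```

Keeping only the one-line index in the system prompt keeps the prompt small; the model pays for a full memory file only when it chooses to `Read` it.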
|
||||
|
||||
### `subagent.py` — Threaded Sub-Agents
|
||||
|
||||
Sub-agents run in background threads via `ThreadPoolExecutor`.
|
||||
|
||||
**Key design decisions:**
|
||||
|
||||
1. **Fresh context** — each sub-agent starts with empty message history + task prompt
|
||||
2. **Depth limiting** — `max_depth=3`, checked at spawn time. Model gets an error message
|
||||
(not silent tool removal) so it can adapt.
|
||||
3. **Cooperative cancellation** — `cancel_check` callable checked each loop iteration.
|
||||
Python threads can't be killed safely, so we set a flag.
|
||||
4. **Threading, not asyncio** — the entire codebase is synchronous generators. Threading
|
||||
via `concurrent.futures` keeps things simple. The SubAgentManager API is designed to
|
||||
be compatible with a future async migration.
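
The cooperative cancellation pattern from point 3 can be shown in isolation. In this sketch `worker` is a stand-in for the agent loop, and a `threading.Event` plays the role of the cancel flag; the actual `subagent.py` may store the flag differently (for example as a plain boolean on the task).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def worker(cancel_check, max_steps: int = 1000) -> str:
    """Stand-in for the agent loop: polls the flag once per iteration."""
    steps = 0
    while steps < max_steps:
        if cancel_check():
            return f"cancelled after {steps} steps"
        time.sleep(0.01)  # simulated work for one loop iteration
        steps += 1
    return "completed"


cancel_flag = threading.Event()
with ThreadPoolExecutor(max_workers=3) as pool:
    future = pool.submit(worker, cancel_flag.is_set)
    time.sleep(0.05)   # let the worker run a few iterations
    cancel_flag.set()  # request cancellation; the worker exits on its next check
    result = future.result(timeout=5)
```

Cancellation latency is bounded by the length of one iteration, so a sub-agent blocked in a long API call or tool execution only stops once that step returns.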
|
||||
|
||||
**Lifecycle:**
|
||||
|
||||
```
|
||||
spawn(prompt, config, system_prompt, depth)
|
||||
→ Creates SubAgentTask
|
||||
→ Submits _run to ThreadPoolExecutor
|
||||
→ _run calls agent.run() with depth+1
|
||||
|
||||
wait(task_id, timeout) → blocks until complete
|
||||
cancel(task_id) → sets _cancel_flag
|
||||
get_result(task_id) → returns result string
|
||||
```
|
||||
|
||||
### `skills.py` — Reusable Prompt Templates
|
||||
|
||||
Skills are markdown files with frontmatter. They are **not code** — just structured prompts
|
||||
that get injected into the agent loop.
|
||||
|
||||
**Skill file format:**
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: commit
|
||||
description: Create a conventional commit
|
||||
triggers: ["/commit"]
|
||||
tools: [Bash, Read]
|
||||
---
|
||||
|
||||
Your prompt instructions here...
|
||||
```
|
||||
|
||||
**Execution:** `execute_skill()` wraps the skill prompt as a user message and calls
|
||||
`agent.run()`. The skill runs through the exact same agent loop as a normal query.
|
||||
|
||||
**Search order:** Project-level (`./.nano_claude/skills/`) overrides user-level
|
||||
(`~/.nano_claude/skills/`) when skill names collide.
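
One way to implement this override rule is to scan the directories in priority order and keep the first file seen for each name. The sketch below is an illustration under the assumption that a skill's name equals its file stem; the real loader reads `name` from the frontmatter, and `discover_skills` is a hypothetical helper.

```python
from pathlib import Path


def discover_skills(search_paths: list[Path]) -> dict[str, Path]:
    """Map skill name -> file, letting earlier search paths win on collisions."""
    found: dict[str, Path] = {}
    for directory in search_paths:
        if not directory.is_dir():
            continue  # a missing skills directory is not an error
        for path in sorted(directory.glob("*.md")):
            # setdefault keeps the entry from the higher-priority directory
            found.setdefault(path.stem, path)
    return found
```

Called with the project-level directory first and the user-level directory second, a project `commit.md` shadows the user's `commit.md`, while skills that exist only at user level remain available.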
|
||||
|
||||
### `providers.py` — Multi-Provider Abstraction
|
||||
|
||||
Two streaming adapters cover all providers:
|
||||
|
||||
| Adapter | Providers |
|---|---|
| `stream_anthropic()` | Anthropic (native SDK) |
| `stream_openai_compat()` | OpenAI, Gemini, Kimi, Qwen, Zhipu, DeepSeek, Ollama, LM Studio, Custom |
|
||||
|
||||
**Neutral message format** (provider-independent):
|
||||
|
||||
```python
|
||||
{"role": "user", "content": "..."}
|
||||
{"role": "assistant", "content": "...", "tool_calls": [{"id": "...", "name": "...", "input": {...}}]}
|
||||
{"role": "tool", "tool_call_id": "...", "name": "...", "content": "..."}
|
||||
```
|
||||
|
||||
Conversion functions: `messages_to_anthropic()`, `messages_to_openai()`, `tools_to_openai()`.
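
To illustrate what such a conversion involves, here is a minimal sketch of mapping the neutral format to OpenAI-style chat messages, where tool calls are nested under `function` and arguments are JSON strings. It is a simplified stand-in, not the project's `messages_to_openai()`, and it ignores provider extras such as `extra_content`.

```python
import json


def messages_to_openai(messages: list[dict]) -> list[dict]:
    """Convert neutral messages to OpenAI chat-completions message dicts."""
    out = []
    for m in messages:
        if m["role"] == "assistant" and m.get("tool_calls"):
            out.append({
                "role": "assistant",
                "content": m.get("content") or None,
                "tool_calls": [
                    {
                        "id": tc["id"],
                        "type": "function",
                        "function": {
                            "name": tc["name"],
                            # OpenAI expects arguments as a JSON-encoded string
                            "arguments": json.dumps(tc["input"]),
                        },
                    }
                    for tc in m["tool_calls"]
                ],
            })
        elif m["role"] == "tool":
            # The neutral "name" field is dropped; the call id links the result
            out.append({"role": "tool", "tool_call_id": m["tool_call_id"],
                        "content": m["content"]})
        else:
            out.append({"role": m["role"], "content": m["content"]})
    return out
```

Keeping one neutral format in `state.messages` means the conversation history stays valid even if the user switches providers mid-session; only the adapter-side conversion changes.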
|
||||
|
||||
**Provider-specific handling:**
|
||||
- Gemini 3 models require `thought_signature` in tool call responses — this is transparently
|
||||
captured and passed through via `extra_content` on tool_call dicts.
|
||||
|
||||
### `context.py` — System Prompt Builder
|
||||
|
||||
Assembles the system prompt from:
|
||||
1. Base template (role, date, cwd, platform)
|
||||
2. Git info (branch, status, recent commits)
|
||||
3. CLAUDE.md content (project-level + global)
|
||||
4. Memory index (from `memory.get_memory_context()`)
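
The assembly order above can be sketched as a pure function that takes the already-collected pieces as arguments. The base template wording, the section headings, and the parameter names are assumptions for illustration; the real `context.py` gathers git info and CLAUDE.md content itself.

```python
import platform
from datetime import date
from pathlib import Path


def build_system_prompt(git_info: str = "", claude_md: str = "",
                        memory_index: str = "") -> str:
    """Join the base template with whichever optional sections are non-empty."""
    sections = [
        "You are a coding agent.\n"
        f"Date: {date.today().isoformat()}\n"
        f"Working directory: {Path.cwd()}\n"
        f"Platform: {platform.system()}"
    ]
    optional = (("Git", git_info), ("CLAUDE.md", claude_md), ("Memory", memory_index))
    for title, body in optional:
        if body.strip():  # skip sections with nothing to say
            sections.append(f"## {title}\n{body.strip()}")
    return "\n\n".join(sections)
```

Skipping empty sections keeps the prompt short outside git repositories and in projects without a CLAUDE.md or saved memories.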
|
||||
|
||||
### `config.py` — Configuration
|
||||
|
||||
Defaults stored in `~/.nano_claude/config.json`. Key settings:
|
||||
|
||||
| Key | Default | Description |
|---|---|---|
| `model` | `claude-opus-4-6` | Active model |
| `max_tokens` | `8192` | Max output tokens |
| `permission_mode` | `auto` | Permission mode |
| `max_tool_output` | `32000` | Tool output truncation limit |
| `max_agent_depth` | `3` | Max sub-agent nesting |
| `max_concurrent_agents` | `3` | Thread pool size |
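
A common way to combine such defaults with a persisted file is to start from the defaults and overlay whatever the file contains. The sketch below assumes that behavior; `load_config`, `save_config`, and the explicit `path` argument are hypothetical names, and the actual `config.py` may handle missing or malformed files differently.

```python
import json
from pathlib import Path

# Defaults copied from the table above.
DEFAULTS = {
    "model": "claude-opus-4-6",
    "max_tokens": 8192,
    "permission_mode": "auto",
    "max_tool_output": 32000,
    "max_agent_depth": 3,
    "max_concurrent_agents": 3,
}


def load_config(path: Path) -> dict:
    """Return the defaults overlaid with any keys stored in the JSON file."""
    config = dict(DEFAULTS)  # copy so callers cannot mutate DEFAULTS
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


def save_config(config: dict, path: Path) -> None:
    """Persist the full config as pretty-printed JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
```

With this overlay approach, a config file written by an older version still loads cleanly after new keys are added, because any missing key falls back to its default.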
|
||||
|
||||
---
|
||||
|
||||
## Data Flow Example
|
||||
|
||||
A user asks "Read config.py and change max_tokens to 16384":
|
||||
|
||||
```
|
||||
1. nano_claude.py captures input
|
||||
2. agent.run() appends user message, calls maybe_compact()
|
||||
3. providers.stream() sends to Gemini API with 13 tool schemas
|
||||
4. Model responds: text + tool_call[Read(config.py)]
|
||||
5. agent.py checks permission (Read = read_only → auto-approve)
|
||||
6. tool_registry.execute_tool("Read", ...) → file content (truncated if >32K)
|
||||
7. Tool result appended to messages, loop back to step 3
|
||||
8. Model responds: text + tool_call[Edit(config.py, "8192", "16384")]
|
||||
9. agent.py checks permission (Edit = not read_only → ask user)
|
||||
10. User approves → tools.py._edit() runs, generates diff
|
||||
11. nano_claude.py renders diff with ANSI colors (red/green)
|
||||
12. Tool result appended, loop back to step 3
|
||||
13. Model responds: "Done, max_tokens changed to 16384"
|
||||
14. No tool_calls → loop ends, TurnDone yielded
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
# Run all 78 tests
|
||||
python -m pytest tests/ -v
|
||||
|
||||
# Run specific module tests
|
||||
python -m pytest tests/test_tool_registry.py -v
|
||||
python -m pytest tests/test_compaction.py -v
|
||||
python -m pytest tests/test_memory.py -v
|
||||
python -m pytest tests/test_subagent.py -v
|
||||
python -m pytest tests/test_skills.py -v
|
||||
python -m pytest tests/test_diff_view.py -v
|
||||
```
|
||||
|
||||
Tests use `monkeypatch` and `tmp_path` fixtures to avoid side effects.
|
||||
Sub-agent tests mock `_agent_run` to avoid real API calls.
|
||||
|
||||
---
|
||||
|
||||
## Future: Package Refactoring
|
||||
|
||||
When `tools.py` or `agent.py` grow too large, the flat layout can be migrated to:
|
||||
|
||||
```
|
||||
ncc/
|
||||
├── __init__.py
|
||||
├── repl.py # from nano_claude.py
|
||||
├── agent/
|
||||
│ ├── loop.py # from agent.py
|
||||
│ ├── subagent.py # from subagent.py
|
||||
│ └── compaction.py # from compaction.py
|
||||
├── providers/
|
||||
│ ├── base.py
|
||||
│ ├── openai_compat.py
|
||||
│ └── registry.py
|
||||
├── tools/
|
||||
│ ├── registry.py # from tool_registry.py
|
||||
│ ├── builtin.py # core 8 tools from tools.py
|
||||
│ ├── memory.py # MemorySave/MemoryDelete from tools.py
|
||||
│ └── subagent.py # Agent/Check/List from tools.py
|
||||
├── memory/
|
||||
│ └── store.py # from memory.py
|
||||
├── skills/
|
||||
│ └── loader.py # from skills.py
|
||||
└── config.py
|
||||
```
|
||||
|
||||
The current code is structured to make this migration straightforward:
|
||||
- Modules communicate via function parameters, not globals
|
||||
- Each module has a small public API surface
|
||||
- Dependencies are unidirectional
|
||||
nano-claude-code/docs/logo-v1.png
Binary file not shown. Size: 191 KiB
@@ -0,0 +1,643 @@
|
||||
# Open-CC: Nano Claude Code Enhancement Design
|
||||
|
||||
**Date:** 2026-04-02
|
||||
**Status:** Approved
|
||||
**Target:** GPT-5.4, Gemini 3/3.1 Pro (Claude not in scope)
|
||||
**Code budget:** ~10K lines total (currently ~2.2K)
|
||||
**Constraint:** PR-friendly, mergeable back to nano-claude-code upstream
|
||||
|
||||
---
|
||||
|
||||
## 1. Overview
|
||||
|
||||
Evolve nano-claude-code from a minimal ~2.2K-line reference implementation into a capable AI coding CLI, approaching Claude Code's core functionality while staying lean. Five enhancement areas:
|
||||
|
||||
1. **Context Window Management** (`compaction.py`)
|
||||
2. **Tool System Enhancement** (`tool_registry.py` + `tools.py` refactor)
|
||||
3. **Sub-Agent** (`subagent.py`)
|
||||
4. **Memory System** (`memory.py`)
|
||||
5. **Skills System** (`skills.py`)
|
||||
|
||||
### Strategy
|
||||
|
||||
**Approach A: Layered Enhancement** -- add new modules alongside existing files, minimize changes to existing code. When agent.py grows too complex, refactor into Approach B (package structure under `ncc/`).
|
||||
|
||||
### Design Principles
|
||||
|
||||
- Modules communicate via function parameters / dataclasses, no globals
|
||||
- Each new module exposes 2-3 public functions, internals self-contained
|
||||
- New logic in agent.py grouped by clear `# --- section ---` comments
|
||||
- All code in English (comments, docstrings, commit messages)
|
||||
|
||||
---
|
||||
|
||||
## 2. File Structure
|
||||
|
||||
```
|
||||
nano-claude-code/
|
||||
├── nano_claude.py # REPL -- add /memory, /skill slash commands
|
||||
├── agent.py # Agent loop -- add compaction call + sub-agent dispatch
|
||||
├── providers.py # No changes (already solid)
|
||||
├── tools.py # Refactor: register built-in tools via registry
|
||||
├── context.py # Extend: inject memory context
|
||||
├── config.py # Add new config keys
|
||||
│
|
||||
├── compaction.py # NEW: Context window management
|
||||
├── subagent.py # NEW: Sub-agent lifecycle
|
||||
├── memory.py # NEW: File-based memory system
|
||||
├── skills.py # NEW: Skill loading and execution
|
||||
└── tool_registry.py # NEW: Tool plugin registry
|
||||
```
|
||||
|
||||
### Module Dependency Graph (unidirectional)
|
||||
|
||||
```
|
||||
nano_claude.py
|
||||
├-> agent.py
|
||||
│ ├-> providers.py
|
||||
│ ├-> tool_registry.py -> tools.py (built-in implementations)
|
||||
│ ├-> compaction.py -> providers.py (for summary model call)
|
||||
│ └-> subagent.py (calls agent.py:run recursively)
|
||||
├-> context.py -> memory.py
|
||||
├-> skills.py -> tool_registry.py
|
||||
└-> config.py
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Context Window Management (`compaction.py`)
|
||||
|
||||
Two-layer compression, inspired by Claude Code's three-layer strategy (Layer 3 contextCollapse is experimental, deferred).
|
||||
|
||||
### 3.1 Layer 1: Auto-Compact (model-driven summary)
|
||||
|
||||
Triggered when estimated token count exceeds 70% of model's context limit.
|
||||
|
||||
```python
def compact_messages(messages: list[dict], config: dict) -> list[dict]:
    """
    Split messages into [old | recent].
    Summarize old via model call.
    Return [summary_msg, ack_msg, *recent].
    """
    split_point = find_split_point(messages, keep_ratio=0.3)
    old = messages[:split_point]
    recent = messages[split_point:]
    summary = call_model_for_summary(old, config)
    return [
        {"role": "user", "content": f"[Conversation summary]\n{summary}"},
        {"role": "assistant", "content": "Understood, I have the context."},
        *recent,
    ]
```
|
||||
|
||||
### 3.2 Layer 2: Tool-Result Snipping (rule-based)
|
||||
|
||||
Truncate old tool outputs without model call. Fast and cheap.
|
||||
|
||||
```python
|
||||
def snip_old_tool_results(messages: list[dict], max_chars: int = 2000) -> list[dict]:
|
||||
"""
|
||||
For tool results older than N turns, truncate to max_chars.
|
||||
Preserve first/last lines, add [snipped N chars] marker.
|
||||
"""
|
||||
```
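
A possible body for this stub is sketched below. It is a simplification under stated assumptions: "older than N turns" is approximated by message position (the hypothetical `preserve_last_n` parameter), and the kept head and tail are measured in characters rather than lines, mirroring the first-half plus last-quarter rule used for tool output truncation.

```python
def snip_old_tool_results(messages: list[dict], max_chars: int = 2000,
                          preserve_last_n: int = 6) -> list[dict]:
    """Truncate long tool results outside the most recent messages."""
    cutoff = max(len(messages) - preserve_last_n, 0)
    snipped = []
    for i, msg in enumerate(messages):
        content = msg.get("content")
        is_old_tool = i < cutoff and msg.get("role") == "tool"
        if is_old_tool and isinstance(content, str) and len(content) > max_chars:
            head = content[: max_chars // 2]
            tail_len = max_chars // 4
            tail = content[-tail_len:] if tail_len else ""
            dropped = len(content) - len(head) - len(tail)
            # Build a new dict so the caller's message is left untouched
            msg = {**msg, "content": f"{head}\n[snipped {dropped} chars]\n{tail}"}
        snipped.append(msg)
    return snipped
```

Returning a new list instead of mutating in place lets the caller compare token estimates before and after snipping, and fall back to the model-driven summary only if the cheap pass was not enough.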
|
||||
|
||||
### 3.3 Token Estimation
|
||||
|
||||
```python
|
||||
def estimate_tokens(messages: list[dict]) -> int:
|
||||
"""Use tiktoken for GPT models, chars/3.5 fallback."""
|
||||
|
||||
def get_context_limit(model: str) -> int:
|
||||
"""Return context window size from provider registry."""
|
||||
```
|
||||
|
||||
### 3.4 Integration Point
|
||||
|
||||
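```python
# compaction.py: called from the agent.py run() loop before each API call
def maybe_compact(state: AgentState, config: dict) -> bool:
    """Compress state.messages when they approach the context limit."""
    threshold = get_context_limit(config["model"]) * 0.7
    if estimate_tokens(state.messages) <= threshold:
        return False
    # Cheap rule-based pass first; use the model-driven summary only
    # if the conversation is still over the threshold afterwards.
    state.messages = snip_old_tool_results(state.messages)
    if estimate_tokens(state.messages) > threshold:
        state.messages = compact_messages(state.messages, config)
    return True
```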
|
||||
|
||||
### 3.5 Public API
|
||||
|
||||
```python
|
||||
maybe_compact(state: AgentState, config: dict) -> bool
|
||||
estimate_tokens(messages: list[dict]) -> int
|
||||
get_context_limit(model: str) -> int
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Tool System Enhancement (`tool_registry.py` + `tools.py`)
|
||||
|
||||
### 4.1 Tool Registry
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class ToolDef:
|
||||
name: str
|
||||
schema: dict # JSON schema for parameters
|
||||
func: Callable # (params: dict, config: dict) -> str
|
||||
read_only: bool # True = auto-approve in 'auto' mode
|
||||
concurrent_safe: bool # True = safe for parallel sub-agent use
|
||||
|
||||
_TOOLS: dict[str, ToolDef] = {}
|
||||
|
||||
def register_tool(tool_def: ToolDef) -> None
|
||||
def get_tool(name: str) -> ToolDef | None
|
||||
def get_all_tools() -> list[ToolDef]
|
||||
def get_tool_schemas() -> list[dict]
|
||||
def execute_tool(name: str, params: dict, config: dict) -> str
|
||||
```
|
||||
|
||||
### 4.2 Tool Output Truncation
|
||||
|
||||
Prevent oversized tool outputs (e.g., `cat` large file, `ls -R`) from blowing up context
|
||||
before compaction even gets a chance to run. Applied at the `execute_tool` boundary:
|
||||
|
||||
```python
MAX_TOOL_OUTPUT = 32_000  # ~8K tokens, configurable per tool

def execute_tool(name, params, config):
    tool = get_tool(name)
    if tool is None:
        # Report unknown tools to the model instead of raising AttributeError
        return f"Error: unknown tool '{name}'"
    result = tool.func(params, config)

    # Immediate truncation at source: keep the first half and the last quarter
    if len(result) > MAX_TOOL_OUTPUT:
        head = result[:MAX_TOOL_OUTPUT // 2]
        tail = result[-(MAX_TOOL_OUTPUT // 4):]
        snipped = len(result) - len(head) - len(tail)
        result = f"{head}\n\n[... {snipped} chars truncated ...]\n\n{tail}"

    return result
```
|
||||
|
||||
Additionally, `Bash` tool caps `subprocess` stdout reads to prevent unbounded
|
||||
output (e.g., `cat /dev/urandom`).
|
||||
|
||||
This creates a two-layer defense:
|
||||
- **Layer 0 (here):** hard truncation at tool execution time — prevents oversized messages
|
||||
- **Layer 2 (compaction.py snip):** soft truncation of old tool results — reclaims context space
|
||||
|
||||
### 4.3 Built-in Tools Refactor
|
||||
|
||||
Existing tools.py implementations unchanged. Wrap each with `register_tool()` at module load:
|
||||
|
||||
```python
|
||||
register_tool(ToolDef(
|
||||
name="Read", schema=READ_SCHEMA, func=_read_file,
|
||||
read_only=True, concurrent_safe=True
|
||||
))
|
||||
```
|
||||
|
||||
### 4.4 Permission Logic (unified)
|
||||
|
||||
```python
# agent.py
def _check_permission(tool_name, params, config):
    """Return True to auto-approve, or None to ask the user."""
    if config["permission_mode"] == "accept-all":
        return True
    tool = get_tool(tool_name)
    if tool is None:
        return None  # unknown tool: never auto-approve
    if tool.read_only:
        return True
    if tool_name == "Bash" and _is_safe_command(params["command"]):
        return True
    return None  # ask user
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Sub-Agent (`subagent.py`)
|
||||
|
||||
### 5.1 Data Model
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class SubAgentTask:
|
||||
id: str
|
||||
prompt: str
|
||||
status: str # "pending" | "running" | "completed" | "failed" | "cancelled"
|
||||
messages: list[dict] # independent message history
|
||||
result: str | None
|
||||
model: str | None # optional model override
|
||||
depth: int = 0 # recursion depth counter
|
||||
_cancel_flag: bool = False
|
||||
_future: Future | None = None
|
||||
|
||||
@dataclass
|
||||
class SubAgentManager:
|
||||
tasks: dict[str, SubAgentTask] = field(default_factory=dict)
|
||||
max_concurrent: int = 3
|
||||
max_depth: int = 3
|
||||
_pool: ThreadPoolExecutor = field(default_factory=
|
||||
lambda: ThreadPoolExecutor(max_workers=3))
|
||||
|
||||
def spawn(self, prompt, config, system_prompt, depth=0) -> SubAgentTask
|
||||
def get_result(self, task_id) -> str | None
|
||||
def list_tasks(self) -> list[SubAgentTask]
|
||||
def cancel(self, task_id) -> bool
|
||||
def wait(self, task_id, timeout=None) -> SubAgentTask
|
||||
```
|
||||
|
||||
### 5.2 Execution Model — Threading from Day 1
|
||||
|
||||
Sub-agents run in background threads via `ThreadPoolExecutor`. This enables:
|
||||
- Non-blocking spawn (main agent continues or waits by choice)
|
||||
- Cancellation via cooperative flag
|
||||
- Concurrent sub-agents (up to `max_concurrent`)
|
||||
|
||||
```python
def spawn(self, prompt, config, system_prompt, depth=0):
    task = SubAgentTask(id=uuid4().hex[:8], prompt=prompt, status="running",
                        messages=[], result=None, model=None, depth=depth)
    if depth >= self.max_depth:
        # Fail fast, but still return a fully populated task object
        task.status = "failed"
        task.result = "Error: max sub-agent depth reached."
        return task

    def _run():
        sub_state = AgentState()
        try:
            for event in agent.run(
                prompt, sub_state, config, system_prompt,
                depth=depth + 1,
                cancel_check=lambda: task._cancel_flag,
            ):
                if isinstance(event, TurnDone):
                    # Keep the latest assistant text; more turns may follow
                    task.result = extract_final_text(sub_state.messages)
            # Set the final status only once the loop has actually finished
            task.status = "cancelled" if task._cancel_flag else "completed"
        except Exception as e:
            task.result = f"Error: {e}"
            task.status = "failed"

    task._future = self._pool.submit(_run)
    self.tasks[task.id] = task
    return task
```
|
||||
|
||||
### 5.3 Cooperative Cancellation
|
||||
|
||||
Python threads cannot be killed safely. Instead, `agent.run()` checks a
|
||||
`cancel_check` callable each loop iteration:
|
||||
|
||||
```python
# agent.py run(): new parameters
def run(user_message, state, config, system_prompt,
        depth=0, cancel_check=None):
    ...
    while True:
        if cancel_check and cancel_check():
            return  # clean exit
        for event in stream(...):
            yield event
        ...
```
|
||||
|
||||
### 5.4 Depth Limiting (No Tool Removal)
|
||||
|
||||
Sub-agents CAN call Agent tool (enabling A -> B -> C chains). Depth is
|
||||
passed through, and the Agent tool returns an error at `max_depth`:
|
||||
|
||||
```python
|
||||
def _agent_tool_func(params, config, depth=0):
|
||||
if depth >= manager.max_depth:
|
||||
return ("Error: max sub-agent depth reached. "
|
||||
"Complete this task directly without spawning sub-agents.")
|
||||
return manager.spawn(params["prompt"], config, system_prompt, depth)
|
||||
```
|
||||
|
||||
The model sees the error and adapts — no silent capability removal.
|
||||
|
||||
### 5.5 Context Strategy
|
||||
|
||||
Sub-agent gets **fresh context** (no parent message history):
|
||||
|
||||
```python
|
||||
sub_system_prompt = f"""You are a sub-agent. Your task:
|
||||
{prompt}
|
||||
|
||||
Working directory: {cwd}
|
||||
{memory_context}
|
||||
"""
|
||||
```
|
||||
|
||||
### 5.6 Tool Registration — 3 Tools
|
||||
|
||||
The sub-agent system registers three tools:
|
||||
|
||||
**Agent** — spawn a sub-agent:
|
||||
|
||||
```python
|
||||
AGENT_SCHEMA = {
|
||||
"name": "Agent",
|
||||
"description": "Launch a sub-agent to handle a task independently.",
|
||||
"input_schema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"prompt": {"type": "string", "description": "Task description"},
|
||||
"model": {"type": "string", "description": "Optional model override"},
|
||||
"wait": {"type": "boolean", "default": True,
|
||||
"description": "True = block until done (default). "
|
||||
"False = return task_id immediately."}
|
||||
},
|
||||
"required": ["prompt"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
- `wait=True` (default): spawn + block + return result. Feels synchronous to model.
|
||||
- `wait=False`: spawn + return task_id immediately. Model must use CheckAgentResult later.
|
||||
|
||||
**CheckAgentResult** — poll a background sub-agent:
|
||||
|
||||
```python
|
||||
CHECK_AGENT_RESULT_SCHEMA = {
|
||||
"name": "CheckAgentResult",
|
||||
"description": "Check the result of a background sub-agent task.",
|
||||
"input_schema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"task_id": {"type": "string", "description": "Task ID from Agent tool"}
|
||||
},
|
||||
"required": ["task_id"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Returns: status + result (if completed), or status + "still running".
|
||||
|
||||
**ListAgentTasks** — overview of all sub-agents:
|
||||
|
||||
```python
|
||||
LIST_AGENT_TASKS_SCHEMA = {
|
||||
"name": "ListAgentTasks",
|
||||
"description": "List all sub-agent tasks and their status.",
|
||||
"input_schema": {"type": "object", "properties": {}}
|
||||
}
|
||||
```
|
||||
|
||||
Returns a table of `[id, status, prompt_preview]` for all tasks.
|
||||
|
||||
---
|
||||
|
||||
## 6. Memory System (`memory.py`)
|
||||
|
||||
### 6.1 Storage
|
||||
|
||||
```
|
||||
~/.nano_claude/memory/
|
||||
├── MEMORY.md # Index file (max 200 lines)
|
||||
├── user_role.md # Individual memory files
|
||||
├── feedback_testing.md
|
||||
└── ...
|
||||
```
|
||||
|
||||
Memory file format:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: user role
|
||||
description: user is a data scientist focused on logging
|
||||
type: user
|
||||
created: 2026-04-02
|
||||
---
|
||||
|
||||
User is a data scientist, currently investigating observability/logging.
|
||||
```
|
||||
|
||||
### 6.2 Public API
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class MemoryEntry:
|
||||
name: str
|
||||
description: str
|
||||
type: str # "user" | "feedback" | "project" | "reference"
|
||||
content: str
|
||||
file_path: str
|
||||
created: str
|
||||
|
||||
def load_index() -> list[MemoryEntry]
|
||||
def save_memory(entry: MemoryEntry) -> None
|
||||
def delete_memory(name: str) -> None
|
||||
def search_memory(query: str) -> list[MemoryEntry]
|
||||
def get_memory_context() -> str # for system prompt injection
|
||||
```
|
||||
|
||||
### 6.3 Tool Registration
|
||||
|
||||
Two tools for model-driven memory management:
|
||||
|
||||
- **MemorySave**: `{name, type, description, content}` -> write file + update index
|
||||
- **MemoryDelete**: `{name}` -> remove file + update index
|
||||
|
||||
### 6.4 Context Integration
|
||||
|
||||
`context.py:build_system_prompt()` appends `memory.get_memory_context()` (the MEMORY.md index). Model uses Read tool to access full memory file content when needed.
|
||||
|
||||
---
|
||||
|
||||
## 7. Skills System (`skills.py`)
|
||||
|
||||
### 7.1 Skill Definition
|
||||
|
||||
Markdown files with frontmatter:
|
||||
|
||||
```
|
||||
~/.nano_claude/skills/commit.md
|
||||
```
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: commit
|
||||
description: Create a git commit with conventional format
|
||||
triggers: ["/commit", "commit changes"]
|
||||
tools: [Bash, Read]
|
||||
---
|
||||
|
||||
# Commit Skill
|
||||
|
||||
Analyze staged changes and create a well-formatted commit message.
|
||||
...
|
||||
```
|
||||
|
||||
### 7.2 Search Path
|
||||
|
||||
```python
|
||||
SKILL_PATHS = [
|
||||
Path.cwd() / ".nano_claude" / "skills", # project-level (priority)
|
||||
Path.home() / ".nano_claude" / "skills", # user-level
|
||||
]
|
||||
```
|
||||
|
||||
### 7.3 Public API
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class SkillDef:
|
||||
name: str
|
||||
description: str
|
||||
triggers: list[str]
|
||||
tools: list[str]
|
||||
prompt: str
|
||||
file_path: str
|
||||
|
||||
def load_skills() -> list[SkillDef]
|
||||
def find_skill(query: str) -> SkillDef | None
|
||||
def execute_skill(skill, args, state, config) -> Generator
|
||||
```
|
||||
|
||||
### 7.4 Execution Model
|
||||
|
||||
Skills are just prompts injected into the normal agent loop:
|
||||
|
||||
```python
def execute_skill(skill, args, state, config):
    prompt = f"[Skill: {skill.name}]\n\n{skill.prompt}"
    if args:
        prompt += f"\n\nUser context: {args}"
    system_prompt = build_system_prompt(config)
    for event in agent.run(prompt, state, config, system_prompt):
        yield event
```
|
||||
|
||||
### 7.5 REPL Integration

In `nano_claude.py`, unmatched `/` commands fall through to skill lookup:

```python
if user_input.startswith("/"):
    # Try built-in slash commands first
    # If no match -> find_skill(user_input)
    # If skill found -> execute_skill(...)
    ...
```

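The fall-through order can be sketched as a small dispatcher. Everything here is hypothetical scaffolding for illustration: the function `dispatch_slash`, the `builtins` mapping of command name to handler, and passing `find_skill` in as a callable are assumptions, not the structure of `nano_claude.py`.

```python
from typing import Any, Callable, Optional


def dispatch_slash(
    user_input: str,
    builtins: dict[str, Callable[..., Any]],
    find_skill: Callable[[str], Optional[Any]],
) -> tuple[str, Any]:
    """Classify a slash command as ("builtin", handler), ("skill", skill),
    or ("unknown", None), checking built-ins before skills."""
    command, _, _args = user_input.strip().partition(" ")
    if command in builtins:
        return "builtin", builtins[command]
    skill = find_skill(user_input)
    if skill is not None:
        return "skill", skill
    return "unknown", None
```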
---
## 8. Diff View for File Modifications

Core UX improvement: show a git-style red/green diff whenever Edit or Write modifies an existing file.

### 8.1 Diff Generation (in tools.py)

The Edit and Write tool implementations capture the file content before and after the change and generate a unified diff:

```python
import difflib


def generate_unified_diff(old, new, filename, context_lines=3):
    """
    Args:
        old: original file content, str
        new: modified file content, str
        filename: display name, str
        context_lines: lines of context around changes, int
    Returns:
        unified diff string
    """
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines,
        fromfile=f"a/{filename}", tofile=f"b/{filename}",
        n=context_lines
    )
    # A final line without a trailing newline is emitted without "\n";
    # normalize so adjacent diff lines never run together.
    return "".join(line if line.endswith("\n") else line + "\n" for line in diff)
```

Tool return values change:

- **Edit**: `"Changes applied to {filename}:\n\n{diff}"`
- **Write** (existing file): `"File updated:\n\n{diff}"`
- **Write** (new file): `"New file created: {filename} ({n} lines)"` (no diff)

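As an illustration of the two Write cases listed above, a tool body might look like the sketch below. The function name `write_file` and its signature are assumptions; only the return-string formats are taken from the list. The example inlines the `difflib.unified_diff` call so that it runs on its own.

```python
import difflib
from pathlib import Path


def write_file(path: str, content: str) -> str:
    """Write content to path and report the change in the documented format."""
    target = Path(path)
    if not target.exists():
        # New file: nothing to compare against, so no diff is produced
        target.write_text(content, encoding="utf-8")
        n = len(content.splitlines())
        return f"New file created: {target.name} ({n} lines)"
    # Existing file: capture the old content before overwriting
    old = target.read_text(encoding="utf-8")
    target.write_text(content, encoding="utf-8")
    diff = "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        content.splitlines(keepends=True),
        fromfile=f"a/{target.name}",
        tofile=f"b/{target.name}",
        n=3,
    ))
    return f"File updated:\n\n{diff}"
```

Reading the old content before the write is the key ordering constraint: once the file is overwritten, the "before" side of the diff is gone.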
### 8.2 REPL Rendering (in nano_claude.py)

Detect diff blocks in tool output and render with ANSI colors:

```python
def render_diff(diff_text):
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            print(f"\033[1m{line}\033[0m")   # bold
        elif line.startswith("+"):
            print(f"\033[32m{line}\033[0m")  # green
        elif line.startswith("-"):
            print(f"\033[31m{line}\033[0m")  # red
        elif line.startswith("@@"):
            print(f"\033[36m{line}\033[0m")  # cyan
        else:
            print(line)
```

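The detection step is not shown above. One possible approach, given purely as a sketch, is to look for the `--- a/` and `+++ b/` header pair that the diff generator emits and split the tool output there. The helper name `split_tool_output` is hypothetical.

```python
def split_tool_output(output: str) -> tuple[str, str]:
    """Split tool output into (message, diff); diff is "" when none is found."""
    lines = output.splitlines(keepends=True)
    for i in range(len(lines) - 1):
        # A unified diff starts with a "--- a/..." line followed by "+++ b/..."
        if lines[i].startswith("--- a/") and lines[i + 1].startswith("+++ b/"):
            message = "".join(lines[:i]).rstrip("\n")
            return message, "".join(lines[i:])
    return output, ""
```

The message part can then be printed as plain text and the diff part passed to the color renderer.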
### 8.3 Diff Truncation

For large diffs (e.g., when Write replaces an entire file), cap the diff display:

```python
MAX_DIFF_LINES = 80


def maybe_truncate_diff(diff_text):
    lines = diff_text.splitlines()
    if len(lines) > MAX_DIFF_LINES:
        shown = lines[:MAX_DIFF_LINES]
        remaining = len(lines) - MAX_DIFF_LINES
        return "\n".join(shown) + f"\n\n[... {remaining} more lines ...]"
    return diff_text
```

Note: truncation applies only to the **display** in the REPL. The full diff is still returned to the model so it can verify the change.

---

## 9. Implementation Order

Each step is an independent PR:

| Phase | Module | Depends On | Estimated Lines |
|-------|--------|------------|-----------------|
| 1 | `tool_registry.py` + `tools.py` refactor | None | ~600 |
| 2 | Diff view in `tools.py` + `nano_claude.py` | Phase 1 | ~100 |
| 3 | `compaction.py` + `agent.py` integration | Phase 1 | ~300 |
| 4 | `memory.py` + `context.py` integration | Phase 1 | ~200 |
| 5 | `subagent.py` + `agent.py` integration (threading) | Phase 1 | ~350 |
| 6 | `skills.py` + `nano_claude.py` integration | Phase 1, 4 | ~200 |
| 7 | Slash commands + config updates | All above | ~300 |

**Total new code: ~2050 lines. Grand total: ~4.2K lines.**

---

## 10. Key Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Compression layers | 2 (autoCompact + snip) | Layer 3 is experimental in Claude Code |
| Tool output truncation | Hard cap at `execute_tool` boundary | Prevents oversized outputs before compaction runs |
| Sub-agent execution | Threading from day 1 | Sync blocks main agent, can't cancel, can't parallelize |
| Sub-agent depth | Depth counter (max 3), no tool removal | Model sees error and adapts; sub-sub-agents allowed |
| Sub-agent tools | Agent + CheckAgentResult + ListAgentTasks | Model needs feedback loop for async tasks |
| Diff view | `difflib` unified diff + ANSI colors | Core UX, zero dependencies |
| Memory search | Keyword match, no embeddings | Keep simple, model judges relevance |
| Skills format | Markdown + frontmatter | Human-readable, git-friendly, no Python needed |
| Tool registry | Global dict + register function | Simple, extensible, easy to migrate to package |
| Target models | GPT-5.4, Gemini 3/3.1 Pro | User's primary use case |
| No Claude support | Intentional | Official Claude Code exists |

---

## 11. Future Considerations (Not in Scope)

- MCP protocol support
- Remote skill marketplace
- Voice mode
- Bridge to desktop apps
- contextCollapse (Layer 3 compression)