# Nano Claude Code: A Minimal Python Reimplementation

The newest source of Claude Code

## 🔥🔥🔥 News (Pacific Time)

- 01:47 PM, Apr 01, 2026: Support vLLM inference (**~2000** lines of Python code)
- 11:30 AM, Apr 01, 2026: Support more **closed-source** and **open-source** models: Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint (**~1700** lines of Python code)
- 09:50 AM, Apr 01, 2026: Support more **closed-source** models: Claude, GPT, Gemini (**~1300** lines of Python code)
- 08:23 AM, Apr 01, 2026: Release the initial version of Nano Claude Code (**~900** lines of Python code)

![demo](demo.gif)

A minimal Python implementation of Claude Code in ~900 lines (initial version), **supporting Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint.**

---

## Content

* [Features](#features)
* [Supported Models](#supported-models)
  + [Closed-Source (API)](#closed-source--api-)
  + [Open-Source (Local via Ollama)](#open-source--local-via-ollama-)
* [Installation](#installation)
* [Usage: Closed-Source API Models](#usage--closed-source-api-models)
  + [Anthropic Claude](#anthropic-claude)
  + [OpenAI GPT](#openai-gpt)
  + [Google Gemini](#google-gemini)
  + [Kimi (Moonshot AI)](#kimi--moonshot-ai-)
  + [Qwen (Alibaba DashScope)](#qwen--alibaba-dashscope-)
  + [Zhipu GLM](#zhipu-glm)
  + [DeepSeek](#deepseek)
* [Usage: Open-Source Models (Local)](#usage--open-source-models--local-)
  + [Option A — Ollama (Recommended)](#option-a---ollama--recommended-)
  + [Option B — LM Studio](#option-b---lm-studio)
  + [Option C — vLLM / Self-Hosted OpenAI-Compatible Server](#option-c---vllm---self-hosted-openai-compatible-server)
* [Model Name Format](#model-name-format)
* [CLI Reference](#cli-reference)
* [Slash Commands (REPL)](#slash-commands--repl-)
* [Configuring API Keys](#configuring-api-keys)
  + [Method 1: Environment Variables (recommended)](#method-1--environment-variables--recommended-)
  + [Method 2: Set Inside the REPL (persisted)](#method-2--set-inside-the-repl--persisted-)
  + [Method 3: Edit the Config File Directly](#method-3--edit-the-config-file-directly)
* [Permission System](#permission-system)
* [Built-in Tools](#built-in-tools)
* [CLAUDE.md Support](#claudemd-support)
* [Session Management](#session-management)
* [Project Structure](#project-structure)
* [FAQ](#faq)

## Features

| Feature | Details |
|---|---|
| Multi-provider | Anthropic · OpenAI · Gemini · Kimi · Qwen · Zhipu · DeepSeek · Ollama · LM Studio · Custom endpoint |
| Interactive REPL | readline history, Tab-completion for slash commands |
| Agent loop | Streaming API + automatic tool-use loop |
| 8 built-in tools | Read · Write · Edit · Bash · Glob · Grep · WebFetch · WebSearch |
| Permission system | `auto` / `accept-all` / `manual` modes |
| 14 slash commands | `/model` · `/config` · `/save` · `/cost` · … |
| Context injection | Auto-loads `CLAUDE.md`, git status, cwd |
| Session persistence | Save / load conversations to `~/.nano_claude/sessions/` |
| Extended Thinking | Toggle on/off (Claude models only) |
| Cost tracking | Token usage + estimated USD cost |
| Non-interactive mode | `--print` flag for scripting / CI |

---

## Supported Models

### Closed-Source (API)

| Provider | Model | Context | Strengths | API Key Env |
|---|---|---|---|---|
| **Anthropic** | `claude-opus-4-6` | 200k | Most capable, best for complex reasoning | `ANTHROPIC_API_KEY` |
| **Anthropic** | `claude-sonnet-4-6` | 200k | Balanced speed & quality | `ANTHROPIC_API_KEY` |
| **Anthropic** | `claude-haiku-4-5-20251001` | 200k | Fast, cost-efficient | `ANTHROPIC_API_KEY` |
| **OpenAI** | `gpt-4o` | 128k | Strong multimodal & coding | `OPENAI_API_KEY` |
| **OpenAI** | `gpt-4o-mini` | 128k | Fast, cheap | `OPENAI_API_KEY` |
| **OpenAI** | `o3-mini` | 200k | Strong reasoning | `OPENAI_API_KEY` |
| **OpenAI** | `o1` | 200k | Advanced reasoning | `OPENAI_API_KEY` |
| **Google** | `gemini-2.5-pro-preview-03-25` | 1M | Long context, multimodal | `GEMINI_API_KEY` |
| **Google** | `gemini-2.0-flash` | 1M | Fast, large context | `GEMINI_API_KEY` |
| **Google** | `gemini-1.5-pro` | 2M | Largest context window | `GEMINI_API_KEY` |
| **Moonshot (Kimi)** | `moonshot-v1-8k` | 8k | Chinese & English | `MOONSHOT_API_KEY` |
| **Moonshot (Kimi)** | `moonshot-v1-32k` | 32k | Chinese & English | `MOONSHOT_API_KEY` |
| **Moonshot (Kimi)** | `moonshot-v1-128k` | 128k | Long context | `MOONSHOT_API_KEY` |
| **Alibaba (Qwen)** | `qwen-max` | 32k | Best Qwen quality | `DASHSCOPE_API_KEY` |
| **Alibaba (Qwen)** | `qwen-plus` | 128k | Balanced | `DASHSCOPE_API_KEY` |
| **Alibaba (Qwen)** | `qwen-turbo` | 1M | Fast, cheap | `DASHSCOPE_API_KEY` |
| **Alibaba (Qwen)** | `qwq-32b` | 32k | Strong reasoning | `DASHSCOPE_API_KEY` |
| **Zhipu (GLM)** | `glm-4-plus` | 128k | Best GLM quality | `ZHIPU_API_KEY` |
| **Zhipu (GLM)** | `glm-4` | 128k | General purpose | `ZHIPU_API_KEY` |
| **Zhipu (GLM)** | `glm-4-flash` | 128k | Free tier available | `ZHIPU_API_KEY` |
| **DeepSeek** | `deepseek-chat` | 64k | Strong coding | `DEEPSEEK_API_KEY` |
| **DeepSeek** | `deepseek-reasoner` | 64k | Chain-of-thought reasoning | `DEEPSEEK_API_KEY` |

### Open-Source (Local via Ollama)

| Model | Size | Strengths | Pull Command |
|---|---|---|---|
| `llama3.3` | 70B | General purpose, strong reasoning | `ollama pull llama3.3` |
| `llama3.2` | 3B / 11B | Lightweight | `ollama pull llama3.2` |
| `qwen2.5-coder` | 7B / 32B | **Best for coding tasks** | `ollama pull qwen2.5-coder` |
| `qwen2.5` | 7B / 72B | Chinese & English | `ollama pull qwen2.5` |
| `deepseek-r1` | 7B–70B | Reasoning, math | `ollama pull deepseek-r1` |
| `deepseek-coder-v2` | 16B | Coding | `ollama pull deepseek-coder-v2` |
| `mistral` | 7B | Fast, efficient | `ollama pull mistral` |
| `mixtral` | 8x7B | Strong MoE model | `ollama pull mixtral` |
| `phi4` | 14B | Microsoft, strong reasoning | `ollama pull phi4` |
| `gemma3` | 4B / 12B / 27B | Google open model | `ollama pull gemma3` |
| `codellama` | 7B / 34B | Code generation | `ollama pull codellama` |

> **Note:** Tool calling requires a model that supports function calling. Recommended local models: `qwen2.5-coder`, `llama3.3`, `mistral`, `phi4`.

---

## Installation

```bash
git clone <repo-url>
cd nano_claude_code
pip install -r requirements.txt
# or manually:
pip install anthropic openai httpx rich
```

---

## Usage: Closed-Source API Models

### Anthropic Claude

Get your API key at [console.anthropic.com](https://console.anthropic.com).

```bash
export ANTHROPIC_API_KEY=sk-ant-api03-...

# Default model (claude-opus-4-6)
python nano_claude.py

# Choose a specific model
python nano_claude.py --model claude-sonnet-4-6
python nano_claude.py --model claude-haiku-4-5-20251001

# Enable Extended Thinking
python nano_claude.py --model claude-opus-4-6 --thinking --verbose
```

### OpenAI GPT

Get your API key at [platform.openai.com](https://platform.openai.com).

```bash
export OPENAI_API_KEY=sk-...

python nano_claude.py --model gpt-4o
python nano_claude.py --model gpt-4o-mini
python nano_claude.py --model o3-mini
```

### Google Gemini

Get your API key at [aistudio.google.com](https://aistudio.google.com).

```bash
export GEMINI_API_KEY=AIza...

python nano_claude.py --model gemini/gemini-2.0-flash
python nano_claude.py --model gemini/gemini-1.5-pro
python nano_claude.py --model gemini/gemini-2.5-pro-preview-03-25
```

### Kimi (Moonshot AI)

Get your API key at [platform.moonshot.cn](https://platform.moonshot.cn).

```bash
export MOONSHOT_API_KEY=sk-...

python nano_claude.py --model kimi/moonshot-v1-32k
python nano_claude.py --model kimi/moonshot-v1-128k
```

### Qwen (Alibaba DashScope)

Get your API key at [dashscope.aliyun.com](https://dashscope.aliyun.com).

```bash
export DASHSCOPE_API_KEY=sk-...

python nano_claude.py --model qwen/qwen-max
python nano_claude.py --model qwen/qwq-32b
python nano_claude.py --model qwen/qwen2.5-coder-32b-instruct
```

### Zhipu GLM

Get your API key at [open.bigmodel.cn](https://open.bigmodel.cn).

```bash
export ZHIPU_API_KEY=...

python nano_claude.py --model zhipu/glm-4-plus
python nano_claude.py --model zhipu/glm-4-flash   # free tier
```

### DeepSeek

Get your API key at [platform.deepseek.com](https://platform.deepseek.com).

```bash
export DEEPSEEK_API_KEY=sk-...

python nano_claude.py --model deepseek/deepseek-chat
python nano_claude.py --model deepseek/deepseek-reasoner
```

---

## Usage: Open-Source Models (Local)

### Option A — Ollama (Recommended)

Ollama runs models locally with zero configuration. No API key required.

**Step 1: Install Ollama**

```bash
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh
# Or download from https://ollama.com/download
```

**Step 2: Pull a model**

```bash
# Best for coding (recommended)
ollama pull qwen2.5-coder       # 4.7 GB (7B)
ollama pull qwen2.5-coder:32b   # 19 GB (32B)

# General purpose
ollama pull llama3.3            # 42 GB (70B)
ollama pull llama3.2            # 2.0 GB (3B)

# Reasoning
ollama pull deepseek-r1         # 4.7 GB (7B)
ollama pull deepseek-r1:32b     # 19 GB (32B)

# Other
ollama pull phi4                # 9.1 GB (14B)
ollama pull mistral             # 4.1 GB (7B)
```

**Step 3: Start the Ollama server** (runs automatically on macOS; on Linux, run it manually)

```bash
ollama serve   # starts on http://localhost:11434
```

**Step 4: Run Nano Claude**

```bash
python nano_claude.py --model ollama/qwen2.5-coder
python nano_claude.py --model ollama/llama3.3
python nano_claude.py --model ollama/deepseek-r1
```

**List your locally available models:**

```bash
ollama list
```

Then use any model from the list:

```bash
python nano_claude.py --model ollama/<model-name>
```

---

### Option B — LM Studio

LM Studio provides a GUI to download and run models, with a built-in OpenAI-compatible server.

**Step 1:** Download [LM Studio](https://lmstudio.ai) and install it.

**Step 2:** Search for and download a model inside LM Studio (GGUF format).

**Step 3:** Go to the **Local Server** tab → click **Start Server** (default port: 1234).

**Step 4:**

```bash
python nano_claude.py --model lmstudio/<model-name>

# e.g.:
python nano_claude.py --model lmstudio/phi-4-GGUF
python nano_claude.py --model lmstudio/qwen2.5-coder-7b
```

The model name should match what LM Studio shows in the server status bar.

---

### Option C — vLLM / Self-Hosted OpenAI-Compatible Server

For self-hosted inference servers (vLLM, TGI, llama.cpp server, etc.) that expose an OpenAI-compatible API.

**Quick start:**

Step 1: Start vLLM:

```bash
CUDA_VISIBLE_DEVICES=7 python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-Coder-7B-Instruct \
  --host 0.0.0.0 \
  --port 8000 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

Step 2: Start Nano Claude:

```bash
export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=none
python nano_claude.py --model custom/Qwen/Qwen2.5-Coder-7B-Instruct
```

Alternatively, start the server first and configure the endpoint from inside the REPL:

```bash
# Example: vLLM serving Qwen2.5-Coder-32B
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-Coder-32B-Instruct \
  --port 8000

# Then run Nano Claude pointing to your server:
python nano_claude.py
```

Inside the REPL:

```
/config custom_base_url=http://localhost:8000/v1
/config custom_api_key=token-abc123   # skip if no auth
/model custom/Qwen2.5-Coder-32B-Instruct
```

Or set via environment:

```bash
export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=token-abc123
python nano_claude.py --model custom/Qwen2.5-Coder-32B-Instruct
```

For a remote GPU server:

```
/config custom_base_url=http://192.168.1.100:8000/v1
/model custom/your-model-name
```

---

## Model Name Format

Three equivalent formats are supported:

```bash
# 1. Auto-detect by prefix (works for well-known models)
python nano_claude.py --model gpt-4o
python nano_claude.py --model gemini-2.0-flash
python nano_claude.py --model deepseek-chat

# 2. Explicit provider prefix with slash
python nano_claude.py --model ollama/qwen2.5-coder
python nano_claude.py --model kimi/moonshot-v1-128k

# 3. Explicit provider prefix with colon (also works)
python nano_claude.py --model kimi:moonshot-v1-32k
python nano_claude.py --model qwen:qwen-max
```

**Auto-detection rules:**

| Model prefix | Detected provider |
|---|---|
| `claude-` | anthropic |
| `gpt-`, `o1`, `o3` | openai |
| `gemini-` | gemini |
| `moonshot-`, `kimi-` | kimi |
| `qwen`, `qwq-` | qwen |
| `glm-` | zhipu |
| `deepseek-` | deepseek |
| `llama`, `mistral`, `phi`, `gemma`, `mixtral`, `codellama` | ollama |

---

## CLI Reference

```
python nano_claude.py [OPTIONS] [PROMPT]

Options:
  -p, --print         Non-interactive: run prompt and exit
  -m, --model MODEL   Override model (e.g. gpt-4o, ollama/llama3.3)
  --accept-all        Auto-approve all operations (no permission prompts)
  --verbose           Show thinking blocks and per-turn token counts
  --thinking          Enable Extended Thinking (Claude only)
  --version           Print version and exit
  -h, --help          Show help
```

**Examples:**

```bash
# Interactive REPL with default model
python nano_claude.py

# Switch model at startup
python nano_claude.py --model gpt-4o
python nano_claude.py -m ollama/deepseek-r1:32b

# Non-interactive / scripting
python nano_claude.py --print "Write a Python fibonacci function"
python nano_claude.py -p "Explain the Rust borrow checker in 3 sentences" -m gemini/gemini-2.0-flash

# CI / automation (no permission prompts)
python nano_claude.py --accept-all --print "Initialize a Python project with pyproject.toml"

# Debug mode (see tokens + thinking)
python nano_claude.py --thinking --verbose
```

---

## Slash Commands (REPL)

Type `/` and press **Tab** to autocomplete.

| Command | Description |
|---|---|
| `/help` | Show all commands |
| `/clear` | Clear conversation history |
| `/model` | Show current model + list all available models |
| `/model <name>` | Switch model (takes effect immediately) |
| `/config` | Show all current config values |
| `/config key=value` | Set a config value (persisted to disk) |
| `/save` | Save session (auto-named by timestamp) |
| `/save <name>` | Save session to a named file |
| `/load` | List all saved sessions |
| `/load <name>` | Load a saved session |
| `/history` | Print full conversation history |
| `/context` | Show message count and token estimate |
| `/cost` | Show token usage and estimated USD cost |
| `/verbose` | Toggle verbose mode (tokens + thinking) |
| `/thinking` | Toggle Extended Thinking (Claude only) |
| `/permissions` | Show current permission mode |
| `/permissions <mode>` | Set permission mode: `auto` / `accept-all` / `manual` |
| `/cwd` | Show current working directory |
| `/cwd <path>` | Change working directory |
| `/exit` / `/quit` | Exit |

**Switching models inside a session:**

```
[myproject] ❯ /model
Current model: claude-opus-4-6 (provider: anthropic)

Available models by provider:
  anthropic  claude-opus-4-6, claude-sonnet-4-6, ...
  openai     gpt-4o, gpt-4o-mini, o3-mini, ...
  ollama     llama3.3, llama3.2, phi4, mistral, ...
  ...

[myproject] ❯ /model gpt-4o
Model set to gpt-4o (provider: openai)

[myproject] ❯ /model ollama/qwen2.5-coder
Model set to ollama/qwen2.5-coder (provider: ollama)
```

---

## Configuring API Keys

### Method 1: Environment Variables (recommended)

```bash
# Add to ~/.bashrc or ~/.zshrc
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=AIza...
export MOONSHOT_API_KEY=sk-...   # Kimi
export DASHSCOPE_API_KEY=sk-...  # Qwen
export ZHIPU_API_KEY=...         # Zhipu GLM
export DEEPSEEK_API_KEY=sk-...   # DeepSeek
```

### Method 2: Set Inside the REPL (persisted)

```
/config anthropic_api_key=sk-ant-...
/config openai_api_key=sk-...
/config gemini_api_key=AIza...
/config kimi_api_key=sk-...
/config qwen_api_key=sk-...
/config zhipu_api_key=...
/config deepseek_api_key=sk-...
```

Keys are saved to `~/.nano_claude/config.json` and loaded automatically on the next launch.

### Method 3: Edit the Config File Directly

```json
// ~/.nano_claude/config.json
{
  "model": "qwen/qwen-max",
  "max_tokens": 8192,
  "permission_mode": "auto",
  "verbose": false,
  "thinking": false,
  "qwen_api_key": "sk-...",
  "kimi_api_key": "sk-...",
  "deepseek_api_key": "sk-..."
}
```

---

## Permission System

| Mode | Behavior |
|---|---|
| `auto` (default) | Read-only operations are always allowed. Prompts before Bash commands and file writes. |
| `accept-all` | Never prompts. All operations proceed automatically. |
| `manual` | Prompts before every single operation, including reads. |

**When prompted:**

```
Allow: Run: git commit -am "fix bug" [y/N/a(ccept-all)]
```

- `y` — approve this one action
- `n` or Enter — deny
- `a` — approve and switch to `accept-all` for the rest of the session

**Commands always auto-approved in `auto` mode:** `ls`, `cat`, `head`, `tail`, `wc`, `pwd`, `echo`, `git status`, `git log`, `git diff`, `git show`, `find`, `grep`, `rg`, `python`, `node`, `pip show`, `npm list`, and other read-only shell commands.

---

## Built-in Tools

| Tool | Description | Key Parameters |
|---|---|---|
| `Read` | Read a file with line numbers | `file_path`, `limit`, `offset` |
| `Write` | Create or overwrite a file | `file_path`, `content` |
| `Edit` | Exact string replacement in a file | `file_path`, `old_string`, `new_string`, `replace_all` |
| `Bash` | Execute a shell command | `command`, `timeout` (default 30s) |
| `Glob` | Find files by glob pattern | `pattern` (e.g. `**/*.py`), `path` |
| `Grep` | Regex search in files (uses ripgrep if available) | `pattern`, `path`, `glob`, `output_mode` |
| `WebFetch` | Fetch and extract text from a URL | `url`, `prompt` |
| `WebSearch` | Search the web via DuckDuckGo | `query` |

---

## CLAUDE.md Support

Place a `CLAUDE.md` file in your project to give the model persistent context about your codebase. Nano Claude automatically finds it and injects it into the system prompt.

```
~/.claude/CLAUDE.md       # Global — applies to all projects
/your/project/CLAUDE.md   # Project-level — found by walking up from cwd
```

**Example `CLAUDE.md`:**

```markdown
# Project: FastAPI Backend

## Stack
- Python 3.12, FastAPI, PostgreSQL, SQLAlchemy 2.0, Alembic
- Tests: pytest, coverage target 90%

## Conventions
- Format with black, lint with ruff
- Full type annotations required
- New endpoints must have corresponding tests

## Important Notes
- Never hard-code credentials — use environment variables
- Do not modify existing Alembic migration files
- The `staging` branch deploys automatically to staging on push
```

---

## Session Management

```bash
# Inside the REPL:
/save                                # auto-name: session_20260401_143022.json
/save debug_auth_bug                 # named save
/load                                # list all saved sessions
/load debug_auth_bug                 # resume a session
/load session_20260401_143022.json
```

Sessions are stored as JSON in `~/.nano_claude/sessions/`.
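
Because sessions are plain JSON files in a fixed directory, you can also script against them outside the REPL. A minimal sketch (the directory path comes from this README; the helper name and newest-first ordering are my own choices, not part of the codebase):

```python
from pathlib import Path

# Default location documented above
SESSIONS_DIR = Path.home() / ".nano_claude" / "sessions"

def list_sessions(directory: Path = SESSIONS_DIR) -> list[str]:
    """Return saved session names, newest first ([] if nothing saved yet)."""
    if not directory.is_dir():
        return []
    files = sorted(directory.glob("*.json"),
                   key=lambda p: p.stat().st_mtime,
                   reverse=True)
    return [p.stem for p in files]
```

Any name this returns should be usable with `/load <name>` inside the REPL.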

---

## Project Structure

```
nano_claude_code/
├── nano_claude.py     # Entry point: REPL + slash commands + output rendering (~580 lines)
├── agent.py           # Agent loop: neutral message format + tool dispatch (~160 lines)
├── providers.py       # Multi-provider: adapters + message format conversion (~480 lines)
├── tools.py           # 8 tool implementations + JSON schemas (~360 lines)
├── context.py         # System prompt builder: CLAUDE.md + git + cwd (~100 lines)
├── config.py          # Config load/save/defaults (~70 lines)
├── demo.py            # Demo script (requires API key)
├── make_demo.py       # Generates demo.gif and screenshot.png
├── demo.gif           # Animated demo
├── screenshot.png     # Static screenshot
└── requirements.txt
```

---

## FAQ

**Q: Tool calls don't work with my local Ollama model.**

Not all models support function calling. Use one of the recommended tool-calling models: `qwen2.5-coder`, `llama3.3`, `mistral`, or `phi4`.

```bash
ollama pull qwen2.5-coder
python nano_claude.py --model ollama/qwen2.5-coder
```

**Q: How do I connect to a remote GPU server running vLLM?**

```
/config custom_base_url=http://your-server-ip:8000/v1
/config custom_api_key=your-token
/model custom/your-model-name
```

**Q: How do I check my API cost?**

```
/cost
Input tokens:  3,421
Output tokens: 892
Est. cost:     $0.0648 USD
```

**Q: Can I use multiple API keys in the same session?**

Yes. Set all the keys you need upfront (via env vars or `/config`), then switch models freely — each call uses the key for the active provider.

**Q: How do I make a model available across all projects?**

Add keys to `~/.bashrc` or `~/.zshrc`, and set the default model in `~/.nano_claude/config.json`:

```json
{ "model": "claude-sonnet-4-6" }
```

**Q: Qwen / Zhipu returns garbled text.**

Ensure your `DASHSCOPE_API_KEY` / `ZHIPU_API_KEY` is correct and the account has sufficient quota. Both providers use UTF-8 and handle Chinese well.

**Q: Can I pipe input to Nano Claude?**

```bash
echo "Explain this file" | python nano_claude.py --print --accept-all
cat error.log | python nano_claude.py -p "What is causing this error?"
```

**Q: How do I run it as a CLI tool from anywhere?**

```bash
# Add an alias to ~/.bashrc or ~/.zshrc
alias nc='python /path/to/nano_claude_code/nano_claude.py'

# Or install as a script
pip install -e .   # if setup.py exists
```
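
For the editable-install route, a minimal `pyproject.toml` along these lines could work. This is a sketch, not a file that ships with the repo: the package name, version, and the assumption that `nano_claude.py` exposes a `main()` function are all hypothetical; the dependency and module lists are taken from the Installation and Project Structure sections above.

```toml
[project]
name = "nano-claude-code"           # assumed name
version = "0.1.0"                   # assumed version
dependencies = ["anthropic", "openai", "httpx", "rich"]

[project.scripts]
# Assumes nano_claude.py defines a main() entry function
nano-claude = "nano_claude:main"

[tool.setuptools]
# Flat layout: list the top-level modules explicitly
py-modules = ["nano_claude", "agent", "providers", "tools", "context", "config"]

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
```

With this in place, `pip install -e .` would put a `nano-claude` command on your PATH, so you could run `nano-claude --model gpt-4o` from any directory.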