## 🔥🔥🔥 News (Pacific Time)
- 01:47 PM, Apr 01, 2026: Support vLLM inference (**~2000** lines of Python code)
- 11:30 AM, Apr 01, 2026: Support more **closed-source** and **open-source** models: Claude, GPT, Gemini, Kimi, Qwen, Zhipu, and DeepSeek, plus local open-source models via Ollama or any OpenAI-compatible endpoint. (**~1700** lines of Python code)
- 09:50 AM, Apr 01, 2026: Support more **closed-source** models: Claude, GPT, Gemini. (**~1300** lines of Python code)
- 08:23 AM, Apr 01, 2026: Release the initial version of Nano Claude Code (**~900** lines of Python code)
# Nano Claude Code

A minimal Python implementation of Claude Code in ~900 lines (initial version), **supporting Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint.**

---
## Contents
* [Features](#features)
* [Supported Models](#supported-models)
  + [Closed-Source (API)](#closed-source-api)
  + [Open-Source (Local via Ollama)](#open-source-local-via-ollama)
* [Installation](#installation)
* [Usage: Closed-Source API Models](#usage-closed-source-api-models)
  + [Anthropic Claude](#anthropic-claude)
  + [OpenAI GPT](#openai-gpt)
  + [Google Gemini](#google-gemini)
  + [Kimi (Moonshot AI)](#kimi-moonshot-ai)
  + [Qwen (Alibaba DashScope)](#qwen-alibaba-dashscope)
  + [Zhipu GLM](#zhipu-glm)
  + [DeepSeek](#deepseek)
* [Usage: Open-Source Models (Local)](#usage-open-source-models-local)
  + [Option A – Ollama (Recommended)](#option-a--ollama-recommended)
  + [Option B – LM Studio](#option-b--lm-studio)
  + [Option C – vLLM / Self-Hosted OpenAI-Compatible Server](#option-c--vllm--self-hosted-openai-compatible-server)
* [Model Name Format](#model-name-format)
* [CLI Reference](#cli-reference)
* [Slash Commands (REPL)](#slash-commands-repl)
* [Configuring API Keys](#configuring-api-keys)
  + [Method 1: Environment Variables (recommended)](#method-1-environment-variables-recommended)
  + [Method 2: Set Inside the REPL (persisted)](#method-2-set-inside-the-repl-persisted)
  + [Method 3: Edit the Config File Directly](#method-3-edit-the-config-file-directly)
* [Permission System](#permission-system)
* [Built-in Tools](#built-in-tools)
* [CLAUDE.md Support](#claudemd-support)
* [Session Management](#session-management)
* [Project Structure](#project-structure)
* [FAQ](#faq)
## Features
| Feature | Details |
|---|---|
| Multi-provider | Anthropic · OpenAI · Gemini · Kimi · Qwen · Zhipu · DeepSeek · Ollama · LM Studio · Custom endpoint |
| Interactive REPL | readline history, Tab-complete slash commands |
| Agent loop | Streaming API + automatic tool-use loop |
| 8 built-in tools | Read · Write · Edit · Bash · Glob · Grep · WebFetch · WebSearch |
| Permission system | `auto` / `accept-all` / `manual` modes |
| 14 slash commands | `/model` · `/config` · `/save` · `/cost` · … |
| Context injection | Auto-loads `CLAUDE.md`, git status, cwd |
| Session persistence | Save / load conversations to `~/.nano_claude/sessions/` |
| Extended Thinking | Toggle on/off (Claude models only) |
| Cost tracking | Token usage + estimated USD cost |
| Non-interactive mode | `--print` flag for scripting / CI |
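The agent loop above is the standard tool-use pattern: call the model, run any requested tools, feed the results back, and repeat until the model stops asking. A minimal sketch of that loop with the `anthropic` SDK, where `execute_tool` is a hypothetical stand-in for the dispatcher in `agent.py`:
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_agent(messages: list, tools: list) -> list:
    """Call the model, execute requested tools, and repeat until it stops asking."""
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=8192,
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return messages  # model produced a final answer
        # Run each requested tool and feed the results back as a user turn.
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": execute_tool(block.name, block.input),  # hypothetical dispatcher
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
```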
---
## Supported Models
### Closed-Source (API)
| Provider | Model | Context | Strengths | API Key Env |
|---|---|---|---|---|
| **Anthropic** | `claude-opus-4-6` | 200k | Most capable, best for complex reasoning | `ANTHROPIC_API_KEY` |
| **Anthropic** | `claude-sonnet-4-6` | 200k | Balanced speed & quality | `ANTHROPIC_API_KEY` |
| **Anthropic** | `claude-haiku-4-5-20251001` | 200k | Fast, cost-efficient | `ANTHROPIC_API_KEY` |
| **OpenAI** | `gpt-4o` | 128k | Strong multimodal & coding | `OPENAI_API_KEY` |
| **OpenAI** | `gpt-4o-mini` | 128k | Fast, cheap | `OPENAI_API_KEY` |
| **OpenAI** | `o3-mini` | 200k | Strong reasoning | `OPENAI_API_KEY` |
| **OpenAI** | `o1` | 200k | Advanced reasoning | `OPENAI_API_KEY` |
| **Google** | `gemini-2.5-pro-preview-03-25` | 1M | Long context, multimodal | `GEMINI_API_KEY` |
| **Google** | `gemini-2.0-flash` | 1M | Fast, large context | `GEMINI_API_KEY` |
| **Google** | `gemini-1.5-pro` | 2M | Largest context window | `GEMINI_API_KEY` |
| **Moonshot (Kimi)** | `moonshot-v1-8k` | 8k | Chinese & English | `MOONSHOT_API_KEY` |
| **Moonshot (Kimi)** | `moonshot-v1-32k` | 32k | Chinese & English | `MOONSHOT_API_KEY` |
| **Moonshot (Kimi)** | `moonshot-v1-128k` | 128k | Long context | `MOONSHOT_API_KEY` |
| **Alibaba (Qwen)** | `qwen-max` | 32k | Best Qwen quality | `DASHSCOPE_API_KEY` |
| **Alibaba (Qwen)** | `qwen-plus` | 128k | Balanced | `DASHSCOPE_API_KEY` |
| **Alibaba (Qwen)** | `qwen-turbo` | 1M | Fast, cheap | `DASHSCOPE_API_KEY` |
| **Alibaba (Qwen)** | `qwq-32b` | 32k | Strong reasoning | `DASHSCOPE_API_KEY` |
| **Zhipu (GLM)** | `glm-4-plus` | 128k | Best GLM quality | `ZHIPU_API_KEY` |
| **Zhipu (GLM)** | `glm-4` | 128k | General purpose | `ZHIPU_API_KEY` |
| **Zhipu (GLM)** | `glm-4-flash` | 128k | Free tier available | `ZHIPU_API_KEY` |
| **DeepSeek** | `deepseek-chat` | 64k | Strong coding | `DEEPSEEK_API_KEY` |
| **DeepSeek** | `deepseek-reasoner` | 64k | Chain-of-thought reasoning | `DEEPSEEK_API_KEY` |
### Open-Source (Local via Ollama)
| Model | Size | Strengths | Pull Command |
|---|---|---|---|
| `llama3.3` | 70B | General purpose, strong reasoning | `ollama pull llama3.3` |
| `llama3.2` | 3B / 11B | Lightweight | `ollama pull llama3.2` |
| `qwen2.5-coder` | 7B / 32B | **Best for coding tasks** | `ollama pull qwen2.5-coder` |
| `qwen2.5` | 7B / 72B | Chinese & English | `ollama pull qwen2.5` |
| `deepseek-r1` | 7Bβ70B | Reasoning, math | `ollama pull deepseek-r1` |
| `deepseek-coder-v2` | 16B | Coding | `ollama pull deepseek-coder-v2` |
| `mistral` | 7B | Fast, efficient | `ollama pull mistral` |
| `mixtral` | 8x7B | Strong MoE model | `ollama pull mixtral` |
| `phi4` | 14B | Microsoft, strong reasoning | `ollama pull phi4` |
| `gemma3` | 4B / 12B / 27B | Google open model | `ollama pull gemma3` |
| `codellama` | 7B / 34B | Code generation | `ollama pull codellama` |
> **Note:** Tool calling requires a model that supports function calling. Recommended local models: `qwen2.5-coder`, `llama3.3`, `mistral`, `phi4`.
---
## Installation
```bash
git clone <repository-url>
cd nano_claude_code
pip install -r requirements.txt
# or manually:
pip install anthropic openai httpx rich
```
---
## Usage: Closed-Source API Models
### Anthropic Claude
Get your API key at [console.anthropic.com](https://console.anthropic.com).
```bash
export ANTHROPIC_API_KEY=sk-ant-api03-...
# Default model (claude-opus-4-6)
python nano_claude.py
# Choose a specific model
python nano_claude.py --model claude-sonnet-4-6
python nano_claude.py --model claude-haiku-4-5-20251001
# Enable Extended Thinking
python nano_claude.py --model claude-opus-4-6 --thinking --verbose
```
### OpenAI GPT
Get your API key at [platform.openai.com](https://platform.openai.com).
```bash
export OPENAI_API_KEY=sk-...
python nano_claude.py --model gpt-4o
python nano_claude.py --model gpt-4o-mini
python nano_claude.py --model o3-mini
```
### Google Gemini
Get your API key at [aistudio.google.com](https://aistudio.google.com).
```bash
export GEMINI_API_KEY=AIza...
python nano_claude.py --model gemini/gemini-2.0-flash
python nano_claude.py --model gemini/gemini-1.5-pro
python nano_claude.py --model gemini/gemini-2.5-pro-preview-03-25
```
### Kimi (Moonshot AI)
Get your API key at [platform.moonshot.cn](https://platform.moonshot.cn).
```bash
export MOONSHOT_API_KEY=sk-...
python nano_claude.py --model kimi/moonshot-v1-32k
python nano_claude.py --model kimi/moonshot-v1-128k
```
### Qwen (Alibaba DashScope)
Get your API key at [dashscope.aliyun.com](https://dashscope.aliyun.com).
```bash
export DASHSCOPE_API_KEY=sk-...
python nano_claude.py --model qwen/qwen-max
python nano_claude.py --model qwen/qwq-32b
python nano_claude.py --model qwen/qwen2.5-coder-32b-instruct
```
### Zhipu GLM
Get your API key at [open.bigmodel.cn](https://open.bigmodel.cn).
```bash
export ZHIPU_API_KEY=...
python nano_claude.py --model zhipu/glm-4-plus
python nano_claude.py --model zhipu/glm-4-flash # free tier
```
### DeepSeek
Get your API key at [platform.deepseek.com](https://platform.deepseek.com).
```bash
export DEEPSEEK_API_KEY=sk-...
python nano_claude.py --model deepseek/deepseek-chat
python nano_claude.py --model deepseek/deepseek-reasoner
```
---
## Usage: Open-Source Models (Local)
### Option A – Ollama (Recommended)
Ollama runs models locally with zero configuration. No API key required.

**Step 1: Install Ollama**
```bash
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh
# Or download from https://ollama.com/download
```
**Step 2: Pull a model**
```bash
# Best for coding (recommended)
ollama pull qwen2.5-coder # 4.7 GB (7B)
ollama pull qwen2.5-coder:32b # 19 GB (32B)
# General purpose
ollama pull llama3.3 # 42 GB (70B)
ollama pull llama3.2 # 2.0 GB (3B)
# Reasoning
ollama pull deepseek-r1 # 4.7 GB (7B)
ollama pull deepseek-r1:32b # 19 GB (32B)
# Other
ollama pull phi4 # 9.1 GB (14B)
ollama pull mistral # 4.1 GB (7B)
```
**Step 3: Start Ollama server** (runs automatically on macOS; on Linux run manually)
```bash
ollama serve # starts on http://localhost:11434
```
**Step 4: Run nano claude**
```bash
python nano_claude.py --model ollama/qwen2.5-coder
python nano_claude.py --model ollama/llama3.3
python nano_claude.py --model ollama/deepseek-r1
```
**List your locally available models:**
```bash
ollama list
```
Then use any model from the list:
```bash
python nano_claude.py --model ollama/<model-name>
```
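To get the same list programmatically, Ollama also exposes a small REST API; a minimal sketch using `httpx` (already in the requirements), assuming the default local server:
```python
import httpx

# Ollama's REST API lists pulled models at /api/tags on the default server.
resp = httpx.get("http://localhost:11434/api/tags", timeout=5.0)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])  # e.g. "qwen2.5-coder:latest" -> use as ollama/qwen2.5-coder:latest
```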
---
### Option B – LM Studio
LM Studio provides a GUI to download and run models, with a built-in OpenAI-compatible server.

**Step 1:** Download and install [LM Studio](https://lmstudio.ai).

**Step 2:** Search for and download a model inside LM Studio (GGUF format).

**Step 3:** Go to the **Local Server** tab → click **Start Server** (default port: 1234).

**Step 4:** Run nano claude:
```bash
python nano_claude.py --model lmstudio/<model-name>
# e.g.:
python nano_claude.py --model lmstudio/phi-4-GGUF
python nano_claude.py --model lmstudio/qwen2.5-coder-7b
```
The model name should match what LM Studio shows in the server status bar.

---
### Option C – vLLM / Self-Hosted OpenAI-Compatible Server
For self-hosted inference servers (vLLM, TGI, llama.cpp server, etc.) that expose an OpenAI-compatible API:

**Quick start:**

**Step 1:** Start vLLM:
```bash
CUDA_VISIBLE_DEVICES=7 python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-Coder-7B-Instruct \
    --host 0.0.0.0 \
    --port 8000 \
    --enable-auto-tool-choice \
    --tool-call-parser hermes
```

**Step 2:** Start nano claude:
```bash
export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=none
python nano_claude.py --model custom/Qwen/Qwen2.5-Coder-7B-Instruct
```
```bash
# Example: vLLM serving Qwen2.5-Coder-32B
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-Coder-32B-Instruct \
    --port 8000
# Then run nano claude pointing to your server:
python nano_claude.py
```
Inside the REPL:
```
/config custom_base_url=http://localhost:8000/v1
/config custom_api_key=token-abc123 # skip if no auth
/model custom/Qwen2.5-Coder-32B-Instruct
```
Or set via environment:
```bash
export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=token-abc123
python nano_claude.py --model custom/Qwen2.5-Coder-32B-Instruct
```
For a remote GPU server:
```
/config custom_base_url=http://192.168.1.100:8000/v1
/model custom/your-model-name
```
---
## Model Name Format
Three equivalent formats are supported:
```bash
# 1. Auto-detect by prefix (works for well-known models)
python nano_claude.py --model gpt-4o
python nano_claude.py --model gemini-2.0-flash
python nano_claude.py --model deepseek-chat
# 2. Explicit provider prefix with slash
python nano_claude.py --model ollama/qwen2.5-coder
python nano_claude.py --model kimi/moonshot-v1-128k
# 3. Explicit provider prefix with colon (also works)
python nano_claude.py --model kimi:moonshot-v1-32k
python nano_claude.py --model qwen:qwen-max
```
**Auto-detection rules:**
| Model prefix | Detected provider |
|---|---|
| `claude-` | anthropic |
| `gpt-`, `o1`, `o3` | openai |
| `gemini-` | gemini |
| `moonshot-`, `kimi-` | kimi |
| `qwen`, `qwq-` | qwen |
| `glm-` | zhipu |
| `deepseek-` | deepseek |
| `llama`, `mistral`, `phi`, `gemma`, `mixtral`, `codellama` | ollama |
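A sketch of that resolution order: an explicit `provider/model` or `provider:model` prefix wins, otherwise the prefix table above decides (names are illustrative, not the exact ones in `providers.py`):
```python
KNOWN_PROVIDERS = {"anthropic", "openai", "gemini", "kimi", "qwen", "zhipu",
                   "deepseek", "ollama", "lmstudio", "custom"}
PREFIX_TO_PROVIDER = {
    "claude-": "anthropic", "gpt-": "openai", "o1": "openai", "o3": "openai",
    "gemini-": "gemini", "moonshot-": "kimi", "kimi-": "kimi",
    "qwen": "qwen", "qwq-": "qwen", "glm-": "zhipu", "deepseek-": "deepseek",
    "llama": "ollama", "mistral": "ollama", "phi": "ollama",
    "gemma": "ollama", "mixtral": "ollama", "codellama": "ollama",
}

def resolve(model: str) -> tuple[str, str]:
    """Return (provider, model_name) for any of the three accepted formats."""
    for sep in ("/", ":"):
        head, _, tail = model.partition(sep)
        if tail and head in KNOWN_PROVIDERS:
            return head, tail  # explicit "provider/model" or "provider:model"
    for prefix, provider in PREFIX_TO_PROVIDER.items():
        if model.startswith(prefix):
            return provider, model  # auto-detected by prefix
    raise ValueError(f"cannot infer provider for {model!r}")
```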
---
## CLI Reference
```
python nano_claude.py [OPTIONS] [PROMPT]

Options:
  -p, --print          Non-interactive: run prompt and exit
  -m, --model MODEL    Override model (e.g. gpt-4o, ollama/llama3.3)
  --accept-all         Auto-approve all operations (no permission prompts)
  --verbose            Show thinking blocks and per-turn token counts
  --thinking           Enable Extended Thinking (Claude only)
  --version            Print version and exit
  -h, --help           Show help
```
**Examples:**
```bash
# Interactive REPL with default model
python nano_claude.py
# Switch model at startup
python nano_claude.py --model gpt-4o
python nano_claude.py -m ollama/deepseek-r1:32b
# Non-interactive / scripting
python nano_claude.py --print "Write a Python fibonacci function"
python nano_claude.py -p "Explain the Rust borrow checker in 3 sentences" -m gemini/gemini-2.0-flash
# CI / automation (no permission prompts)
python nano_claude.py --accept-all --print "Initialize a Python project with pyproject.toml"
# Debug mode (see tokens + thinking)
python nano_claude.py --thinking --verbose
```
---
## Slash Commands (REPL)
Type `/` and press **Tab** to autocomplete.
| Command | Description |
|---|---|
| `/help` | Show all commands |
| `/clear` | Clear conversation history |
| `/model` | Show current model + list all available models |
| `/model <name>` | Switch model (takes effect immediately) |
| `/config` | Show all current config values |
| `/config key=value` | Set a config value (persisted to disk) |
| `/save` | Save session (auto-named by timestamp) |
| `/save <name>` | Save session to named file |
| `/load` | List all saved sessions |
| `/load <name>` | Load a saved session |
| `/history` | Print full conversation history |
| `/context` | Show message count and token estimate |
| `/cost` | Show token usage and estimated USD cost |
| `/verbose` | Toggle verbose mode (tokens + thinking) |
| `/thinking` | Toggle Extended Thinking (Claude only) |
| `/permissions` | Show current permission mode |
| `/permissions <mode>` | Set permission mode: `auto` / `accept-all` / `manual` |
| `/cwd` | Show current working directory |
| `/cwd <path>` | Change working directory |
| `/exit` / `/quit` | Exit |
**Switching models inside a session:**
```
[myproject] ❯ /model
Current model: claude-opus-4-6 (provider: anthropic)
Available models by provider:
  anthropic   claude-opus-4-6, claude-sonnet-4-6, ...
  openai      gpt-4o, gpt-4o-mini, o3-mini, ...
  ollama      llama3.3, llama3.2, phi4, mistral, ...
  ...
[myproject] ❯ /model gpt-4o
Model set to gpt-4o (provider: openai)
[myproject] ❯ /model ollama/qwen2.5-coder
Model set to ollama/qwen2.5-coder (provider: ollama)
```
---
## Configuring API Keys
### Method 1: Environment Variables (recommended)
```bash
# Add to ~/.bashrc or ~/.zshrc
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=AIza...
export MOONSHOT_API_KEY=sk-... # Kimi
export DASHSCOPE_API_KEY=sk-... # Qwen
export ZHIPU_API_KEY=... # Zhipu GLM
export DEEPSEEK_API_KEY=sk-... # DeepSeek
```
### Method 2: Set Inside the REPL (persisted)
```
/config anthropic_api_key=sk-ant-...
/config openai_api_key=sk-...
/config gemini_api_key=AIza...
/config kimi_api_key=sk-...
/config qwen_api_key=sk-...
/config zhipu_api_key=...
/config deepseek_api_key=sk-...
```
Keys are saved to `~/.nano_claude/config.json` and loaded automatically on next launch.
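
A sketch of what that persistence amounts to: merge the new pair into the JSON file so the next launch picks it up (`set_config` is an illustrative name):
```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".nano_claude" / "config.json"

def set_config(key: str, value: str) -> None:
    """Merge one key=value pair into the config file, creating it on first use."""
    CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
    config = json.loads(CONFIG_PATH.read_text()) if CONFIG_PATH.exists() else {}
    config[key] = value
    CONFIG_PATH.write_text(json.dumps(config, indent=2))

set_config("deepseek_api_key", "sk-...")
```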
### Method 3: Edit the Config File Directly
Edit `~/.nano_claude/config.json`:
```json
{
"model": "qwen/qwen-max",
"max_tokens": 8192,
"permission_mode": "auto",
"verbose": false,
"thinking": false,
"qwen_api_key": "sk-...",
"kimi_api_key": "sk-...",
"deepseek_api_key": "sk-..."
}
```
---
## Permission System
| Mode | Behavior |
|---|---|
| `auto` (default) | Read-only operations always allowed. Prompts before Bash commands and file writes. |
| `accept-all` | Never prompts. All operations proceed automatically. |
| `manual` | Prompts before every single operation, including reads. |
**When prompted:**
```
Allow: Run: git commit -am "fix bug" [y/N/a(ccept-all)]
```
- `y` → approve this one action
- `n` or Enter → deny
- `a` → approve and switch to `accept-all` for the rest of the session

**Commands always auto-approved in `auto` mode:**
`ls`, `cat`, `head`, `tail`, `wc`, `pwd`, `echo`, `git status`, `git log`, `git diff`, `git show`, `find`, `grep`, `rg`, `python`, `node`, `pip show`, `npm list`, and other read-only shell commands.
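
A sketch of how the `auto`-mode check can work: compare the command's first one or two words against a read-only allowlist and prompt otherwise (the list here is abridged from the one above):
```python
READ_ONLY = {"ls", "cat", "head", "tail", "wc", "pwd", "echo", "find", "grep",
             "rg", "git status", "git log", "git diff", "git show"}

def needs_prompt(command: str, mode: str) -> bool:
    """Decide whether to ask the user before running a Bash command."""
    if mode == "accept-all":
        return False
    if mode == "manual":
        return True
    tokens = command.strip().split()
    if not tokens:
        return True
    # Match either the first word ("ls") or the first two ("git status").
    return tokens[0] not in READ_ONLY and " ".join(tokens[:2]) not in READ_ONLY
```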
---
## Built-in Tools
| Tool | Description | Key Parameters |
|---|---|---|
| `Read` | Read file with line numbers | `file_path`, `limit`, `offset` |
| `Write` | Create or overwrite file | `file_path`, `content` |
| `Edit` | Exact string replacement in file | `file_path`, `old_string`, `new_string`, `replace_all` |
| `Bash` | Execute shell command | `command`, `timeout` (default 30s) |
| `Glob` | Find files by glob pattern | `pattern` (e.g. `**/*.py`), `path` |
| `Grep` | Regex search in files (uses ripgrep if available) | `pattern`, `path`, `glob`, `output_mode` |
| `WebFetch` | Fetch and extract text from URL | `url`, `prompt` |
| `WebSearch` | Search the web via DuckDuckGo | `query` |
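Each tool is advertised to the model as a JSON schema; a sketch of what the `Read` declaration could look like in the Anthropic tool format (the exact wording in `tools.py` may differ):
```python
READ_TOOL = {
    "name": "Read",
    "description": "Read a file and return its contents with line numbers.",
    "input_schema": {
        "type": "object",
        "properties": {
            "file_path": {"type": "string", "description": "Path of the file to read"},
            "limit": {"type": "integer", "description": "Maximum number of lines"},
            "offset": {"type": "integer", "description": "Line number to start from"},
        },
        "required": ["file_path"],
    },
}
```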
---
## CLAUDE.md Support
Place a `CLAUDE.md` file in your project to give the model persistent context about your codebase. Nano Claude automatically finds and injects it into the system prompt.
```
~/.claude/CLAUDE.md      # Global – applies to all projects
/your/project/CLAUDE.md  # Project-level – found by walking up from cwd
```
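A sketch of that discovery logic: take the global file if it exists, then walk up from the working directory and keep the nearest project-level file (`find_claude_md` is an illustrative name):
```python
from pathlib import Path

def find_claude_md(cwd: Path) -> list[Path]:
    """Collect the global CLAUDE.md plus the nearest project-level one."""
    found = []
    global_md = Path.home() / ".claude" / "CLAUDE.md"
    if global_md.exists():
        found.append(global_md)
    for directory in (cwd, *cwd.parents):
        candidate = directory / "CLAUDE.md"
        if candidate.exists():
            found.append(candidate)  # nearest one wins
            break
    return found
```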
**Example `CLAUDE.md`:**
```markdown
# Project: FastAPI Backend
## Stack
- Python 3.12, FastAPI, PostgreSQL, SQLAlchemy 2.0, Alembic
- Tests: pytest, coverage target 90%
## Conventions
- Format with black, lint with ruff
- Full type annotations required
- New endpoints must have corresponding tests
## Important Notes
- Never hard-code credentials β use environment variables
- Do not modify existing Alembic migration files
- The `staging` branch deploys automatically to staging on push
```
---
## Session Management
```bash
# Inside REPL:
/save # auto-name: session_20260401_143022.json
/save debug_auth_bug # named save
/load # list all saved sessions
/load debug_auth_bug # resume a session
/load session_20260401_143022.json
```
Sessions are stored as JSON in `~/.nano_claude/sessions/`.
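
Conceptually a session is just the message list serialized to JSON; a minimal sketch of save and load against that directory (field and function names are assumptions):
```python
import json
import time
from pathlib import Path

SESSIONS = Path.home() / ".nano_claude" / "sessions"

def save_session(messages: list, name: str | None = None) -> Path:
    """Write the conversation to the sessions directory, auto-naming by timestamp."""
    SESSIONS.mkdir(parents=True, exist_ok=True)
    name = name or time.strftime("session_%Y%m%d_%H%M%S")
    path = SESSIONS / f"{name}.json"
    path.write_text(json.dumps({"messages": messages}, indent=2))
    return path

def load_session(name: str) -> list:
    """Read a session back; accepts names with or without the .json suffix."""
    filename = name if name.endswith(".json") else f"{name}.json"
    return json.loads((SESSIONS / filename).read_text())["messages"]
```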
---
## Project Structure
```
nano_claude_code/
├── nano_claude.py     # Entry point: REPL + slash commands + output rendering (~580 lines)
├── agent.py           # Agent loop: neutral message format + tool dispatch (~160 lines)
├── providers.py       # Multi-provider: adapters + message format conversion (~480 lines)
├── tools.py           # 8 tool implementations + JSON schemas (~360 lines)
├── context.py         # System prompt builder: CLAUDE.md + git + cwd (~100 lines)
├── config.py          # Config load/save/defaults (~70 lines)
├── demo.py            # Demo script (requires API key)
├── make_demo.py       # Generates demo.gif and screenshot.png
├── demo.gif           # Animated demo
├── screenshot.png     # Static screenshot
└── requirements.txt
```
---
## FAQ
**Q: Tool calls don't work with my local Ollama model.**

Not all models support function calling. Use one of the recommended tool-calling models: `qwen2.5-coder`, `llama3.3`, `mistral`, or `phi4`.
```bash
ollama pull qwen2.5-coder
python nano_claude.py --model ollama/qwen2.5-coder
```
**Q: How do I connect to a remote GPU server running vLLM?**
```
/config custom_base_url=http://your-server-ip:8000/v1
/config custom_api_key=your-token
/model custom/your-model-name
```
**Q: How do I check my API cost?**
```
/cost
Input tokens: 3,421
Output tokens: 892
Est. cost: $0.0648 USD
```
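The estimate is ordinary per-million-token arithmetic; a sketch with illustrative prices (the real rates live in the code and change with provider pricing):
```python
# USD per million tokens (input, output). Illustrative numbers only; the
# real table must track current provider pricing.
PRICES = {"gpt-4o": (2.50, 10.00), "claude-sonnet-4-6": (3.00, 15.00)}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

print(f"${estimate_cost('gpt-4o', 3_421, 892):.4f}")  # -> $0.0175 at these rates
```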
**Q: Can I use multiple API keys in the same session?**

Yes. Set all the keys you need upfront (via env vars or `/config`), then switch models freely; each call uses the key for the active provider.

**Q: How do I make a model available across all projects?**

Add keys to `~/.bashrc` or `~/.zshrc`, and set the default model in `~/.nano_claude/config.json`:
```json
{ "model": "claude-sonnet-4-6" }
```
**Q: Qwen / Zhipu returns garbled text.**

Ensure your `DASHSCOPE_API_KEY` / `ZHIPU_API_KEY` is correct and the account has sufficient quota. Both providers use UTF-8 and handle Chinese well.

**Q: Can I pipe input to nano claude?**
```bash
echo "Explain this file" | python nano_claude.py --print --accept-all
cat error.log | python nano_claude.py -p "What is causing this error?"
```
**Q: How do I run it as a CLI tool from anywhere?**
```bash
# Add an alias to ~/.bashrc or ~/.zshrc
alias nc='python /path/to/nano_claude_code/nano_claude.py'
# Or install as a script
pip install -e . # if setup.py exists
```