
# Homelab Voice Server

A local text-to-speech server built on Piper TTS, designed to work with Claude Code's voice-mode functionality.

## Features

- **Local TTS**: Uses Piper neural TTS for natural-sounding speech
- **OpenAI Compatible**: Drop-in replacement for the OpenAI TTS API
- **Multiple Voices**: Support for different voice models and languages
- **FastAPI**: Modern, fast web framework with automatic API documentation
- **Poetry**: Dependency management and virtual environments

## Quick Start

1. **Install dependencies:**
   ```bash
   cd voice-server
   poetry install
   ```

2. **Download voice models:**
   ```bash
   mkdir -p ~/.local/share/piper-voices
   cd ~/.local/share/piper-voices

   # Ryan voice (recommended - male US English)
   wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx
   wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json
   ```

3. **Start the server:**
   ```bash
   poetry run voice-server
   ```

4. **Test the API:**
   ```bash
   curl -X POST "http://127.0.0.1:8880/v1/audio/speech" \
     -H "Content-Type: application/json" \
     -d '{"input": "Hello from the voice server!", "voice": "ryan"}' \
     --output test.wav
   ```

## API Endpoints

### Health Check
- `GET /health` - Server health and status
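
A minimal sketch of polling this endpoint from Python, for example before sending synthesis requests. The helper name `server_is_healthy` is hypothetical, and because the response body is not documented here, the sketch only checks for an HTTP 200 status. It bypasses any configured proxy, on the assumption that the server runs on the local machine.

```python
import urllib.error
import urllib.request

# Opener that ignores proxy settings, since the server is assumed to be local.
_LOCAL_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_is_healthy(base_url="http://127.0.0.1:8880", timeout=2.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with _LOCAL_OPENER.open(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as unhealthy.
        return False
```

Calling `server_is_healthy()` with no arguments targets the default host and port listed under Configuration.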

### Models (OpenAI Compatible)
- `GET /v1/models` - List available models

### Voices
- `GET /v1/voices` - List all available voices
- `GET /v1/voices/{voice_name}` - Get specific voice information

### Speech Synthesis (OpenAI Compatible)
- `POST /v1/audio/speech` - Generate speech from text

#### Request Body
```json
{
  "input": "Text to speak",
  "voice": "ryan",
  "speed": 1.0
}
```
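
The same request can be sent from Python using only the standard library. This is a sketch, not part of the project: the helper names `build_speech_request` and `synthesize_to_file` are hypothetical, and the code assumes the server is listening on the default `127.0.0.1:8880` and returns the audio bytes directly in the response body, as the `curl --output test.wav` example in Quick Start suggests.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8880"  # default host and port from this README


def build_speech_request(text, voice="ryan", speed=1.0, base_url=BASE_URL):
    """Build a POST request for /v1/audio/speech with the documented fields."""
    payload = {"input": text, "voice": voice, "speed": speed}
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)  # number of bytes written
```

With the server running, `synthesize_to_file("Hello from Python", "test.wav")` should produce the same kind of file as the `curl` command.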

## Configuration

The server is configured through environment variables, all prefixed with `VOICE_SERVER_`:

- `VOICE_SERVER_HOST`: Server host (default: 127.0.0.1)
- `VOICE_SERVER_PORT`: Server port (default: 8880)
- `VOICE_SERVER_DEFAULT_VOICE`: Default voice (default: ryan)
- `VOICE_SERVER_LOG_LEVEL`: Logging level (default: info)
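
To illustrate how these variables and defaults fit together, here is a small sketch that resolves them from the environment. It is an assumption about the general approach, not the code in `src/voice_server/config.py`; the `Settings` class and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the values documented in this README.
    host: str = "127.0.0.1"
    port: int = 8880
    default_voice: str = "ryan"
    log_level: str = "info"


def load_settings(env=None, prefix="VOICE_SERVER_"):
    """Read prefixed variables from env, falling back to the defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        host=env.get(prefix + "HOST", defaults.host),
        port=int(env.get(prefix + "PORT", defaults.port)),
        default_voice=env.get(prefix + "DEFAULT_VOICE", defaults.default_voice),
        log_level=env.get(prefix + "LOG_LEVEL", defaults.log_level),
    )
```

For example, starting the server with `VOICE_SERVER_PORT=9000` set would correspond to `load_settings().port == 9000` in this sketch, with every other field keeping its default.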

## Available Voices

| Voice  | Gender | Language | Description |
|--------|--------|----------|-------------|
| ryan   | Male   | en-US    | Professional, clear (recommended for AI) |
| alan   | Male   | en-GB    | Sophisticated British accent |
| lessac | Female | en-US    | Natural, conversational |

## Development

### API Documentation
While the server is running, visit `http://127.0.0.1:8880/docs` for interactive API documentation.

### Adding New Voices
1. Download voice model files to `~/.local/share/piper-voices/`
2. Add voice configuration to `src/voice_server/config.py`
3. Restart the server
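
Each voice in Quick Start is downloaded as a pair of files, a `.onnx` model and a matching `.onnx.json` config. Before restarting the server, a quick check like the following can catch a missing half of a pair. It is a sketch: the function name `find_incomplete_voices` is hypothetical, and it assumes all voice files sit directly in the voices directory.

```python
from pathlib import Path

VOICE_DIR = Path.home() / ".local" / "share" / "piper-voices"


def find_incomplete_voices(voice_dir=VOICE_DIR):
    """Return model file names that lack either the .onnx or the .onnx.json file."""
    voice_dir = Path(voice_dir)
    # Names such as "en_US-ryan-medium.onnx" that have a model file.
    models = {p.name for p in voice_dir.glob("*.onnx")}
    # The same names, derived from config files by dropping the ".json" suffix.
    configs = {p.name[: -len(".json")] for p in voice_dir.glob("*.onnx.json")}
    # Symmetric difference: present on one side only.
    return sorted(models ^ configs)
```

An empty list means every model has its config and vice versa.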

### Running Tests
```bash
poetry run pytest
```

### Code Formatting
```bash
poetry run black src/
poetry run isort src/
```

## Integration with Claude Code

The voice server is designed to work with Claude Code's voice-mode functionality:

```python
# In Claude Code
converse("Hello! I can now speak using the local voice server.", wait_for_response=False)
```

## Troubleshooting

### Server Won't Start
- Check that piper-tts is installed: `which piper-tts`
- Verify voice models are downloaded
- Check that port 8880 is available
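
One way to check the port without extra tools is a short Python probe. This is a sketch with a hypothetical helper name, `port_in_use`. It only reports whether something is already accepting TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(host="127.0.0.1", port=8880, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use()` returns `True` while the voice server is stopped, another process holds the port, and setting `VOICE_SERVER_PORT` to a free port is one way around it.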

### No Audio Output
- Test piper directly: `echo "test" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f test.wav`
- Check audio system settings
- Verify file permissions on voice models
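
When a `test.wav` file is produced but plays nothing, inspecting its header can help separate a synthesis problem from an audio-system problem. The sketch below uses Python's standard `wave` module. The helper name `describe_wav` is hypothetical, and it assumes the file is uncompressed PCM WAV.

```python
import wave


def describe_wav(path):
    """Return basic header information for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate if rate else 0.0,
        }
```

A duration of zero points at the synthesis step, while a plausible duration suggests the problem is on the playback side. `wave.open` raises `wave.Error` if the file is not a valid WAV at all.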

### Voice Not Available
- Check voice files exist: `ls ~/.local/share/piper-voices/`
- Verify file naming matches configuration
- Check server logs for detailed error messages