Add comprehensive voice assistant documentation

- Complete setup guide for Piper TTS installation
- Voice model download instructions with multiple options
- API usage examples and troubleshooting guide
- Available voice models comparison table
- Integration instructions for Claude Code

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-17 14:56:42 +02:00
parent 16081ec85e
commit e2b79e9662
1 changed files with 126 additions and 0 deletions
--- a/docs/voice-assistant.md
+++ b/docs/voice-assistant.md
@@ -0,0 +1,126 @@
+# Voice Assistant Setup
+
+This document describes how to set up AI voice capabilities for Claude Code using local TTS (Text-to-Speech) services.
+
+## Overview
+
+The voice assistant setup uses:
+- **Piper TTS**: Local neural text-to-speech engine for generating natural-sounding speech
+- **FastAPI**: HTTP server wrapper to make Piper compatible with voice-mode
+- **Ryan voice model**: Professional male US English voice for AI assistant personality
+
+## Prerequisites
+
+### System Dependencies
+
+On Arch Linux, install Piper from the AUR (this example uses the `yay` helper):
+```bash
+yay -S piper-tts
+```
+
+### Voice Models
+
+Download the voice models into `~/.local/share/piper-voices`. Each voice needs both its `.onnx` model file and the matching `.onnx.json` config file:
+```bash
+# Create voice models directory
+mkdir -p ~/.local/share/piper-voices
+cd ~/.local/share/piper-voices
+
+# Download Ryan voice (male US English - recommended for AI assistant)
+wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx
+wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json
+
+# Optional: Download Alan voice (male British English)
+wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_GB/alan/medium/en_GB-alan-medium.onnx
+wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_GB/alan/medium/en_GB-alan-medium.onnx.json
+```
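
The download URLs above follow a regular pattern (language, locale, voice name, quality), so they can be generated instead of copied by hand. The following is a minimal sketch of that idea; `voice_urls` and `missing_files` are hypothetical helper names that are not part of Piper or of this repository, and the `medium` default simply mirrors the downloads shown above.

```python
from pathlib import Path

# Base of the download URLs used in the wget commands above.
BASE_URL = "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0"


def voice_urls(locale: str, name: str, quality: str = "medium") -> list[str]:
    """Return the model and config URLs for one voice (hypothetical helper)."""
    language = locale.split("_")[0]  # "en_US" -> "en"
    stem = f"{locale}-{name}-{quality}"
    prefix = f"{BASE_URL}/{language}/{locale}/{name}/{quality}/{stem}"
    return [f"{prefix}.onnx", f"{prefix}.onnx.json"]


def missing_files(voices_dir: Path, locale: str, name: str,
                  quality: str = "medium") -> list[str]:
    """List the file names of a voice that are not present in voices_dir."""
    names = [url.rsplit("/", 1)[1] for url in voice_urls(locale, name, quality)]
    return [n for n in names if not (voices_dir / n).is_file()]


if __name__ == "__main__":
    voices_dir = Path.home() / ".local/share/piper-voices"
    for url in voice_urls("en_US", "ryan"):
        print(url)
    print("missing:", missing_files(voices_dir, "en_US", "ryan"))
```

Running the script prints the two Ryan URLs and reports which of the two files still need to be downloaded.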
+
+### Testing Piper TTS
+
+Test the installation:
+```bash
+echo "Hello, this is a test of piper text to speech" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f /tmp/test_voice.wav
+```
+
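+The command writes the audio to `/tmp/test_voice.wav` without playing it. Play the file (for example with `aplay /tmp/test_voice.wav`) and you should hear a clear male voice saying the test phrase.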
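
If no audio output is available (for example over SSH), the generated file can still be inspected. The sketch below uses Python's standard `wave` module to read the header of a WAV file; `wav_summary` is a hypothetical helper name, and it assumes the file is a regular PCM WAV.

```python
import wave


def wav_summary(path: str) -> dict:
    """Read channels, sample rate and duration (seconds) from a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            # Guard against a zero sample rate in a damaged header.
            "seconds": frames / rate if rate else 0.0,
        }
```

Calling `wav_summary("/tmp/test_voice.wav")` after the test command should report a duration greater than zero; a zero-length result suggests that Piper did not produce any audio.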
+
+## Voice Server Setup
+
+The voice server exposes an HTTP API that follows the request format of OpenAI's TTS API, so Claude Code can use Piper TTS without any Piper-specific client code.
+
+### Installation
+
+1. Navigate to the voice server directory:
+   ```bash
+   cd /path/to/homelab/voice-server
+   ```
+
+2. Install dependencies with Poetry:
+   ```bash
+   poetry install
+   ```
+
+3. Start the voice server:
+   ```bash
+   poetry run voice-server
+   ```
+
+The server listens on `http://127.0.0.1:8880` and provides:
+- `/v1/audio/speech` - TTS endpoint compatible with OpenAI API
+- `/v1/models` - List available models
+- `/health` - Health check endpoint
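
To illustrate the speech endpoint listed above, here is a minimal client sketch using only the Python standard library. The JSON field names (`model`, `input`, `voice`) are those of OpenAI's speech API; the values `"tts-1"` and `"ryan"` are assumptions about what this server accepts, and `build_speech_request` and `synthesize` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8880"


def build_speech_request(text: str, voice: str = "ryan",
                         model: str = "tts-1") -> urllib.request.Request:
    """Build a POST request for /v1/audio/speech with OpenAI-style fields."""
    # The accepted model and voice values are assumptions; query /v1/models
    # on the running server to see what it actually offers.
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text), timeout=30) as resp:
        with open(out_path, "wb") as f:
            f.write(resp.read())
```

With the server running, `synthesize("Hello from Piper", "/tmp/hello.wav")` should leave an audio file at the given path. Building the request separately from sending it keeps the payload easy to inspect without a live server.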
+
+## Usage
+
+### Starting Voice Mode
+
+Use the custom voice script to start the server and enable voice mode in one step:
+```bash
+./scripts/enable-voice.sh
+```
+
+### Voice Conversation
+
+Once the server is running, you can use voice commands in Claude Code:
+```python
+# Text-to-speech only (no microphone input)
+converse("Hello! I can now speak using the local piper TTS system.", wait_for_response=False)
+```
+
+### Configuration
+
+The voice server uses the Ryan voice model by default. To change voices, edit the configuration in:
+```
+voice-server/config.py
+```
+
+## Available Voice Models
+
+| Voice | Gender | Accent | Description |
+|-------|--------|--------|-------------|
+| ryan | Male | US English | Professional, clear, recommended for AI assistant |
+| alan | Male | British English | Sophisticated, formal |
+| lessac | Female | US English | Natural, conversational |
+
+## Troubleshooting
+
+### Voice Server Won't Start
+- Ensure piper-tts is installed: `which piper-tts`
+- Check voice models are downloaded: `ls ~/.local/share/piper-voices/`
+- Verify port 8880 is available (no output means the port is free): `netstat -tlnp | grep 8880`
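
The three checks above can also be combined into a single preflight script. This is a sketch under stated assumptions: `port_is_free` and `preflight` are hypothetical helper names, and the port test only detects a process that accepts TCP connections on `127.0.0.1`.

```python
import shutil
import socket
from pathlib import Path


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def preflight(voices_dir: Path, port: int = 8880) -> dict:
    """Collect the three troubleshooting checks into one report."""
    return {
        "piper_installed": shutil.which("piper-tts") is not None,
        "models": sorted(p.name for p in voices_dir.glob("*.onnx")),
        "port_free": port_is_free(port),
    }


if __name__ == "__main__":
    print(preflight(Path.home() / ".local/share/piper-voices"))
```

A report with `piper_installed` set to `False`, an empty `models` list, or `port_free` set to `False` points to the corresponding item in the list above.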
+
+### Poor Audio Quality
+- Try a different voice model
+- Check audio system: `pactl info`
+- Test piper directly: `echo "test" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f /tmp/test.wav`
+
+### Audio Not Playing
+- Check PulseAudio is running: `systemctl --user status pulseaudio`
+- Test system audio: `speaker-test -t wav -c 2`
+
+## Future Enhancements
+
+- **Speech-to-Text**: Add Whisper.cpp for full voice conversations
+- **Voice Selection**: Runtime voice switching via API
+- **Voice Cloning**: Custom voice models
+- **Multi-language**: Support for other languages