Add professional voice assistant server implementation

- FastAPI-based TTS server using Piper neural text-to-speech
- Poetry for dependency management and virtual environments
- OpenAI-compatible API endpoints for seamless integration
- Support for multiple voice models (Ryan, Alan, Lessac)
- Robust error handling and voice fallback system
- Professional logging and configuration management
- Docker-ready with proper Python packaging

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-17 14:56:01 +02:00
parent 82f9cc4990
commit 572434d42e
13 changed files with 1722 additions and 0 deletions
--- a/voice-server/README.md
+++ b/voice-server/README.md
@@ -0,0 +1,130 @@
+# Homelab Voice Server
+
+A local text-to-speech server built on Piper TTS, designed to work with Claude Code's voice-mode functionality.
+
+## Features
+
+- **Local TTS**: Uses Piper neural TTS for natural-sounding speech
+- **OpenAI Compatible**: Exposes OpenAI-style endpoints (`/v1/models`, `/v1/audio/speech`) as a drop-in replacement for the OpenAI TTS API
+- **Multiple Voices**: Support for different voice models and languages
+- **FastAPI**: Modern, fast web framework with automatic API documentation
+- **Poetry**: Dependency management and virtual environments
+
+## Quick Start
+
+1. **Install dependencies:**
+   ```bash
+   cd voice-server
+   poetry install
+   ```
+
+2. **Download voice models:**
+   ```bash
+   mkdir -p ~/.local/share/piper-voices
+   cd ~/.local/share/piper-voices
+   
+   # Ryan voice (recommended - male US English)
+   wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx
+   wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json
+   ```
+
+3. **Start the server:**
+   ```bash
+   poetry run voice-server
+   ```
+
+4. **Test the API:**
+   ```bash
+   curl -X POST "http://127.0.0.1:8880/v1/audio/speech" \
+     -H "Content-Type: application/json" \
+     -d '{"input": "Hello from the voice server!", "voice": "ryan"}' \
+     --output test.wav
+   ```
+
+## API Endpoints
+
+### Health Check
+- `GET /health` - Server health and status
+
+### Models (OpenAI Compatible)
+- `GET /v1/models` - List available models
+
+### Voices
+- `GET /v1/voices` - List all available voices
+- `GET /v1/voices/{voice_name}` - Get specific voice information
+
+### Speech Synthesis (OpenAI Compatible)
+- `POST /v1/audio/speech` - Generate speech from text
+
+#### Request Body
+```json
+{
+  "input": "Text to speak",
+  "voice": "ryan",
+  "speed": 1.0
+}
+```
+
+## Configuration
+
+All settings are read from environment variables prefixed with `VOICE_SERVER_`:
+
+- `VOICE_SERVER_HOST`: Server host (default: 127.0.0.1)
+- `VOICE_SERVER_PORT`: Server port (default: 8880)
+- `VOICE_SERVER_DEFAULT_VOICE`: Default voice (default: ryan)
+- `VOICE_SERVER_LOG_LEVEL`: Logging level (default: info)
+
+## Available Voices
+
+| Voice  | Gender | Language | Description |
+|--------|--------|----------|-------------|
+| ryan   | Male   | en-US    | Professional, clear (recommended for AI) |
+| alan   | Male   | en-GB    | Sophisticated British accent |
+| lessac | Female | en-US    | Natural, conversational |
+
+## Development
+
+### API Documentation
+Visit `http://127.0.0.1:8880/docs` when the server is running for interactive API documentation.
+
+### Adding New Voices
+1. Download the voice model files (the `.onnx` model and its `.onnx.json` config) to `~/.local/share/piper-voices/`
+2. Add voice configuration to `src/voice_server/config.py`
+3. Restart the server
+
+### Running Tests
+```bash
+poetry run pytest
+```
+
+### Code Formatting
+```bash
+poetry run black src/
+poetry run isort src/
+```
+
+## Integration with Claude Code
+
+The voice server is designed to work with Claude Code's voice-mode functionality:
+
+```python
+# In Claude Code
+converse("Hello! I can now speak using the local voice server.", wait_for_response=False)
+```
+
+## Troubleshooting
+
+### Server Won't Start
+- Check that piper-tts is installed: `which piper-tts`
+- Verify voice models are downloaded
+- Check that port 8880 is available
+
+### No Audio Output
+- Test piper directly: `echo "test" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f test.wav`
+- Check audio system settings
+- Verify file permissions on voice models
+
+### Voice Not Available
+- Check voice files exist: `ls ~/.local/share/piper-voices/`
+- Verify file naming matches configuration
+- Check server logs for detailed error messages
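
The README above documents four `VOICE_SERVER_*` environment variables and a `POST /v1/audio/speech` endpoint. As a rough illustration of how a client might combine the two, here is a minimal Python sketch using only the standard library. The `Settings` class and the helper names (`load_settings`, `build_speech_request`, `synthesize_to_file`) are hypothetical and are not taken from the `voice_server` package. Only the variable names, their defaults, the endpoint path, and the request fields come from the README.

```python
import json
import os
import urllib.request
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical client-side settings; defaults copied from the README."""

    host: str = "127.0.0.1"
    port: int = 8880
    default_voice: str = "ryan"
    log_level: str = "info"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read VOICE_SERVER_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        host=env.get("VOICE_SERVER_HOST", defaults.host),
        port=int(env.get("VOICE_SERVER_PORT", defaults.port)),
        default_voice=env.get("VOICE_SERVER_DEFAULT_VOICE", defaults.default_voice),
        log_level=env.get("VOICE_SERVER_LOG_LEVEL", defaults.log_level),
    )


def build_speech_request(
    settings: Settings,
    text: str,
    voice: Optional[str] = None,
    speed: float = 1.0,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/audio/speech."""
    body = {
        "input": text,
        "voice": voice or settings.default_voice,  # use the configured default
        "speed": speed,
    }
    return urllib.request.Request(
        f"http://{settings.host}:{settings.port}/v1/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, path: str = "test.wav") -> None:
    """Send the request to a running server and save the returned audio."""
    request = build_speech_request(load_settings(), text)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

With the server running, `synthesize_to_file("Hello from the voice server!")` would mirror the `curl` command from the Quick Start. Passing a plain dictionary to `load_settings` instead of relying on `os.environ` makes the defaults easy to check without touching the real environment.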