Arpad Krejczinger 572434d42e Add professional voice assistant server implementation

- FastAPI-based TTS server using Piper neural text-to-speech
- Poetry for dependency management and virtual environments
- OpenAI-compatible API endpoints for seamless integration
- Support for multiple voice models (Ryan, Alan, Lessac)
- Robust error handling and voice fallback system
- Professional logging and configuration management
- Docker-ready with proper Python packaging

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-08-17 14:56:01 +02:00

Homelab Voice Server

A local text-to-speech server built on Piper TTS, designed to work with Claude Code's voice-mode functionality.

Features

- Local TTS: Uses Piper neural TTS for natural-sounding speech
- OpenAI Compatible: Drop-in replacement for OpenAI TTS API
- Multiple Voices: Support for different voice models and languages
- FastAPI: Modern, fast web framework with automatic API documentation
- Poetry: Dependency management and virtual environments

Quick Start

Install dependencies:
```
cd voice-server
poetry install
```

Download voice models:

```
mkdir -p ~/.local/share/piper-voices
cd ~/.local/share/piper-voices

# Ryan voice (recommended - male US English)
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json
```
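
If you would rather script the download, the same two files can be fetched from Python. This is a minimal sketch, not part of the project: `model_urls` and `download_voice` are hypothetical helpers, and the URL layout for voices other than Ryan is an assumption inferred from the two `wget` URLs above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL taken from the wget commands above.
BASE_URL = "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0"


def model_urls(locale: str, name: str, quality: str = "medium") -> list[str]:
    """Return the .onnx and .onnx.json URLs for one voice.

    Assumption: every voice follows the same layout as the Ryan URLs:
    <lang>/<locale>/<name>/<quality>/<locale>-<name>-<quality>.onnx
    """
    lang = locale.split("_")[0]  # "en_US" -> "en"
    stem = f"{locale}-{name}-{quality}"
    prefix = f"{BASE_URL}/{lang}/{locale}/{name}/{quality}/{stem}"
    return [f"{prefix}.onnx", f"{prefix}.onnx.json"]


def download_voice(locale: str, name: str, quality: str = "medium",
                   dest: Path = Path.home() / ".local/share/piper-voices") -> list[Path]:
    """Download both files for a voice into dest, skipping files already present."""
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in model_urls(locale, name, quality):
        target = dest / url.rsplit("/", 1)[1]  # file name is the last URL segment
        if not target.exists():
            urlretrieve(url, str(target))
        saved.append(target)
    return saved
```

Calling `download_voice("en_US", "ryan")` should fetch the same pair of files as the commands above.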

Start the server:
```
poetry run voice-server
```

Test the API:

```
curl -X POST "http://127.0.0.1:8880/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello from the voice server!", "voice": "ryan"}' \
  --output test.wav
```
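
The same request can be sent from Python using only the standard library. This is a sketch under the assumption that the server returns the raw audio bytes in the response body, as the `--output test.wav` flag suggests; `build_speech_request` and `synthesize` are illustrative names, not part of the server.

```python
import json
from urllib.request import Request, urlopen

# Default host and port from the Configuration section.
BASE_URL = "http://127.0.0.1:8880"


def build_speech_request(text: str, voice: str = "ryan", speed: float = 1.0,
                         base_url: str = BASE_URL) -> Request:
    """Build the POST request for /v1/audio/speech without sending it."""
    payload = {"input": text, "voice": voice, "speed": speed}
    return Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str = "test.wav", **kwargs) -> int:
    """Send the request and write the returned audio bytes to out_path.

    Returns the number of bytes written. Raises urllib.error.URLError
    if the server is not reachable.
    """
    with urlopen(build_speech_request(text, **kwargs), timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

With the server running, `synthesize("Hello from the voice server!")` should produce the same `test.wav` as the curl command.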

API Endpoints

Health Check

GET /health - Server health and status

Models (OpenAI Compatible)

GET /v1/models - List available models

Voices

- GET /v1/voices - List all available voices
- GET /v1/voices/{voice_name} - Get specific voice information
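
The GET endpoints listed so far can be queried with a few lines of Python. This is a hedged sketch: it assumes the responses are JSON (the response schema is not documented here), and `endpoint_url`, `voice_url`, and `get_json` are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def endpoint_url(path: str, base_url: str = "http://127.0.0.1:8880") -> str:
    """Join the server base URL and an endpoint path."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def voice_url(voice_name: str, base_url: str = "http://127.0.0.1:8880") -> str:
    """URL for GET /v1/voices/{voice_name}, with the name percent-encoded."""
    return endpoint_url(f"/v1/voices/{quote(voice_name, safe='')}", base_url)


def get_json(url: str, timeout: float = 5.0):
    """Fetch a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `get_json(endpoint_url("/health"))` and `get_json(voice_url("ryan"))` would call the health and voice-detail endpoints on a running server.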

Speech Synthesis (OpenAI Compatible)

POST /v1/audio/speech - Generate speech from text

Request Body

```
{
  "input": "Text to speak",
  "voice": "ryan",
  "speed": 1.0
}
```

Configuration

The server is configured through environment variables, all of which share the VOICE_SERVER_ prefix:

- VOICE_SERVER_HOST: Server host (default: 127.0.0.1)
- VOICE_SERVER_PORT: Server port (default: 8880)
- VOICE_SERVER_DEFAULT_VOICE: Default voice (default: ryan)
- VOICE_SERVER_LOG_LEVEL: Logging level (default: info)
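
To illustrate how these variables and defaults fit together, here is a minimal standard-library sketch. It is not the code in `src/voice_server/config.py`, which may well use a different mechanism; the `Settings` class and `load_settings` function are hypothetical.

```python
import os
from dataclasses import dataclass

PREFIX = "VOICE_SERVER_"


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the values documented above.
    host: str = "127.0.0.1"
    port: int = 8880
    default_voice: str = "ryan"
    log_level: str = "info"


def load_settings(env=None) -> Settings:
    """Read VOICE_SERVER_* variables, falling back to the documented defaults.

    `env` is any mapping of variable names to strings; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        host=env.get(PREFIX + "HOST", defaults.host),
        port=int(env.get(PREFIX + "PORT", defaults.port)),
        default_voice=env.get(PREFIX + "DEFAULT_VOICE", defaults.default_voice),
        log_level=env.get(PREFIX + "LOG_LEVEL", defaults.log_level),
    )
```

Passing a plain dictionary, as in `load_settings({"VOICE_SERVER_PORT": "9000"})`, makes the lookup easy to try out without touching the real environment.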

Available Voices

| Voice | Gender | Language | Description |
|-------|--------|----------|-------------|
| ryan | Male | en-US | Professional, clear (recommended for AI) |
| alan | Male | en-GB | Sophisticated British accent |
| lessac | Female | en-US | Natural, conversational |
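
When a requested voice is not installed, the server needs some rule for choosing another one. The exact fallback order used by the server is not documented here, so the sketch below shows only one plausible policy (requested voice, then the default, then any installed voice); `resolve_voice` is a hypothetical function.

```python
# Voice names and languages from the table above.
VOICES = {
    "ryan": "en-US",
    "alan": "en-GB",
    "lessac": "en-US",
}


def resolve_voice(requested, installed, default="ryan"):
    """Pick a voice: the requested one, else the default, else any installed voice.

    `installed` is a collection of voice names whose model files are present.
    Raises LookupError when no known voice is installed.
    """
    # Keep table order so the last-resort choice is deterministic.
    known_installed = [name for name in VOICES if name in installed]
    if requested in known_installed:
        return requested
    if default in known_installed:
        return default
    if known_installed:
        return known_installed[0]
    raise LookupError("no known voice is installed")
```

For instance, requesting `lessac` when only `ryan` and `alan` are installed would return `ryan` under this policy.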

Development

API Documentation

Visit http://127.0.0.1:8880/docs when the server is running for interactive API documentation.

Adding New Voices

1. Download voice model files to ~/.local/share/piper-voices/
2. Add voice configuration to src/voice_server/config.py
3. Restart the server

Running Tests

```
poetry run pytest
```

Code Formatting

```
poetry run black src/
poetry run isort src/
```

Integration with Claude Code

The voice server is designed to work with Claude Code's voice-mode functionality:

```
# In Claude Code
converse("Hello! I can now speak using the local voice server.", wait_for_response=False)
```

Troubleshooting

Server Won't Start

- Check that piper-tts is installed: `which piper-tts`
- Verify voice models are downloaded
- Check port 8880 is available
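
One way to check the port from Python is to try connecting to it. This is a rough sketch: `port_is_free` is a hypothetical helper, and it only reports whether something is currently accepting connections on the address, which is a proxy for "available" rather than a guarantee that binding will succeed.

```python
import socket


def port_is_free(port: int = 8880, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection,
        # and a non-zero error code otherwise.
        return sock.connect_ex((host, port)) != 0
```

If `port_is_free()` returns `False`, another process is already listening on 8880; either stop it or set VOICE_SERVER_PORT to a different value.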

No Audio Output

- Test piper directly: `echo "test" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f test.wav`
- Check audio system settings
- Verify file permissions on voice models
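
Before digging into the audio system, it can help to confirm that the generated file is a valid, non-empty WAV file. The sketch below uses Python's standard `wave` module and assumes the output is uncompressed PCM WAV; `wav_summary` is a hypothetical helper.

```python
import wave


def wav_summary(path: str) -> dict:
    """Read a WAV header and return its basic properties.

    Raises wave.Error (or EOFError for an empty or truncated file)
    if the file is not a readable PCM WAV file.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate if rate else 0.0,
        }
```

A duration of zero or an exception from `wav_summary("test.wav")` points at synthesis rather than playback as the source of the problem.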

Voice Not Available

- Check voice files exist: `ls ~/.local/share/piper-voices/`
- Verify file naming matches configuration
- Check server logs for detailed error messages
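
The file checks above can be automated. Each voice downloaded in Quick Start consists of a model file and a matching `.onnx.json` file, so this sketch reports whichever of the two is missing or empty. `missing_voice_files` is a hypothetical helper, not part of the server.

```python
from pathlib import Path


def missing_voice_files(voice_stem: str, voices_dir: Path) -> list[str]:
    """Return the expected file names for a voice that are absent or empty.

    Each voice is expected to have a model (<stem>.onnx) and its
    config (<stem>.onnx.json) in voices_dir.
    """
    expected = [f"{voice_stem}.onnx", f"{voice_stem}.onnx.json"]
    missing = []
    for name in expected:
        path = voices_dir / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

For the Ryan voice, `missing_voice_files("en_US-ryan-medium", Path.home() / ".local/share/piper-voices")` should return an empty list once both downloads have completed.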
