# Voice Assistant Setup

⚠️ **STATUS: DISABLED** - The `onnxruntime` package was removed to free 1.2 GB of disk space. TTS requires it, so voice functionality is currently unavailable.

This document describes how to set up AI voice capabilities for Claude Code using local TTS (text-to-speech) services.

## Overview

The voice assistant setup uses:
- **Piper TTS**: Local neural text-to-speech engine for generating natural-sounding speech
- **FastAPI**: HTTP server wrapper that makes Piper compatible with voice-mode
- **Ryan voice model**: Professional male US English voice for the AI assistant personality
- **onnxruntime**: ML inference library required for TTS (currently removed, which is why voice is disabled)

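
Because voice is disabled while `onnxruntime` is missing, a quick availability check can be useful before trying any of the steps in this document. The following is a minimal sketch, not part of the voice server. This document does not say whether the removed `onnxruntime` was the Python module or the system shared library, so the sketch looks for both. The function names `module_installed` and `onnxruntime_status` are hypothetical.

```python
import ctypes.util
import importlib.util


def module_installed(name: str) -> bool:
    """Return True if the Python module `name` can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def onnxruntime_status() -> dict[str, bool]:
    """Report whether onnxruntime is visible as a Python module or a shared library.

    Assumption: either form may be the one Piper needs on this machine.
    """
    return {
        "python_module": module_installed("onnxruntime"),
        "shared_library": ctypes.util.find_library("onnxruntime") is not None,
    }
```

Calling `onnxruntime_status()` returns a dictionary of two booleans. If both are `False`, the TTS steps in this document are unlikely to work until the package is reinstalled.
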
## Prerequisites

### System Dependencies

Install required packages from AUR:
```bash
yay -S piper-tts
```

### Voice Models

Download voice models to the piper voices directory:
```bash
# Create voice models directory
mkdir -p ~/.local/share/piper-voices
cd ~/.local/share/piper-voices

# Download Ryan voice (male US English - recommended for AI assistant)
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json

# Optional: Download Alan voice (male British English)
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_GB/alan/medium/en_GB-alan-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_GB/alan/medium/en_GB-alan-medium.onnx.json
```

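
Each voice is downloaded as a pair: the `.onnx` model and its `.onnx.json` config. An interrupted download can leave one without the other. The following is a small sketch of a check for incomplete pairs. The helper name `missing_configs` is hypothetical, and the sketch only checks that the config files exist, not that they are valid.

```python
from pathlib import Path


def missing_configs(voices_dir: Path) -> list[str]:
    """Return names of .onnx models in voices_dir that lack a .onnx.json config."""
    missing = []
    for model in sorted(voices_dir.glob("*.onnx")):
        # en_US-ryan-medium.onnx -> en_US-ryan-medium.onnx.json
        config = model.with_name(model.name + ".json")
        if not config.is_file():
            missing.append(model.name)
    return missing
```

For the directory used above, `missing_configs(Path.home() / ".local/share/piper-voices")` should return an empty list once both files for every voice are present.
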
### Testing Piper TTS

Test the installation by synthesizing a WAV file and playing it back (`-f` only writes the file):
```bash
echo "Hello, this is a test of piper text to speech" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f /tmp/test_voice.wav
aplay /tmp/test_voice.wav  # play the result; any WAV player works
```

You should hear a clear male voice saying the test phrase.

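
The same test can be scripted, which helps on a machine with no audio output. The sketch below runs the `piper-tts` command shown above through `subprocess` and then reads the resulting WAV with the standard `wave` module to confirm it contains audio. The function names are hypothetical, and the sketch assumes `piper-tts` is on `PATH` and reads its text from standard input, as in the shell example.

```python
import subprocess
import wave
from pathlib import Path


def piper_command(model: Path, output: Path) -> list[str]:
    """Build the same piper-tts invocation used in the shell test."""
    return ["piper-tts", "-m", str(model), "-f", str(output)]


def wav_duration_seconds(path: Path) -> float:
    """Return the playback length of a WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def synthesize(text: str, model: Path, output: Path) -> float:
    """Pipe text into piper-tts and return the duration of the generated WAV."""
    subprocess.run(piper_command(model, output), input=text.encode(), check=True)
    return wav_duration_seconds(output)
```

A returned duration greater than zero suggests synthesis worked. A `CalledProcessError` means `piper-tts` itself failed, which is the expected result while `onnxruntime` is missing.
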
## Voice Server Setup

The voice server provides an HTTP API compatible with OpenAI's TTS format, allowing Claude Code to use Piper TTS seamlessly.

### Installation

1. Navigate to the voice server directory:
```bash
cd /path/to/homelab/voice-server
```

2. Install dependencies with Poetry:
```bash
poetry install
```

3. Start the voice server:
```bash
poetry run voice-server
```

The server listens on `http://127.0.0.1:8880` and provides:
- `/v1/audio/speech` - TTS endpoint compatible with the OpenAI API
- `/v1/models` - Lists available models
- `/health` - Health check endpoint

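
To exercise the speech endpoint without Claude Code, a request can be built with the Python standard library. This is a sketch under stated assumptions: the field names `model`, `input`, `voice`, and `response_format` follow OpenAI's speech API, and this document does not confirm which of them the voice server reads. The values `"tts-1"` and `"wav"` are placeholders, and both function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8880"  # address given in this document


def build_speech_request(text: str, voice: str = "ryan",
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST to /v1/audio/speech using OpenAI-style field names.

    Assumption: the server accepts these fields; "tts-1" and "wav" are
    placeholder values, not settings confirmed by the voice server.
    """
    payload = {"model": "tts-1", "input": text, "voice": voice,
               "response_format": "wav"}
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def speak_to_file(text: str, path: str) -> None:
    """Send the request and save the returned audio bytes to `path`."""
    with urllib.request.urlopen(build_speech_request(text), timeout=30) as resp:
        audio = resp.read()
    with open(path, "wb") as out:
        out.write(audio)
```

With the server running, `speak_to_file("Hello", "/tmp/api_test.wav")` should write an audio file if the assumptions hold. Building the request separately from sending it makes the payload easy to inspect when the server rejects it.
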
## Usage

### Starting Voice Mode

Use the custom voice command to start the server and enable voice mode in one step:
```bash
./scripts/enable-voice.sh
```

### Voice Conversation

Once the server is running, you can use voice commands in Claude Code:
```python
# Text-to-speech only (no microphone input)
converse("Hello! I can now speak using the local piper TTS system.", wait_for_response=False)
```

### Configuration

The voice server uses the Ryan voice model by default. To change voices, edit the configuration in:
```
voice-server/config.py
```

## Available Voice Models

| Voice | Gender | Accent | Description |
|-------|--------|--------|-------------|
| ryan | Male | US English | Professional, clear, recommended for AI assistant |
| alan | Male | British English | Sophisticated, formal |
| lessac | Female | US English | Natural, conversational |

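
One way to connect the voice names in this table to files on disk is a small lookup. The sketch below is illustrative only and does not show the actual contents of `voice-server/config.py`. The `ryan` and `alan` filenames come from the download commands in this document. The `lessac` filename is an assumption based on the same naming pattern. `VOICE_FILES` and `resolve_voice` are hypothetical names.

```python
from pathlib import Path

VOICES_DIR = Path.home() / ".local/share/piper-voices"

# Hypothetical mapping from table name to model file. The ryan and alan
# filenames come from the download commands; the lessac one is assumed.
VOICE_FILES = {
    "ryan": "en_US-ryan-medium.onnx",
    "alan": "en_GB-alan-medium.onnx",
    "lessac": "en_US-lessac-medium.onnx",
}


def resolve_voice(name: str, voices_dir: Path = VOICES_DIR) -> Path:
    """Return the model path for a voice name, or raise ValueError if unknown."""
    try:
        return voices_dir / VOICE_FILES[name.lower()]
    except KeyError:
        known = ", ".join(sorted(VOICE_FILES))
        raise ValueError(f"unknown voice {name!r}; expected one of: {known}") from None
```

Failing early on an unknown name gives a clearer error than passing a nonexistent model path to `piper-tts`.
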
## Troubleshooting

### Voice Server Won't Start
- Ensure piper-tts is installed: `which piper-tts`
- Check voice models are downloaded: `ls ~/.local/share/piper-voices/`
- Verify port 8880 is available (no output means the port is free): `netstat -tlnp | grep 8880`

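
The three checks above can be combined into one preflight script. This is a rough sketch with hypothetical function names. It tests the port by trying to bind to it, which is a different technique from the `netstat` command and may report a port as busy shortly after a previous server has exited.

```python
import shutil
import socket
from pathlib import Path


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind to host:port, meaning nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def preflight(voices_dir: Path, port: int = 8880) -> dict[str, bool]:
    """Run the three startup checks and report each result."""
    return {
        "piper-tts on PATH": shutil.which("piper-tts") is not None,
        "voice models present": voices_dir.is_dir() and any(voices_dir.glob("*.onnx")),
        "port free": port_is_free(port),
    }
```

Any `False` value in the result of `preflight(Path.home() / ".local/share/piper-voices")` points to the matching item in the list above.
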
### Poor Audio Quality
- Try a different voice model
- Check audio system: `pactl info`
- Test piper directly: `echo "test" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f /tmp/test.wav`

### Audio Not Playing
- Check PulseAudio is running: `systemctl --user status pulseaudio`
- Test system audio: `speaker-test -t wav -c 2`

## Future Enhancements

- **Speech-to-Text**: Add Whisper.cpp for full voice conversations
- **Voice Selection**: Runtime voice switching via API
- **Voice Cloning**: Custom voice models
- **Multi-language**: Support for other languages