Arpad Krejczinger 9aa881d895 Document voice assistant TTS service status

Mark TTS functionality as disabled due to onnxruntime removal
(freed 1.2GB disk space during cleanup)

2025-10-11 18:25:08 +02:00

Voice Assistant Setup

⚠️ STATUS: DISABLED - the onnxruntime package was removed to free 1.2 GB of disk space, so voice functionality is currently unavailable.

This document describes how to set up AI voice capabilities for Claude Code using local TTS (Text-to-Speech) services.

Overview

The voice assistant setup uses:

Piper TTS: Local neural text-to-speech engine for generating natural-sounding speech
FastAPI: HTTP server wrapper to make Piper compatible with voice-mode
Ryan voice model: Professional male US English voice for AI assistant personality
onnxruntime: ML inference library required by Piper to run the voice models (currently removed, which is why TTS is disabled)

Prerequisites

System Dependencies

Install required packages from AUR:

yay -S piper-tts

Voice Models

Download voice models to the piper voices directory:

# Create voice models directory
mkdir -p ~/.local/share/piper-voices
cd ~/.local/share/piper-voices

# Download Ryan voice (male US English - recommended for AI assistant)
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/medium/en_US-ryan-medium.onnx.json

# Optional: Download Alan voice (male British English)
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_GB/alan/medium/en_GB-alan-medium.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_GB/alan/medium/en_GB-alan-medium.onnx.json
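
The same downloads can be scripted. The following is a minimal Python sketch of the wget commands above, not part of this repository: the helper names `voice_urls` and `download_voice` are hypothetical, and the URL layout is inferred only from the two voices listed here, so it may not hold for every voice in the upstream collection.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and target directory taken from the commands above.
BASE_URL = "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0"
VOICES_DIR = Path.home() / ".local/share/piper-voices"


def voice_urls(lang: str, region: str, name: str, quality: str = "medium") -> list[str]:
    """Return the model and config URLs for one voice, e.g. ("en", "US", "ryan")."""
    locale = f"{lang}_{region}"
    stem = f"{locale}-{name}-{quality}"
    prefix = f"{BASE_URL}/{lang}/{locale}/{name}/{quality}/{stem}"
    return [f"{prefix}.onnx", f"{prefix}.onnx.json"]


def download_voice(lang: str, region: str, name: str, dest: Path = VOICES_DIR) -> list[Path]:
    """Download both files for a voice into dest, skipping files that already exist."""
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in voice_urls(lang, region, name):
        target = dest / url.rsplit("/", 1)[-1]
        if not target.exists():
            urlretrieve(url, target)
        saved.append(target)
    return saved
```

Calling `download_voice("en", "US", "ryan")` would fetch the recommended voice; `download_voice("en", "GB", "alan")` fetches the optional one.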

Testing Piper TTS

Test the installation:

echo "Hello, this is a test of piper text to speech" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f /tmp/test_voice.wav

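This writes /tmp/test_voice.wav but does not play it. Play the file (for example with aplay /tmp/test_voice.wav) and you should hear a clear male voice saying the test phrase.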
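
For use from a script, the shell test above can be wrapped in Python. This is a sketch under the assumption that `piper-tts` is on the PATH and accepts text on stdin with the `-m` and `-f` flags exactly as shown in the command above; the function names are illustrative.

```python
import subprocess
from pathlib import Path

DEFAULT_MODEL = Path.home() / ".local/share/piper-voices/en_US-ryan-medium.onnx"


def piper_command(model: Path, output: Path) -> list[str]:
    """Build the piper-tts invocation used in the shell test."""
    return ["piper-tts", "-m", str(model), "-f", str(output)]


def synthesize(text: str, output: Path, model: Path = DEFAULT_MODEL) -> Path:
    """Pipe text to piper-tts on stdin and return the path of the WAV it writes."""
    subprocess.run(
        piper_command(model, output),
        input=text.encode("utf-8"),
        check=True,  # raise CalledProcessError if piper-tts exits non-zero
    )
    return output
```

With `check=True`, a missing model or a missing onnxruntime surfaces as an exception instead of a silently empty file.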

Voice Server Setup

The voice server provides an HTTP API compatible with OpenAI's TTS format, so voice-mode in Claude Code can use Piper as a drop-in TTS backend.

Installation

Navigate to the voice server directory:
```
cd /path/to/homelab/voice-server
```
Install dependencies with Poetry:
```
poetry install
```
Start the voice server:
```
poetry run voice-server
```

The server will start on http://127.0.0.1:8880 and provide:

/v1/audio/speech - TTS endpoint compatible with OpenAI API
/v1/models - List available models
/health - Health check endpoint
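
A client for these endpoints might look like the sketch below. It uses only the standard library and assumes the request body follows OpenAI's speech API field names (`model`, `input`, `voice`). The values `"tts-1"` and `"ryan"` are assumptions; what the server actually accepts depends on its implementation, which is not shown in this document.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://127.0.0.1:8880"


def speech_request(text: str, voice: str = "ryan", model: str = "tts-1") -> Request:
    """Build a POST to /v1/audio/speech using OpenAI-style field names."""
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return Request(
        f"{BASE_URL}/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def speak_to_file(text: str, path: str) -> None:
    """Send the request and save the returned audio bytes to path."""
    with urlopen(speech_request(text), timeout=30) as response:
        audio = response.read()
    with open(path, "wb") as fh:
        fh.write(audio)


def is_healthy() -> bool:
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urlopen(f"{BASE_URL}/health", timeout=5) as response:
            return response.status == 200
    except OSError:  # covers connection refused and HTTP errors
        return False
```

Checking `is_healthy()` before calling `speak_to_file(...)` gives a clearer failure when the server is not running.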

Usage

Starting Voice Mode

Use the custom voice script to start the server and enable voice mode in one step:

./scripts/enable-voice.sh

Voice Conversation

Once the server is running, you can use voice commands in Claude Code:

# Text-to-speech only (no microphone input)
converse("Hello! I can now speak using the local piper TTS system.", wait_for_response=False)

Configuration

The voice server uses the Ryan voice model by default. To change voices, edit the configuration in:

voice-server/config.py

Available Voice Models

Voice	Gender	Accent	Description
ryan	Male	US English	Professional, clear, recommended for AI assistant
alan	Male	British English	Sophisticated, formal
lessac	Female	US English	Natural, conversational
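
To see which of these voices are actually installed, the voices directory can be scanned. The sketch below assumes model files follow the `<locale>-<name>-<quality>.onnx` naming seen in the download commands and that each model needs a matching `.onnx.json` file next to it; `installed_voices` is a hypothetical helper, not part of the voice server.

```python
from pathlib import Path

VOICES_DIR = Path.home() / ".local/share/piper-voices"


def installed_voices(voices_dir: Path = VOICES_DIR) -> dict[str, Path]:
    """Map short voice names (e.g. "ryan") to usable .onnx model paths."""
    voices = {}
    for model in sorted(voices_dir.glob("*.onnx")):
        config = model.with_name(model.name + ".json")
        if not config.exists():
            continue  # skip models missing their .onnx.json config
        parts = model.stem.split("-")  # e.g. ["en_US", "ryan", "medium"]
        if len(parts) == 3:
            voices[parts[1]] = model
    return voices
```

A voice that appears in the table but not in the returned mapping still needs to be downloaded.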

Troubleshooting

Voice Server Won't Start

Ensure piper-tts is installed: which piper-tts
Check voice models are downloaded: ls ~/.local/share/piper-voices/
Verify port 8880 is available: netstat -tlnp | grep 8880 (or ss -tlnp | grep 8880 if netstat is not installed)
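
The three checks above can be combined into a single preflight script. This is a rough sketch with hypothetical function names; it treats the port as busy if anything accepts a TCP connection on it, which approximates what the netstat check reports.

```python
import shutil
import socket
from pathlib import Path


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        return sock.connect_ex((host, port)) == 0


def preflight(voices_dir: Path, port: int = 8880) -> dict[str, bool]:
    """Run the three checks and report each result."""
    return {
        "piper_installed": shutil.which("piper-tts") is not None,
        "models_present": any(voices_dir.glob("*.onnx")),
        "port_free": not port_in_use(port),
    }
```

Running `preflight(Path.home() / ".local/share/piper-voices")` returns a dictionary in which any `False` value names the check to investigate first.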

Poor Audio Quality

Try a different voice model
Check audio system: pactl info
Test piper directly: echo "test" | piper-tts -m ~/.local/share/piper-voices/en_US-ryan-medium.onnx -f /tmp/test.wav
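
When the direct piper test produces a file that still sounds wrong, inspecting the WAV header helps separate synthesis problems from playback problems. The sketch below uses Python's standard `wave` module; `wav_summary` is an illustrative helper name.

```python
import wave


def wav_summary(path: str) -> dict[str, float]:
    """Read basic properties of a WAV file such as the one piper-tts writes."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate if rate else 0.0,
        }
```

For example, `wav_summary("/tmp/test.wav")` reporting a duration of zero points at synthesis rather than at the audio system.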

Audio Not Playing

Check PulseAudio is running: systemctl --user status pulseaudio (on systems using PipeWire, check pipewire-pulse instead)
Test system audio: speaker-test -t wav -c 2

Future Enhancements

Speech-to-Text: Add Whisper.cpp for full voice conversations
Voice Selection: Runtime voice switching via API
Voice Cloning: Custom voice models
Multi-language: Support for other languages
