Models

Every model in VoiceGateway is identified by a string in `provider/model` format.

Model ID format

A model ID has two parts separated by a slash: the provider identifier, then the model name as that provider knows it.

Plain text
deepgram/nova-3
openai/gpt-4.1-mini
cartesia/sonic-3

Language and voice suffixes

STT model IDs can include a language suffix separated by a colon:

Plain text
deepgram/nova-3:en
deepgram/nova-3:es

TTS model IDs can include a voice suffix:

Plain text
cartesia/sonic-3:narrator-male
openai/tts-1:nova

LLM model IDs keep every colon-separated segment verbatim as part of the model name, so Ollama tags survive:

Plain text
ollama/qwen2.5:3b
ollama/llama3.2:3b

This asymmetry mirrors `livekit.agents.inference`: STT and TTS strip the last colon segment and treat it as the language or voice, whereas LLM leaves the ID untouched.
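The suffix rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not VoiceGateway's actual parser: `ParsedModelID` and `parse_model_id` are hypothetical names, and the handling of malformed input is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedModelID:
    provider: str
    model: str
    # Language for STT, voice for TTS, always None for LLM.
    suffix: Optional[str] = None


def parse_model_id(model_id: str, modality: str) -> ParsedModelID:
    """Hypothetical helper: split a model ID following the rules described above."""
    provider, slash, rest = model_id.partition("/")
    if not slash or not provider or not rest:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")

    if modality == "llm":
        # LLM IDs keep every colon segment, so 'qwen2.5:3b' stays intact.
        return ParsedModelID(provider, rest)

    if modality in ("stt", "tts"):
        # STT and TTS strip only the last colon segment.
        model, colon, suffix = rest.rpartition(":")
        if colon and model and suffix:
            return ParsedModelID(provider, model, suffix)
        return ParsedModelID(provider, rest)

    raise ValueError(f"unknown modality: {modality!r}")


if __name__ == "__main__":
    print(parse_model_id("deepgram/nova-3:en", "stt"))
    print(parse_model_id("cartesia/sonic-3:narrator-male", "tts"))
    print(parse_model_id("ollama/qwen2.5:3b", "llm"))
```

Passing the modality explicitly is what makes the asymmetry workable: the same string `ollama/qwen2.5:3b` would lose its `:3b` tag if it were parsed with the STT or TTS rule.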

Using model IDs in code

agent.py
from voicegateway import inference

# Pass model ID strings directly to inference factories.
stt = inference.STT("deepgram/nova-3:en")          # :en parsed as language
llm = inference.LLM("openai/gpt-4.1-mini")
tts = inference.TTS("cartesia/sonic-3:narrator-male")  # :voice-id parsed as voice
llm_local = inference.LLM("ollama/qwen2.5:3b")     # :3b kept as part of model name

Registering custom models

You can register model aliases in voicegw.yaml under the models section, organised by modality (stt, llm, tts). Aliases surface in the dashboard and CLI for display purposes only. The voicegateway.inference module parses provider/model strings directly from the factory call, so an alias does not change runtime behaviour.

Via YAML

voicegw.yaml
models:
  stt:
    fast-stt:
      provider: deepgram
      model: nova-3
    accurate-stt:
      provider: assemblyai
      model: best
  llm:
    reasoning:
      provider: anthropic
      model: claude-sonnet-4-5
    fast-chat:
      provider: groq
      model: llama-3.1-8b-instant
  tts:
    narrator:
      provider: cartesia
      model: sonic-3
      default_voice: narrator-male
    cheap-tts:
      provider: piper
      model: en_US-lessac-medium

Each model entry supports:

  • provider (string, required) -- the provider identifier
  • model (string) -- the model name at the provider
  • default_voice (string, optional) -- default voice for TTS models
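Because aliases are display-only, code that wants to reuse an alias entry has to build the `provider/model` string itself. The sketch below shows one way that could look. `ModelEntry`, `entry_from_mapping`, and `to_model_id` are hypothetical helpers, not part of VoiceGateway. The dictionary stands in for the already-parsed `tts` section of voicegw.yaml, and treating `model` as mandatory is an assumption made here so that a full ID can be formed.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ModelEntry:
    provider: str
    model: str
    default_voice: Optional[str] = None


def entry_from_mapping(alias: str, raw: Mapping[str, str]) -> ModelEntry:
    """Hypothetical validation of one alias entry from the models section."""
    provider = raw.get("provider")
    if not provider:
        raise ValueError(f"alias {alias!r}: 'provider' is required")
    model = raw.get("model")
    if not model:
        # Assumption: this sketch needs a model name to build a full ID.
        raise ValueError(f"alias {alias!r}: 'model' is needed to build a model ID")
    return ModelEntry(provider, model, raw.get("default_voice"))


def to_model_id(entry: ModelEntry, modality: str) -> str:
    """Build a provider/model string, appending the default voice for TTS only."""
    model_id = f"{entry.provider}/{entry.model}"
    if modality == "tts" and entry.default_voice:
        model_id += f":{entry.default_voice}"
    return model_id


# Stand-in for the parsed 'tts' section of the YAML example above.
tts_aliases = {
    "narrator": {"provider": "cartesia", "model": "sonic-3", "default_voice": "narrator-male"},
    "cheap-tts": {"provider": "piper", "model": "en_US-lessac-medium"},
}

if __name__ == "__main__":
    for alias, raw in tts_aliases.items():
        print(alias, "->", to_model_id(entry_from_mapping(alias, raw), "tts"))
```

The resulting string can then be passed to the matching inference factory, for example `inference.TTS(...)`, in the same way as a hand-written model ID.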

Via the dashboard

Models can also be registered through the web dashboard at the daemon URL (default http://localhost:8080). Models added through the dashboard are persisted in the SQLite database and merged with the YAML config at startup.

Via MCP

If you have the MCP server running (voicegw mcp), you can register models through MCP tool calls from your IDE. See the MCP documentation for details.

Model examples

STT models

Model ID                  Provider          Notes
deepgram/nova-3           Deepgram          Best cloud STT accuracy
deepgram/nova-2           Deepgram          Lower cost alternative
openai/whisper-1          OpenAI            OpenAI-hosted Whisper
groq/whisper-large-v3     Groq              Fast Whisper via Groq
assemblyai/universal-2    AssemblyAI        High accuracy, single tier
local/whisper-large-v3    Whisper (local)   Best local STT
local/whisper-base        Whisper (local)   Fastest local STT

LLM models

Model ID                            Provider        Notes
openai/gpt-4.1-mini                 OpenAI          Good cost/quality balance
openai/gpt-4.1                      OpenAI          Best quality
anthropic/claude-sonnet-4-20250514  Anthropic       Strong reasoning
anthropic/claude-haiku-4-5          Anthropic       Fast and cheap
groq/llama-3.3-70b-versatile        Groq            Fast open-source LLM
groq/llama-3.1-8b-instant           Groq            Ultra-fast, smaller model
ollama/llama3.2:3b                  Ollama (local)  Local LLM via Ollama
ollama/mistral:7b                   Ollama (local)  Local Mistral

TTS models

Model ID                           Provider        Notes
cartesia/sonic-3                   Cartesia        Low-latency streaming
openai/tts-1                       OpenAI          Fast cloud TTS
openai/tts-1-hd                    OpenAI          High quality cloud TTS
elevenlabs/eleven_multilingual_v2  ElevenLabs      29 languages
elevenlabs/eleven_turbo_v2         ElevenLabs      Faster, English-focused
deepgram/aura-asteria-en           Deepgram        Deepgram TTS
local/kokoro                       Kokoro (local)  Lightweight local TTS
local/piper:en_US-lessac-medium    Piper (local)   Fast offline TTS (voice ID after :)

See: Providers, Stacks, voicegw.yaml Reference
