Audio Input

Streams microphone data from a PyAudio device (Documentation)


(outside ASR listening)

(output to different siteId)
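
To see which devices PyAudio can stream from, you can list them yourself. A minimal sketch, assuming the pyaudio package is installed:

    import pyaudio

    # Print the index and name of every device that can record audio.
    pa = pyaudio.PyAudio()
    for i in range(pa.get_device_count()):
        info = pa.get_device_info_by_index(i)
        if info.get("maxInputChannels", 0) > 0:
            print(i, info["name"])
    pa.terminate()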

Starts an arecord process locally and reads audio data from its standard output (Documentation)

(outside ASR listening)

Calls an external program to record raw audio (Documentation)

Sample Rate (Hertz)
Sample Width (bytes)



(outside ASR listening)
(output to different siteId)
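
A record program only needs to write raw PCM matching the sample rate and width fields above to its standard output. A sketch using the third-party sounddevice library (an assumption; any audio source works), producing Rhasspy's usual 16-bit, 16 kHz mono format:

    #!/usr/bin/env python3
    import sys

    import sounddevice as sd

    # Stream raw 16-bit, 16 kHz mono PCM to stdout until killed.
    with sd.RawInputStream(samplerate=16000, channels=1, dtype="int16") as stream:
        while True:
            data, _overflowed = stream.read(1024)  # 1024 frames per chunk
            sys.stdout.buffer.write(bytes(data))
            sys.stdout.buffer.flush()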

Expects audio frames from a custom external service that supports the Hermes Protocol (Documentation)


Note: statistics will not work if UDP audio is enabled
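
For reference, each Hermes audio frame is a small WAV-formatted chunk published to hermes/audioServer/<siteId>/audioFrame. A sketch of a single frame, assuming a broker on localhost and the paho-mqtt package (1.x-style constructor):

    import io
    import wave

    import paho.mqtt.client as mqtt

    SITE_ID = "default"  # must match your Rhasspy siteId

    # Wrap one chunk of 16-bit, 16 kHz mono PCM in a WAV container.
    pcm = b"\x00\x00" * 1024  # stand-in for real microphone samples
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(pcm)

    client = mqtt.Client()
    client.connect("localhost", 1883)
    client.publish(f"hermes/audioServer/{SITE_ID}/audioFrame", buf.getvalue())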


Wake Word

Listens for a keyphrase using pocketsphinx (Documentation)

3-4 syllables recommended

(outside ASR listening)

Listens for one or more wake words with snowboy (Documentation)

Put models in /profiles/it/snowboy
(comma-separated)

See documentation for how to use multiple wake words.


(outside ASR listening)

Listens for a wake word with porcupine (Documentation)

Put models in /profiles/it/porcupine
(comma-separated)

(outside ASR listening)

Listens for a wake word with Mycroft Precise (Documentation)

Put models in /profiles/it/precise

(outside ASR listening)

Listens for a wake word with Raven (Documentation)

  • Template Directory: profiles/it/raven
Keyword table (columns: Enabled, Keyword, Example 1, Example 2, Example 3)

Default Settings
(1-3)

(outside ASR listening)

Calls a custom external program and wakes up when the program exits (Documentation)
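
A toy skeleton of such a program, assuming Rhasspy streams raw 16-bit, 16 kHz mono PCM on standard input (check the documentation for your version) and, as stated above, treats the program's exit as a detection; the energy check below stands in for a real detector:

    #!/usr/bin/env python3
    import struct
    import sys

    THRESHOLD = 4000  # peak sample amplitude; tune for your microphone

    while True:
        chunk = sys.stdin.buffer.read(2048)
        if len(chunk) < 2:
            break  # audio stream closed
        n = len(chunk) // 2
        samples = struct.unpack(f"<{n}h", chunk[: n * 2])
        if max(abs(s) for s in samples) > THRESHOLD:
            sys.exit(0)  # loud enough: treat as a detection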

Expects hotword detections from an external service that supports the Hermes Protocol (Documentation)

(comma-separated)
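
A detection from such a service is just a JSON message on hermes/hotword/<wakewordId>/detected. A minimal sketch, assuming a broker on localhost and the paho-mqtt package (1.x-style constructor):

    import json

    import paho.mqtt.client as mqtt

    payload = {
        "siteId": "default",        # must match your Rhasspy siteId
        "modelId": "my-wake-word",  # hypothetical model name
        "modelVersion": "",
        "modelType": "personal",
        "currentSensitivity": 0.5,
    }

    client = mqtt.Client()
    client.connect("localhost", 1883)
    client.publish("hermes/hotword/my-wake-word/detected", json.dumps(payload))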

Speech to Text

Does speech recognition with CMU's pocketsphinx. Less accurate, but supports many languages (Documentation)

  • Acoustic Model: profiles/it/acoustic_model
  • Dictionary: profiles/it/dictionary.txt
  • Language Model: profiles/it/language_model.txt
(0-1, 1 is most strict)
  • Base Dictionary: profiles/it/base_dictionary.txt
  • Base Language Model: profiles/it/base_language_model.txt
(0-1, 0 to disable)
  • Base Language Model FST: profiles/it/base_language_model.fst

Does speech recognition with Kaldi. Fast and accurate for trained sentences (Documentation)

  • Model Root: /profiles/it/kaldi/model
  • Graph Directory: <model_root>/graph
(re-train required)
(0-1, default: 1e-5)
(0-1, default: 0.5)
(words in longest possible misspoken sentence)
(used to catch misspoken words)
(word used to cancel an intent at any time)
(0-1, default: 1e-2)
(0-1, 1 is most strict)
  • Base Dictionary: /profiles/it/kaldi/base_dictionary.txt
  • Base Graph Directory: profiles/it/base_graph
  • Base Language Model: profiles/it/kaldi/base_language_model.txt
(0-1, 0 to disable)
  • Base Language Model FST: profiles/it/base_language_model.fst

Does speech recognition with Mozilla's DeepSpeech version 0.9. Slower on some hardware, but often more accurate (Documentation)

  • Acoustic Model: profiles/it/deepspeech/model/0.9/output_graph.pbmm
  • Language Model: profiles/it/deepspeech/lm.binary
  • Scorer: profiles/it/deepspeech/scorer
(0-1, 1 is most strict)
  • Base Language Model: profiles/it/deepspeech/model/0.9/base_lm.binary
  • Base Scorer: profiles/it/deepspeech/model/0.9/base.scorer
(0-1, 0 to disable)
  • Base Language Model FST: profiles/it/base_language_model.fst

(comma-separated)

Does speech recognition with Vosk. Fast and accurate for open transcription (Documentation)

  • Model Root: profiles/it/vosk/model
  • Words JSON: profiles/it/vosk/words.json
(0-1, 1 is most strict)

POSTs WAV audio to a remote HTTP endpoint, expecting a plain text transcription back (Documentation)

Example: http://localhost:12101/rhasspy/api/speech-to-text
(0-1, 1 is most strict)
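
The contract from the client side is simple: POST the WAV bytes, read the transcription back as plain text. A sketch with the requests package (URL from the example above; adjust to your endpoint):

    import requests

    URL = "http://localhost:12101/rhasspy/api/speech-to-text"

    with open("command.wav", "rb") as f:
        resp = requests.post(URL, data=f.read(),
                             headers={"Content-Type": "audio/wav"})
    resp.raise_for_status()
    print(resp.text)  # plain-text transcription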

Calls an external program with WAV audio on standard input and expects a text transcription on standard output (Documentation)

(0-1, 1 is most strict)

(comma-separated)
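
A skeleton of such a program; the recognizer itself is a placeholder to swap for a real engine:

    #!/usr/bin/env python3
    import io
    import sys
    import wave

    # Read the complete WAV file supplied on stdin.
    wav_bytes = sys.stdin.buffer.read()
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.readframes(wav.getnframes())

    # Placeholder: run a real ASR engine on `frames` here.
    print("transcription goes here")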

Expects an external service that supports the Hermes Protocol to do speech to text (Documentation)


(comma-separated)
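
The rough shape of such a service, assuming a broker on localhost and the paho-mqtt package (1.x-style callbacks): subscribe to hermes/asr/startListening and publish the result on hermes/asr/textCaptured. This sketch replies immediately with a stand-in transcription; a real service would buffer the session's audio frames and run ASR on them:

    import json

    import paho.mqtt.client as mqtt

    def on_connect(client, userdata, flags, rc):
        client.subscribe("hermes/asr/startListening")

    def on_message(client, userdata, msg):
        req = json.loads(msg.payload)
        client.publish("hermes/asr/textCaptured", json.dumps({
            "text": "hello world",  # stand-in transcription
            "likelihood": 1.0,
            "seconds": 0.0,
            "siteId": req.get("siteId", "default"),
            "sessionId": req.get("sessionId"),
        }))

    client = mqtt.Client()
    client.on_connect = on_connect
    client.on_message = on_message
    client.connect("localhost", 1883)
    client.loop_forever()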

Voice Command Settings
(1-3)
(audio above threshold is speech)
(audio below threshold is speech)
(set from audio input if no value)
(seconds)
(seconds)
(seconds or empty)
(seconds)
(seconds)
(seconds)

Intent Recognition

  • Intent Graph: /profiles/it/intent_graph.pickle.gz
  • Stop Words: /usr/lib/rhasspy/rhasspy-profile/rhasspyprofile/profiles/it/stop_words.txt
(input word that always causes recognition failure)

(comma-separated)

Finds the closest matching intent using the rapidfuzz library (Documentation)

(0-1, 1 is most strict)

(comma-separated)
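
The matching idea in isolation, with a hypothetical set of trained sentences:

    from rapidfuzz import fuzz, process

    sentences = {  # hypothetical training sentences -> intent names
        "turn on the light": "LightOn",
        "turn off the light": "LightOff",
        "what time is it": "GetTime",
    }

    text = "turn the light on"
    match, score, _ = process.extractOne(text, list(sentences),
                                         scorer=fuzz.ratio)
    print(sentences[match], score / 100)  # intent name, 0-1 confidence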

Note: requires an external service (installation)

Example: http://localhost:5005
  • Examples File: profiles/it/intent_examples.md
  • Intent Graph: /profiles/it/intent_graph.pickle.gz

(comma-separated)
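
To check the Rasa server itself end to end, you can call its standard /model/parse endpoint directly (assuming your Rasa version exposes it):

    import requests

    resp = requests.post("http://localhost:5005/model/parse",
                         json={"text": "turn on the light"})
    print(resp.json())  # {"intent": {...}, "entities": [...], ...}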

Uses Snips NLU to flexibly recognize sentences (Documentation)

  • Engine: profiles/it/snips/engine
  • Dataset: profiles/it/snips/dataset.yaml
Supported Languages: de, en, es, fr, it, ja, ko, pt_br, pt_pt, zh

(comma-separated)

POSTs plain text to an HTTP endpoint and receives intent JSON back (Documentation)

Example: http://localhost:12101/rhasspy/api/text-to-intent

(comma-separated)
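
From the client side the contract is plain text in, intent JSON out. A sketch with the requests package (URL from the example above; adjust to your endpoint):

    import requests

    URL = "http://localhost:12101/rhasspy/api/text-to-intent"

    resp = requests.post(URL, data="turn on the light")
    print(resp.json())  # intent JSON, e.g. {"intent": {"name": ...}, ...}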

Calls an external program with text on standard input and expects intent JSON on standard output (Documentation)


(comma-separated)
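
A skeleton of such a program; the matching logic is a placeholder, and the output follows the usual shape of Rhasspy's intent JSON:

    #!/usr/bin/env python3
    import json
    import sys

    text = sys.stdin.read().strip()

    # Placeholder matching: route anything mentioning "light" to one intent.
    name = "LightOn" if "light" in text else ""

    print(json.dumps({
        "text": text,
        "intent": {"name": name, "confidence": 1.0 if name else 0.0},
        "slots": {},
    }))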

Expects an external service that supports the Hermes Protocol to recognize intents (Documentation)

Text to Speech

Uses eSpeak to speak sentences. Sounds robotic, but supports many languages and locales (Documentation).

Uses SVOX's picoTTS to speak sentences (Documentation)

Uses an improved fork of picoTTS to speak sentences (Documentation)

Uses FestVox's flite to speak sentences (Documentation)

Uses a remote MaryTTS web server to speak sentences (Documentation)

Note: requires an external service (Docker image available)

Uses Google's WaveNet to speak sentences (Documentation)

Note: requires an internet connection and a Google account

  • Credentials: profiles/it/tts/googlewavenet/credentials.json
  • WAV cache: profiles/it/tts/googlewavenet/cache

Uses a remote OpenTTS web server to speak sentences (Documentation)

Note: requires an external service (Docker image available)

Uses Larynx to speak sentences (Documentation)

POSTs text to a remote HTTP endpoint and plays received WAV audio (Documentation)

Example: http://localhost:12101/rhasspy/api/text-to-speech
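
The client-side contract: POST the sentence, get WAV bytes back. A sketch with the requests package (URL from the example above; adjust to your endpoint):

    import requests

    URL = "http://localhost:12101/rhasspy/api/text-to-speech"

    resp = requests.post(URL, data="hello from rhasspy")
    resp.raise_for_status()
    with open("speech.wav", "wb") as f:
        f.write(resp.content)  # WAV audio returned by the server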

Calls an external program with text on standard input and plays WAV audio from standard output (Documentation)
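
A skeleton of such a program, shelling out to eSpeak purely as a stand-in synthesizer (its --stdout flag emits WAV; assumes espeak is installed):

    #!/usr/bin/env python3
    import subprocess
    import sys

    text = sys.stdin.read()

    # espeak --stdout writes a WAV file to its standard output.
    wav = subprocess.run(["espeak", "--stdout", text],
                         capture_output=True, check=True).stdout
    sys.stdout.buffer.write(wav)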


Expects an external service that supports the Hermes Protocol to speak sentences (Documentation)

(comma-separated)

Audio Output

Plays WAV files on the local device by calling the aplay command (Documentation)

POSTs WAV audio to a remote HTTP endpoint (Documentation)

Calls an external program with WAV audio on standard input (Documentation)
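
A skeleton play program: the WAV arrives on standard input, and anything that can play it will do. Here it is handed straight to aplay:

    #!/usr/bin/env python3
    import subprocess
    import sys

    subprocess.run(["aplay", "-q"], input=sys.stdin.buffer.read(), check=True)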

Expects an external service that supports the Hermes Protocol to play audio (Documentation)

Dialogue Management

Makes Rhasspy behave like a typical voice assistant, automatically listening for voice commands after hotword detections and recognizing intents from them (Documentation)

(comma-separated)

Expects a custom external service that supports the Hermes Protocol to manage the dialogue (acting on hotword detections, starting ASR sessions, etc.) (Documentation)

Intent Handling

Sends intents or events directly to Home Assistant or Hass.io (Documentation)

Address of your Home Assistant server
Long-lived access token (automatically filled in Hass.io)
Home Assistant password (deprecated)

Events will be named rhasspy_<IntentName> after the recognized intent

Requires the intent component and intent scripts in your configuration.yaml
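
What the event mode boils down to, as a single call against Home Assistant's REST API (URL, token, intent name, and slot values are all placeholders):

    import requests

    HASS_URL = "http://homeassistant.local:8123"
    TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"

    requests.post(
        f"{HASS_URL}/api/events/rhasspy_ChangeLightState",  # rhasspy_<IntentName>
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"name": "bedroom", "state": "on"},  # slot values become event data
    )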

POSTs intent JSON to a remote HTTP endpoint (Documentation)

Calls a custom external program with intent JSON on standard input (Documentation)

(comma-separated)
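
A skeleton handler: the intent JSON arrives on stdin. Printing JSON with a speech.text field is one way to hand a spoken response back (an assumption based on Rhasspy's intent JSON conventions; verify against the documentation):

    #!/usr/bin/env python3
    import datetime
    import json
    import sys

    intent = json.load(sys.stdin)
    name = intent.get("intent", {}).get("name", "")

    if name == "GetTime":  # hypothetical intent name
        reply = datetime.datetime.now().strftime("It is %H:%M")
        print(json.dumps({"speech": {"text": reply}}))
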
Sounds

WAV files to play when Rhasspy wakes up and is finished recording a voice command.

Note: dialogue management must be set to "Rhasspy" to hear these sounds

Use the ${RHASSPY_PROFILE_DIR} environment variable to refer to your profile directory.

(empty to disable)
(empty to disable)
(empty to disable)
Certificates

Files needed for using HTTPS with your Rhasspy server or Home Assistant

Note: all paths should be absolute

Use the ${RHASSPY_PROFILE_DIR} environment variable to refer to your profile directory.

Private key file (optional)