Spec proposal: OVOS-AUDIO-IN-1 — Audio Input Service #52

Proposal for OVOS-AUDIO-IN-1, the audio input service specification.

## Problem

The audio input service — the component that acquires audio, runs pre-STT processing, transcribes to text, and injects the result into the utterance lifecycle — has no normative contract. How it is implemented (microphone, file, remote stream, wake word, VAD) is entirely deployer-defined and should stay that way. What it must produce is not specified anywhere.

## Proposal

Minimal spec with three normative obligations:

1. **An STT mechanism MUST exist** — deployer-defined; the choice of engine, model, API, or local process is out of scope
2. **Audio-transformer chain MUST run before STT** (TRANSFORM-1 §3.1) — canonical use cases: language identification (writing `session.detected_lang`), denoising/normalisation, speaker recognition (result written into `Message.context`)
3. **MUST emit `ovos.utterance.handle`** with `data.utterances` (array of transcription candidates) and `data.lang` (BCP-47 output language)

Everything else (capture method, STT engine selection, post-STT transformer chains) is a deployer concern and explicitly out of scope.
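The three obligations can be sketched as a single function. This is a minimal illustration, not the normative interface: the `type`/`data`/`context` message envelope, the transformer signature, the `speaker` context key, and every function name below are assumptions made for the example. Only the `ovos.utterance.handle` message type and the `data.utterances` and `data.lang` fields come from this proposal.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical signatures: a transformer takes (audio, context) and returns
# the possibly modified pair; an STT callable takes (audio, lang) and returns
# a list of transcription candidates.
AudioTransformer = Callable[[bytes, Dict], Tuple[bytes, Dict]]
SttEngine = Callable[[bytes, str], List[str]]


def build_utterance_message(
    audio: bytes,
    transformers: List[AudioTransformer],
    stt: SttEngine,
    lang: str,
    context: Optional[Dict] = None,
) -> Dict:
    """Run the audio-transformer chain, then STT, then build the message."""
    ctx = dict(context or {})
    # Obligation 2: the audio-transformer chain runs before STT.
    for transform in transformers:
        audio, ctx = transform(audio, ctx)
    # Obligation 1: some STT mechanism exists (any callable will do here).
    utterances = stt(audio, lang)
    # Obligation 3: emit ovos.utterance.handle with utterances and lang.
    return {
        "type": "ovos.utterance.handle",
        "data": {"utterances": utterances, "lang": lang},
        "context": ctx,
    }


def tag_speaker(audio: bytes, context: Dict) -> Tuple[bytes, Dict]:
    # Stand-in for speaker recognition; the "speaker" key is hypothetical.
    return audio, {**context, "speaker": "unknown"}


def fake_stt(audio: bytes, lang: str) -> List[str]:
    # Stand-in for a real engine; returns fixed candidates.
    return ["turn on the lights", "turn on the light"]


message = build_utterance_message(b"\x00\x01", [tag_speaker], fake_stt, "en-US")
```

How the audio was captured never appears in the function, which matches the intent that capture stays deployer-defined.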

## Language fields

- **Language selection order** (inputs to STT): `session.detected_lang` → `session.request_lang` → `session.lang`
- **`data.lang`** — the transcript's output language (what the text is in)
- **`session.stt_lang`** (SHOULD write) — the language the STT model was configured to assume. It matches `data.lang` in normal transcription and diverges in speech translation, where `session.stt_lang` is the audio's spoken language and `data.lang` is the translated output language
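The selection order above can be read as "first field that is set wins". The sketch below assumes that reading and models the session as a plain dict keyed by the field names without the `session.` prefix; both the dict representation and the `select_stt_lang` helper are hypothetical.

```python
from typing import Dict, Optional

# Priority order taken from the list above, highest first.
LANG_PRIORITY = ("detected_lang", "request_lang", "lang")


def select_stt_lang(session: Dict[str, str]) -> Optional[str]:
    """Return the first non-empty language field in priority order."""
    for key in LANG_PRIORITY:
        value = session.get(key)
        if value:
            return value
    return None


# Normal transcription: stt_lang and data.lang agree.
session = {"request_lang": "en-US", "lang": "en-US"}
session["stt_lang"] = select_stt_lang(session)
data = {"utterances": ["hello"], "lang": session["stt_lang"]}

# Speech translation: the audio is Portuguese, the transcript is English,
# so stt_lang and data.lang diverge.
translated_session = {"detected_lang": "pt-PT", "lang": "en-US"}
translated_session["stt_lang"] = select_stt_lang(translated_session)
translated_data = {"utterances": ["hello"], "lang": "en-US"}
```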

## PR

PR #51
