pacing.core.transcription_interfaces
Transcription interfaces for the PACING platform.
These interfaces define how audio is converted to text. Implementations can use various speech-to-text services (Deepgram, Whisper, Google Speech, etc.) or mock transcribers for testing.
- class pacing.core.transcription_interfaces.ITranscriber[source]
Abstract interface for speech-to-text transcription.
This interface allows the system to support multiple transcription backends without changing the core logic. The transcriber is responsible for:
Converting audio chunks to text
Providing confidence scores for transcriptions
Handling partial (streaming) transcriptions
Speaker diarization (if supported)
Design Philosophy:
- Transcribers should be stateless or manage their own state
- They should handle their own buffering and context management
- Confidence scores must be normalized to [0.0, 1.0]
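The interface contract above can be sketched as follows. This is a minimal, hypothetical reconstruction for illustration: the field layout of `TranscriptionResult` (here `text`, `confidence_score`, `is_final`, `speaker_id`) and the default body of `get_model_info()` are assumptions based on the descriptions in this document, not the actual source.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

@dataclass
class TranscriptionResult:
    """Hypothetical minimal shape; the real class may carry more fields."""
    text: str
    confidence_score: float  # must be normalized to [0.0, 1.0]
    is_final: bool = True
    speaker_id: Optional[str] = None  # populated only with diarization support

class ITranscriber(ABC):
    """Abstract speech-to-text backend, per the design philosophy above."""

    @abstractmethod
    async def transcribe_chunk(self, audio_chunk, sample_rate: int,
                               is_final: bool = False) -> TranscriptionResult:
        """Convert one chunk of audio samples to text."""

    @abstractmethod
    def supports_speaker_diarization(self) -> bool:
        """Whether speaker_id will be populated in results."""

    def get_model_info(self) -> dict:
        # Non-abstract: subclasses may override with real model metadata.
        return {"name": "unknown", "version": "unknown", "language": "unknown"}
```

Because `transcribe_chunk` and `supports_speaker_diarization` are abstract, concrete backends (Deepgram, Whisper, a test mock) must implement both, while `get_model_info()` has a usable default.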
- get_model_info() → dict[source]
Get information about the transcription model.
- Returns:
Model metadata (name, version, language, etc.)
- Return type:
dict
- abstract supports_speaker_diarization() → bool[source]
Check if this transcriber supports speaker diarization.
- Returns:
True if speaker_id will be populated in TranscriptionResult
- Return type:
bool
- abstract async transcribe_chunk(audio_chunk: ndarray, sample_rate: int, is_final: bool = False) → TranscriptionResult[source]
Transcribe a single audio chunk.
- Parameters:
audio_chunk – Audio samples (typically float32 or int16)
sample_rate – Sample rate in Hz
is_final – Whether this is the final chunk in a sequence
- Returns:
The transcription with confidence score
- Return type:
TranscriptionResult
Notes
For streaming transcription, is_final=False produces partial results
Implementations should handle silence gracefully
Empty audio should return empty text with high confidence
- async transcribe_stream(audio_stream: AsyncIterator[ndarray], sample_rate: int) → AsyncIterator[TranscriptionResult][source]
Transcribe a stream of audio chunks.
This is a convenience method that processes an audio stream and yields transcription results. The default implementation calls transcribe_chunk() for each audio chunk.
- Parameters:
audio_stream – Async iterator of audio chunks
sample_rate – Sample rate in Hz
- Yields:
TranscriptionResult – Transcriptions as they become available
Example
async for result in transcriber.transcribe_stream(audio_stream, 16000):
    print(f"{result.text} (confidence: {result.confidence_score})")