aix

AIX: Artificial Intelligence eXtensions

A clean, pythonic facade for common AI operations that abstracts away provider-specific details and complexities.

Quick Start:
>>> from aix import chat, embeddings, prompt_func, models

# Simple chat
>>> chat("What is 2+2?")  # doctest: +SKIP
'The answer is 4.'

# Create prompt-based functions
>>> translate = prompt_func("Translate to French: {text}")
>>> translate(text="Hello world")  # doctest: +SKIP
'Bonjour le monde'

# Get embeddings
>>> vecs = list(embeddings(["hello", "world"]))  # doctest: +SKIP
>>> len(vecs)  # doctest: +SKIP
2

# Discover models
>>> models.discover()  # doctest: +SKIP
>>> list(models)[:5]  # doctest: +SKIP
['openai/gpt-4o', 'openai/gpt-4o-mini', ...]

Main Features:
  • chat(): Simple chat interface across providers

  • embeddings(): Vector embeddings for text

  • prompt_func(): Create functions from prompt templates

  • models: Model discovery and selection

  • generate_image(): Text-to-image generation

  • text_to_speech(), transcribe(): Audio operations

  • generate_video(): Text-to-video generation (provider-dependent)

  • Batch operations for efficiency

  • Clean, i2mint-style Mapping interfaces

Backends:
  • Uses LiteLLM for provider interactions

  • Supports OpenAI, Anthropic, Google, and other providers (100+ models)

  • OpenRouter integration for multi-provider access

For detailed documentation, see: https://github.com/thorwhalen/aix

class aix.AixConfig(chat: ChatConfig = <factory>, embeddings: EmbeddingConfig = <factory>, image: ImageConfig = <factory>, audio: AudioConfig = <factory>, video: VideoConfig = <factory>, vision: VisionConfig = <factory>, aliases: Mapping[str, str] = <factory>)

Top-level AIX configuration (single source of truth for defaults).

class aix.BatchProcessor(*, batch_size: int = None, max_workers: int = None, show_progress: bool = True)

Stateful batch processor for managing long-running operations.

Provides a higher-level interface for batch processing with progress tracking, error handling, and result caching.

Examples

>>> processor = BatchProcessor(show_progress=True)
>>> results = processor.process_chats(prompts)
>>> processor.save_results("output.json")

clear()

Clear stored results and errors.

process_chats(prompts: Iterable[str | list[dict]], **kwargs) -> list[str]

Process chat prompts and store results.

Parameters:
  • prompts – Prompts to process

  • **kwargs – Additional parameters for batch_chat()

Returns:

List of results

process_embeddings(texts: Iterable[str], **kwargs) -> list[Sequence[float]]

Process embeddings and store results.

Parameters:
  • texts – Texts to embed

  • **kwargs – Additional parameters for batch_embeddings()

Returns:

List of embedding vectors

save_results(filepath: str)

Save results to file.

Parameters:

filepath – Path to save results (JSON)

class aix.ChatSession(system_prompt: str = None, *, model: str = None, **chat_kwargs)

Stateful chat session that maintains conversation history.

This class provides a convenient way to have multi-turn conversations without manually managing message history.

Examples

>>> session = ChatSession()
>>> response = session.send("My name is Alice")
>>> session.send("What's my name?")
'Your name is Alice.'
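The history bookkeeping behind a session like this can be sketched as follows. This is a minimal illustration under stated assumptions, not the aix implementation: SketchSession and echo_backend are hypothetical names, and the backend is a stand-in for a real chat call.

```python
from typing import Callable, Optional

Message = dict[str, str]


def echo_backend(messages: list[Message]) -> str:
    """Hypothetical stand-in for a real chat call: echoes the last message."""
    return f"You said: {messages[-1]['content']}"


class SketchSession:
    """Hypothetical history-keeping session (not aix.ChatSession itself)."""

    def __init__(self, system_prompt: Optional[str] = None,
                 backend: Callable[[list[Message]], str] = echo_backend):
        self.backend = backend
        self.history: list[Message] = []
        if system_prompt is not None:
            self.history.append({"role": "system", "content": system_prompt})

    def send(self, message: str) -> str:
        # Record the user turn, call the backend with the full history,
        # then record the assistant turn so the next call sees both.
        self.history.append({"role": "user", "content": message})
        reply = self.backend(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def clear_history(self, keep_system: bool = True) -> None:
        # Keep only system messages when asked to; otherwise drop everything.
        self.history = [m for m in self.history
                        if keep_system and m["role"] == "system"]


session = SketchSession("You are a helpful math tutor")
session.send("What is 2+2?")
session.send("And if I add 3 to that?")
turns = len(session.history)  # 1 system + 2 user + 2 assistant messages
session.clear_history()       # only the system message remains
```

Passing the whole history on every call is what lets a stateless chat endpoint answer follow-up questions.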

clear_history(keep_system: bool = True)

Clear conversation history.

Parameters:

keep_system – If True, preserve system message (if any)

send(message: str, **kwargs) -> str

Send a message and get a response.

Parameters:
  • message – User message to send

  • **kwargs – Override chat parameters for this message

Returns:

Assistant’s response

class aix.EmbeddingCache(model: str = None, **embedding_kwargs)

Cache for embeddings to avoid redundant API calls.

Useful when you need to embed the same texts multiple times.

Examples

>>> cache = EmbeddingCache()
>>> vec1 = cache.embed("hello")  # API call
>>> vec2 = cache.embed("hello")  # From cache
>>> vec1 == vec2
True
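The caching idea can be sketched with a plain dictionary keyed by text. This is an illustration only: SketchEmbeddingCache and fake_embed are hypothetical names, and fake_embed stands in for a real embedding API call.

```python
from typing import Callable, Iterable, Sequence


def fake_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding API call."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class SketchEmbeddingCache:
    """Hypothetical dict-backed cache (not aix.EmbeddingCache itself)."""

    def __init__(self, embed_func: Callable[[str], Sequence[float]] = fake_embed):
        self.embed_func = embed_func
        self._cache: dict[str, Sequence[float]] = {}
        self.api_calls = 0  # counts how often the backend was really called

    def embed(self, text: str, force_refresh: bool = False) -> Sequence[float]:
        # Call the backend only on a cache miss or an explicit refresh.
        if force_refresh or text not in self._cache:
            self.api_calls += 1
            self._cache[text] = self.embed_func(text)
        return self._cache[text]

    def embed_batch(self, texts: Iterable[str],
                    force_refresh: bool = False) -> list[Sequence[float]]:
        # Results come back in input order; repeated texts hit the cache.
        return [self.embed(t, force_refresh) for t in texts]

    def clear(self) -> None:
        self._cache.clear()


cache = SketchEmbeddingCache()
vec1 = cache.embed("hello")   # backend call
vec2 = cache.embed("hello")   # served from the cache
batch = cache.embed_batch(["hello", "world", "hello"])  # one more backend call
```

A production cache would probably send all uncached texts in a single request rather than one call per text; the sketch keeps the per-text path for clarity.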

clear()

Clear the cache.

embed(text: str, force_refresh: bool = False) -> Sequence[float]

Get embedding for text, using cache if available.

Parameters:
  • text – Text to embed

  • force_refresh – If True, bypass cache and get fresh embedding

Returns:

Vector embedding

embed_batch(texts: Iterable[str], force_refresh: bool = False) -> list[Sequence[float]]

Get embeddings for multiple texts, using cache when possible.

Parameters:
  • texts – Texts to embed

  • force_refresh – If True, bypass cache

Returns:

List of embeddings in same order as input texts

class aix.GeneratedAudio(data: bytes, model: str = None, text: str = None, voice: str = None, format: str = 'mp3')

Wrapper for generated audio.

Provides convenient access to audio data and saving.

Examples

>>> audio = GeneratedAudio(data=b'...', model="tts-1")
>>> audio.save("output.mp3")
>>> data = audio.as_bytes()

as_bytes() -> bytes

Get audio as bytes.

Returns:

Audio data as bytes

play()

Play the audio.

Requires a system audio player or library like pygame/pyaudio.

Examples

>>> audio.play()

save(path: str | Path)

Save audio to file.

Parameters:

path – Output file path

Examples

>>> audio.save("output.mp3")
>>> audio.save("speech.wav")

class aix.GeneratedImage(url: str = None, b64_json: str = None, model: str = None, prompt: str = None, revised_prompt: str = None)

Wrapper for generated images.

Provides convenient access to image data in various formats.

Examples

>>> img = GeneratedImage(url="https://...", model="dall-e-3")
>>> img.save("output.png")
>>> img.show()
>>> data = img.as_bytes()

as_bytes() -> bytes

Get image as bytes.

Returns:

Image data as bytes

as_pil_image()

Get image as PIL Image object.

Returns:

PIL.Image object

Raises:

ImportError – If PIL is not installed

save(path: str | Path, format: str = None)

Save image to file.

Parameters:
  • path – Output file path

  • format – Image format (e.g., ‘PNG’, ‘JPEG’). Auto-detected from path if None.

Examples

>>> img.save("output.png")
>>> img.save("output.jpg", format="JPEG")

show()

Display the image.

Requires PIL (Pillow) to be installed.

Examples

>>> img.show()

class aix.GeneratedVideo(url: str = None, data: bytes = None, model: str = None, prompt: str = None, duration: float = None, resolution: str = None, status: str = 'completed', task_id: str = None)

Wrapper for generated videos.

Provides convenient access to video data and metadata.

Examples

>>> video = GeneratedVideo(url="https://...", duration=5.0)
>>> video.save("output.mp4")
>>> print(video.duration)
5.0

as_bytes() -> bytes

Get video as bytes.

Returns:

Video data as bytes

save(path: str | Path)

Save video to file.

Parameters:

path – Output file path

Examples

>>> video.save("output.mp4")

wait_until_complete(max_wait: int = 300, poll_interval: int = 5)

Wait for video generation to complete (for async operations).

Parameters:
  • max_wait – Maximum time to wait in seconds

  • poll_interval – Time between status checks in seconds

Raises:
  • TimeoutError – If generation doesn’t complete within max_wait

  • RuntimeError – If generation fails
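A polling loop with this contract can be sketched as below. It is an assumption-laden illustration, not the aix implementation: wait_until and get_status are hypothetical names, and the "failed" status string is assumed (only "completed" appears in the signature above).

```python
import time
from typing import Callable


def wait_until(get_status: Callable[[], str], *, max_wait: float = 300,
               poll_interval: float = 5) -> None:
    """Poll get_status() until it reports 'completed'.

    The 'failed' status string is an assumption made for this sketch.
    """
    deadline = time.monotonic() + max_wait
    while True:
        status = get_status()
        if status == "completed":
            return
        if status == "failed":
            raise RuntimeError("generation failed")
        # Give up if the next poll would land after the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"not complete within {max_wait} seconds")
        time.sleep(poll_interval)


# Simulated task that completes on the third status check.
statuses = iter(["pending", "processing", "completed"])
wait_until(lambda: next(statuses), max_wait=1, poll_interval=0.01)
```

Using time.monotonic() rather than time.time() keeps the deadline unaffected by system clock adjustments.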

class aix.ImageComparison(match: bool, confidence: float, explanation: str, aspects: tuple[aix.vision.RubricVerdict, ...] = <factory>, model: str | None = None)

Structured result of comparing a candidate image to reference(s).

Behaves like an ordered, read-only mapping of aspect -> RubricVerdict (comparison["identity"], in, iteration, len) so per-aspect lookups read naturally, while also carrying the overall verdict.

match

Overall pass/fail across the whole rubric.

Type:

bool

confidence

Overall confidence in [0.0, 1.0].

Type:

float

explanation

A short overall summary of the comparison.

Type:

str

aspects

The per-aspect verdicts, one RubricVerdict per rubric item, in rubric order.

Type:

tuple[aix.vision.RubricVerdict, …]

model

The vision model id that produced the verdict.

Type:

str | None

get(aspect: str, default: RubricVerdict | None = None) -> RubricVerdict | None

Return the verdict for aspect, or default if absent.
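One way to get this "result object that is also a read-only mapping" behavior is to combine a frozen dataclass with collections.abc.Mapping. The sketch below is an assumption about how such a type could be built, not the aix source; Verdict and Comparison are hypothetical stand-ins for RubricVerdict and ImageComparison.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Hypothetical stand-in for RubricVerdict."""
    aspect: str
    match: bool
    confidence: float
    note: str = ""


@dataclass(frozen=True)
class Comparison(Mapping):
    """Hypothetical stand-in for ImageComparison: overall verdict + mapping."""
    match: bool
    confidence: float
    explanation: str
    aspects: tuple = ()

    def __getitem__(self, aspect: str) -> Verdict:
        for verdict in self.aspects:
            if verdict.aspect == aspect:
                return verdict
        raise KeyError(aspect)

    def __iter__(self):
        # Iterate aspect names in rubric order.
        return (verdict.aspect for verdict in self.aspects)

    def __len__(self) -> int:
        return len(self.aspects)


result = Comparison(
    match=False,
    confidence=0.7,
    explanation="Costume drifted from the reference.",
    aspects=(
        Verdict("identity", True, 0.9),
        Verdict("costume", False, 0.8, "different jacket"),
    ),
)
# Mapping supplies get(), items(), and `in` from the three methods above.
failed = [aspect for aspect, verdict in result.items() if not verdict.match]
```

Only __getitem__, __iter__, and __len__ are written by hand; the Mapping mixin derives the rest, and frozen=True keeps the object read-only.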

exception aix.MissingCredentialError(model_or_provider: str | None, *, provider: str | None = None, env_names: list[str] | None = None)

Raised when a required API key cannot be resolved.

The message names which key is missing, how to set it, and (when known) where to obtain one. Key values are never included.

class aix.ModelStore(storage_path: str | Path = None, auto_discover: bool = False)

User-friendly interface for model discovery and selection.

Provides a Mapping interface over the ModelManager with convenient access patterns and integration with chat/embeddings functions.

Examples

>>> models = ModelStore()
>>> models.discover()  # Fetch available models
>>> # List all models
>>> list(models)[:5]
['openai/gpt-4o', 'openai/gpt-4o-mini', ...]
>>> # Get model info
>>> info = models['openai/gpt-4o']
>>> info.provider
'openai'
>>> # Filter models
>>> openai_models = models.filter(provider='openai')
>>> local_models = models.filter(is_local=True)
>>> # Use with chat
>>> from aix.chat import chat
>>> model = models['gpt-4o-mini']
>>> chat("Hello", model=model.id)
'Hello! How can I help you?'

by_provider(provider: str) -> list[Model]

Get all models from a specific provider.

Parameters:

provider – Provider name

Returns:

List of models

Examples

>>> models = ModelStore()
>>> openai_models = models.by_provider('openai')

by_task(task: str) -> list[Model]

Get models suitable for a specific task.

Parameters:

task – Task name (‘chat’, ‘embedding’, ‘image’, etc.)

Returns:

List of suitable models

Examples

>>> models = ModelStore()
>>> chat_models = models.by_task('chat')

discover(source: str = 'openrouter', auto_register: bool = True, verbose: bool = False) -> list[Model]

Discover models from a source.

Parameters:
  • source – Source name (‘openrouter’, ‘ollama’, etc.)

  • auto_register – If True, add discovered models to registry

  • verbose – If True, print progress information

Returns:

List of discovered models

Examples

>>> models = ModelStore()
>>> discovered = models.discover('openrouter')
>>> len(discovered) > 100
True

filter(*, provider: str = None, is_local: bool = None, min_context_size: int = None, max_context_size: int = None, has_capabilities: list[str] = None, tags: list[str] = None, custom_filter: callable = None) -> list[Model]

Filter models by criteria.

Parameters:
  • provider – Filter by provider name (‘openai’, ‘anthropic’, etc.)

  • is_local – Filter by local vs remote

  • min_context_size – Minimum context window size

  • max_context_size – Maximum context window size

  • has_capabilities – Required capabilities

  • tags – Required tags

  • custom_filter – Custom filter function: f(Model) -> bool

Returns:

List of models matching criteria

Examples

>>> models = ModelStore()
>>> models.discover(verbose=False)
>>> # Get OpenAI models
>>> openai = models.filter(provider='openai')
>>> # Get local models
>>> local = models.filter(is_local=True)
>>> # Get cheap models
>>> cheap = models.filter(
...     custom_filter=lambda m: m.cost_per_token.get('input', 0) < 0.001
... )
>>> # Combine criteria
>>> good_models = models.filter(
...     provider='openai',
...     min_context_size=8000,
...     custom_filter=lambda m: 'gpt-4' in m.id
... )

get_connector_metadata(model_id: str, connector: str) -> dict[str, Any]

Get connector-specific metadata for a model.

Parameters:
  • model_id – Model identifier

  • connector – Connector name (‘openai’, ‘openrouter’, etc.)

Returns:

Dict with connector-specific parameters

Examples

>>> models = ModelStore()
>>> meta = models.get_connector_metadata(
...     'openai/gpt-4o',
...     'openai'
... )
>>> meta['model']
'gpt-4o'

get_info(model_id: str) -> Model

Get detailed information about a model.

Parameters:

model_id – Model identifier

Returns:

Model object with full metadata

Examples

>>> models = ModelStore()
>>> models.discover(verbose=False)
>>> info = models.get_info('openai/gpt-4o')
>>> info.context_size
128000

recommend(*, task: str = 'chat', max_cost_per_mtok: float = None, min_context_size: int = None, prefer_local: bool = False) -> list[Model]

Get recommended models based on requirements.

Parameters:
  • task – Primary task (‘chat’, ‘embedding’, etc.)

  • max_cost_per_mtok – Maximum cost per million tokens

  • min_context_size – Minimum required context size

  • prefer_local – Prefer local models if available

Returns:

List of recommended models, sorted by suitability

Examples

>>> models = ModelStore()
>>> recommendations = models.recommend(
...     task='chat',
...     max_cost_per_mtok=5.0,
...     min_context_size=16000
... )

search(query: str) -> list[Model]

Search models by text query.

Searches in model ID, provider, and tags.

Parameters:

query – Search query

Returns:

List of matching models

Examples

>>> models = ModelStore()
>>> models.discover(verbose=False)
>>> results = models.search('gpt-4')
>>> len(results) > 0
True

class aix.PromptFuncs(model: str = None, **default_kwargs)

Collection of prompt-based functions.

Provides a namespace for organizing related prompt functions with attribute-based access.

Examples

>>> funcs = PromptFuncs()
>>> funcs.add('summarize', "Summarize: {text}")
>>> funcs.add('translate', "Translate {text} to {language}")
>>> funcs.summarize(text="Long article...")
'Summary...'
>>> funcs.translate(text="Hello", language="Spanish")
'Hola'
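How a prompt template can become a callable with named arguments is sketched below using only the standard library. The names template_fields and make_prompt_func are hypothetical, and the backend argument stands in for a real chat call; this is not how aix necessarily implements prompt_func.

```python
from string import Formatter
from types import SimpleNamespace
from typing import Callable


def template_fields(template: str) -> list[str]:
    """Return the named placeholders of a str.format-style template."""
    names = (name for _, name, _, _ in Formatter().parse(template) if name)
    return list(dict.fromkeys(names))  # de-duplicate, keep first-seen order


def make_prompt_func(template: str, backend: Callable[[str], str]):
    """Build a function whose keyword arguments fill the template."""
    fields = template_fields(template)

    def func(**kwargs) -> str:
        missing = [f for f in fields if f not in kwargs]
        if missing:
            raise TypeError(f"missing arguments: {missing}")
        return backend(template.format(**kwargs))

    return func


# A namespace gives attribute-style access, as in the example above.
funcs = SimpleNamespace()
funcs.summarize = make_prompt_func("Summarize: {text}", backend=str.upper)
funcs.translate = make_prompt_func("Translate {text} to {language}", backend=str.upper)
summary = funcs.summarize(text="long article")
```

Here str.upper plays the role of the model, so the filled-in prompt is visible in the return value.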

add(name: str, template: str, *, output_schema: dict | type = None, **kwargs) -> None

Add a function to the collection.

Parameters:
  • name – Function name (will be accessible as attribute)

  • template – Prompt template

  • output_schema – Optional schema for structured output

  • **kwargs – Additional parameters for prompt_func

keys()

Get all function names.

class aix.RubricVerdict(aspect: str, match: bool, confidence: float, note: str = '')

A vision model’s verdict on a single rubric aspect.

aspect

The rubric item this verdict is about (e.g. "identity").

Type:

str

match

Whether the candidate matches the reference for this aspect.

Type:

bool

confidence

The model’s self-reported confidence in [0.0, 1.0].

Type:

float

note

A short free-text explanation of the verdict.

Type:

str

class aix.TranscriptionResult(text: str, language: str = None, duration: float = None, segments: list = None, model: str = None)

Result of audio transcription.

Contains the transcribed text and optional metadata like segments and timestamps.

Examples

>>> result = TranscriptionResult(text="Hello world")
>>> print(result.text)
Hello world

aix.animate_image_to_video(image_path: str | Path, prompt: str = None, *, model: str = None, duration: float = 3.0, motion_strength: float = 0.5, **kwargs) -> GeneratedVideo

Animate a static image into a video.

Parameters:
  • image_path – Path to the source image

  • prompt – Optional text prompt to guide the animation

  • model – Video generation model to use

  • duration – Animation duration in seconds

  • motion_strength – Strength of motion (0.0 to 1.0)

  • **kwargs – Additional provider-specific parameters

Returns:

GeneratedVideo object

Examples

>>> from aix.video import animate_image
>>> video = animate_image(
...     "landscape.jpg",
...     prompt="Gentle camera pan across the scene",
...     duration=4
... )
>>> video.save("animated_landscape.mp4")

aix.ask(question: str, model: str = None, **kwargs) -> str

Ask a single question and get an answer.

This is a convenience wrapper around chat() for simple Q&A.

Parameters:
  • question – The question to ask

  • model – Model to use

  • **kwargs – Additional parameters for chat()

Returns:

The answer as a string

Examples

>>> from aix.chat import ask
>>> ask("What is the capital of France?")
'The capital of France is Paris.'

aix.batch_chat(prompts: Iterable[str | list[dict]], *, model: str = None, batch_size: int = None, max_workers: int = None, show_progress: bool = False, **chat_kwargs) -> Iterable[str]

Process multiple chat prompts in batches.

This function processes multiple prompts efficiently by:
  1. Chunking prompts into batches
  2. Processing batches in parallel where possible
  3. Yielding results in the same order as input
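The chunk, parallelize, and yield-in-order pattern can be sketched with the standard library. This is an illustration, not the aix implementation: chunked, ordered_batches, and fake_chat are hypothetical names, and a thread pool is only one possible way to run the calls in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator


def chunked(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items; works on generators."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def ordered_batches(items: Iterable, func: Callable, *,
                    batch_size: int = 10, max_workers: int = 4) -> Iterator:
    """Apply func to items chunk by chunk, yielding results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for chunk in chunked(items, batch_size):
            # Executor.map returns results in input order, even when
            # individual calls finish out of order.
            yield from pool.map(func, chunk)


def fake_chat(prompt: str) -> str:
    """Hypothetical stand-in for a chat call."""
    return prompt.upper()


prompts = (f"prompt {i}" for i in range(7))  # a generator, consumed lazily
results = list(ordered_batches(prompts, fake_chat, batch_size=3))
```

Because the input is consumed one chunk at a time, a large or unbounded generator of prompts never has to be held in memory all at once.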

Parameters:
  • prompts – Iterable of prompts (strings or message lists)

  • model – Model to use for all prompts

  • batch_size – Number of prompts to process in each batch

  • max_workers – Maximum number of parallel workers

  • show_progress – If True, print progress information

  • **chat_kwargs – Additional parameters passed to chat()

Yields:

Responses in the same order as input prompts

Examples

>>> from aix.batches import batch_chat
>>> prompts = [
...     "What is 2+2?",
...     "What is 3+3?",
...     "What is 5+5?"
... ]
>>> results = list(batch_chat(prompts))
>>> len(results)
3
>>> # With specific model
>>> results = list(batch_chat(
...     prompts,
...     model="gpt-4o-mini",
...     batch_size=5
... ))
>>> # Process large dataset
>>> def generate_prompts():
...     for i in range(100):
...         yield f"Explain concept {i}"
>>> results = batch_chat(
...     generate_prompts(),
...     show_progress=True
... )
>>> for i, result in enumerate(results):
...     print(f"Result {i}: {result[:50]}...")

aix.batch_embeddings(segments: Iterable[str], *, model: str = None, batch_size: int = None, show_progress: bool = False, **embedding_kwargs) -> Iterable[Sequence[float]]

Generate embeddings for multiple texts in batches.

For efficiency, this function processes embeddings in chunks, as most embedding APIs can handle multiple texts per request.

Parameters:
  • segments – Iterable of text strings to embed

  • model – Embedding model to use

  • batch_size – Number of texts per batch

  • show_progress – If True, print progress information

  • **embedding_kwargs – Additional parameters for embeddings()

Yields:

Vector embeddings in the same order as input

Examples

>>> from aix.batches import batch_embeddings
>>> texts = ["hello", "world", "foo", "bar"] * 25  # 100 texts
>>> vectors = list(batch_embeddings(
...     texts,
...     batch_size=10,
...     show_progress=True
... ))
>>> len(vectors)
100
>>> # Process large dataset efficiently
>>> def read_documents():
...     # Generator that yields documents
...     for i in range(1000):
...         yield f"Document {i} content"
>>> all_vectors = []
>>> for vec in batch_embeddings(read_documents()):
...     all_vectors.append(vec)

aix.batch_process(items: Iterable[Any], process_func: callable, *, batch_size: int = None, max_workers: int = None, show_progress: bool = False, retry_attempts: int = None, retry_delay: float = None) -> Iterable[Any]

Generic batch processing with parallel execution and retries.

This is a general-purpose batch processor that can be used for any operation, not just chat or embeddings.

Parameters:
  • items – Items to process

  • process_func – Function to apply to each item

  • batch_size – Batch size for chunking

  • max_workers – Maximum parallel workers

  • show_progress – Show progress information

  • retry_attempts – Number of retry attempts on failure

  • retry_delay – Delay between retries (seconds)

Yields:

Results in same order as input

Examples

>>> from aix.batches import batch_process
>>> from aix.chat import chat
>>> # Custom processing function
>>> def analyze_sentiment(text):
...     return chat(f"Analyze sentiment: {text}")
>>> texts = ["I love this!", "This is terrible", "It's okay"]
>>> results = list(batch_process(
...     texts,
...     analyze_sentiment,
...     batch_size=5
... ))
>>> # With retries for flaky operations
>>> def flaky_api_call(item):
...     # Some API that might fail
...     return call_api(item)
>>> results = batch_process(
...     items,
...     flaky_api_call,
...     retry_attempts=3,
...     retry_delay=2.0
... )
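The retry behavior can be pictured as a wrapper around the processing function. The sketch below is hypothetical (with_retries is not an aix function) and assumes that retry_attempts counts retries after the first failed call; the source does not say whether aix counts it that way.

```python
import time
from typing import Any, Callable


def with_retries(func: Callable[[Any], Any], *, retry_attempts: int = 3,
                 retry_delay: float = 0.0) -> Callable[[Any], Any]:
    """Wrap func so a failing call is retried up to retry_attempts more times.

    Assumption: retry_attempts counts retries after the first attempt.
    """
    def wrapper(item):
        for attempt in range(retry_attempts + 1):
            try:
                return func(item)
            except Exception:
                if attempt == retry_attempts:
                    raise  # out of retries: surface the last error
                time.sleep(retry_delay)
    return wrapper


calls = {"n": 0}


def flaky(item: int) -> int:
    """Simulated API that fails on its first two calls."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return item * 2


result = with_retries(flaky, retry_attempts=3)(21)
```

In real use it is usually better to catch only the exception types known to be transient instead of a blanket Exception.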

aix.chat(prompt: str | Iterable[dict], *, model: str = None, temperature: float = None, max_tokens: int = None, stream: bool = False, api_key: str = None, **kwargs) -> str | Iterable[str]

Send a chat prompt and get a response.

This is the main chat interface for AIX. It abstracts away provider-specific details and provides a clean, consistent API across all models.

Parameters:
  • prompt – Either a string (becomes a user message) or a list of message dicts with ‘role’ and ‘content’ keys

  • model – Model identifier (e.g., ‘gpt-4o’, ‘claude-sonnet-4’, ‘openrouter/anthropic/claude-3.5-sonnet’). If None, uses default.

  • temperature – Sampling temperature (0.0 = deterministic, 2.0 = creative). If None, uses default (1.0).

  • max_tokens – Maximum tokens to generate. If None, uses model’s default.

  • stream – If True, return an iterator of text chunks. If False, return complete response as string.

  • api_key – Explicit API key. If None, resolved from the environment / .env / AIX config store for the model’s provider (see aix.credentials).

  • **kwargs – Additional provider-specific parameters passed to LiteLLM

Returns:

If stream=False: complete response as a string. If stream=True: iterator yielding text chunks as they arrive.

Return type:

str | Iterable[str]

Raises:
  • ImportError – If LiteLLM is not installed

  • ValueError – If prompt format is invalid

Examples

>>> chat("What is Python?")
'Python is a high-level programming language...'
>>> chat("Hello", model="gpt-4o")
'Hello! How can I assist you today?'
>>> # Streaming response
>>> for chunk in chat("Count to 5", stream=True):
...     print(chunk, end='', flush=True)
1, 2, 3, 4, 5
>>> # With message history
>>> messages = [
...     {"role": "system", "content": "You are a helpful assistant."},
...     {"role": "user", "content": "What is 2+2?"}
... ]
>>> chat(messages)
'2+2 equals 4.'
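The two accepted prompt forms can be reduced to a single message list before the provider call. The helper below is a hypothetical sketch of that normalization (to_messages is not part of aix), including the kind of check that could back the ValueError listed above.

```python
from typing import Iterable, Union


def to_messages(prompt: Union[str, Iterable[dict]]) -> list[dict]:
    """Normalize a prompt into a list of {'role', 'content'} message dicts."""
    if isinstance(prompt, str):
        # A bare string becomes a single user message.
        return [{"role": "user", "content": prompt}]
    messages = list(prompt)
    for message in messages:
        if (not isinstance(message, dict)
                or "role" not in message or "content" not in message):
            raise ValueError(
                f"each message needs 'role' and 'content' keys: {message!r}")
    return messages


single = to_messages("What is 2+2?")
history = to_messages([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
])
```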

aix.chat_with_history(system_prompt: str = None, *, model: str = None, **chat_kwargs) -> ChatSession

Create a stateful chat session that maintains conversation history.

Parameters:
  • system_prompt – Optional system message to set context/behavior

  • model – Model to use for this session

  • **chat_kwargs – Additional parameters passed to chat()

Returns:

ChatSession object with send() method

Examples

>>> session = chat_with_history("You are a helpful math tutor")
>>> session.send("What is 2+2?")
'The answer is 4.'
>>> session.send("And if I add 3 to that?")
'That would be 7.'
>>> len(session.history)
5

aix.check_keys(providers: Iterable[str] | None = None) -> dict[str, dict]

Report, per provider, whether a usable API key is discoverable.

Never returns or logs key values – only availability and the env-var name(s) checked. Useful for quick setup debugging.

Parameters:

providers – Providers to check; defaults to every provider in PROVIDER_ENV_VARS.

Returns:

Mapping provider -> {"available": bool, "env_vars": [...], "source": <where-found-or-None>}. source is "env", "store", or None (never the value itself).

Return type:

dict[str, dict]

Examples

>>> report = check_keys(["openai"])
>>> report["openai"]["available"]
True
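A presence-only report of this shape can be sketched as follows. Everything here is illustrative: sketch_check_keys and SKETCH_PROVIDER_ENV_VARS are hypothetical names, the table lists only two providers, and the sketch looks at environment variables only (no .env file or config store).

```python
import os
from typing import Iterable, Mapping, Optional

# Hypothetical two-entry table; the real PROVIDER_ENV_VARS may differ.
SKETCH_PROVIDER_ENV_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
}


def sketch_check_keys(providers: Optional[Iterable[str]] = None,
                      environ: Optional[Mapping[str, str]] = None) -> dict:
    """Report key availability per provider without exposing key values."""
    environ = os.environ if environ is None else environ
    if providers is None:
        providers = list(SKETCH_PROVIDER_ENV_VARS)
    report = {}
    for provider in providers:
        names = SKETCH_PROVIDER_ENV_VARS.get(provider, [])
        found = any(environ.get(name) for name in names)
        # Only availability and variable names are recorded, never the value.
        report[provider] = {
            "available": found,
            "env_vars": names,
            "source": "env" if found else None,
        }
    return report


# A fake environment keeps the example independent of the real one.
report = sketch_check_keys(environ={"OPENAI_API_KEY": "sk-not-a-real-key"})
```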

aix.check_requirements(model_or_provider: str | None, *, api_key: str | None = None) -> bool

Presence-only preflight: ensure a key for model_or_provider is resolvable.

Does not validate the key over the network – only that one is discoverable via resolve_api_key() (explicit arg, env/.env, or store). Raises MissingCredentialError with actionable guidance when absent.

Returns True when a key is available.

Examples

>>> check_requirements("gpt-4o", api_key="sk-explicit")
True

aix.compare_images(candidate: str | Path | bytes | Any, reference: str | Path | bytes | Any | Sequence[Any], *, rubric: Sequence[str] = ('identity', 'costume', 'setting', 'lighting', 'props'), model: str | None = None, api_key: str | None = None, max_tokens: int | None = None, temperature: float | None = 0.0, detail: str | None = None, instruction: str = 'You are a strict visual continuity supervisor. The FIRST image is the CANDIDATE; the remaining image(s) are the locked REFERENCE the candidate must match. Judge each rubric aspect INDEPENDENTLY (do not let one aspect color another). For each aspect decide whether the candidate matches the reference, give a confidence in [0.0, 1.0], and a short note explaining what matches or drifts. Then give an overall match (true only if every important aspect matches), an overall confidence, and a one-sentence explanation.', **kwargs: Any) -> ImageComparison

Compare a candidate image to reference image(s) on a rubric.

The explainable half of a reference supervisor: a vision model returns a per-aspect pass/fail checklist (does the face match? the costume? the set? the lighting? the props?) plus an overall verdict. It complements a cheap numeric identity-cosine gate, which lives elsewhere (e.g. lookbook). Built on to_image_content() (multi-image content block) and the same multimodal completion path as describe_image(); the model is asked for a structured answer via a JSON contract.

The comparison is pointwise (each aspect judged on its own), not a ranked pairwise comparison, to avoid position bias.

Parameters:
  • candidate – The image under review (URL / path / bytes / PIL image / data: URI — anything to_image_content() accepts).

  • reference – The locked reference — a single image or a sequence of images (a locked set) in the same flexible formats. An empty sequence is an error.

  • rubric – The aspects to evaluate, one verdict per item. Defaults to DFLT_COMPARE_RUBRIC; pass DFLT_FILM_RUBRIC (or any custom sequence) to override — e.g. ("face", "architecture", "props", "lighting"). Must be non-empty.

  • model – Vision-capable model id or alias. None → the configured default (aix.config.VisionConfig).

  • api_key – Explicit API key; None resolves it for the model’s provider from the environment / .env / config store.

  • max_tokens – Cap on generated tokens (None → provider default).

  • temperature – Sampling temperature (default 0.0 for a stable, reproducible verdict; None → provider default).

  • detail – Image detail hint ("low" | "high" | "auto"), applied to every image block.

  • instruction – The system-style framing prepended to the rubric. Has a sensible default; override to retune the supervisor’s strictness.

  • **kwargs – Extra provider-specific params forwarded to LiteLLM.

Returns:

An ImageComparison — overall match / confidence / explanation plus an ordered, mapping-like collection of RubricVerdict (one per rubric aspect, keyed by aspect).

Raises:
  • ImportError – If LiteLLM is not installed.

  • ValueError – If rubric is empty or reference is an empty sequence, or if the model’s reply can’t be parsed as the expected JSON verdict.

Examples

Default rubric, single reference:

>>> compare_images("gen.png", "ref.png")
ImageComparison(match=True, confidence=0.9, ...)

Filmmaking rubric, a locked reference set:

>>> from aix.vision import DFLT_FILM_RUBRIC
>>> compare_images(
...     candidate="frame_042.png",
...     reference=["ref_front.png", "ref_side.png"],
...     rubric=DFLT_FILM_RUBRIC,
... )["face_identity"].match
True

aix.configure(**overrides: Any) -> AixConfig

Apply runtime overrides to the active config and return it.

Examples

>>> from aix import config
>>> _ = config.configure(chat_model="openai/gpt-4o-mini")
>>> config.get_config().chat.model
'openai/gpt-4o-mini'
>>> _ = config.configure(chat={"temperature": 0.2})
>>> config.get_config().chat.temperature
0.2

aix.constrained_answer(prompt: str, valid_answers: list[str] | list[int] | list[float] | type | tuple[float, float], *, model: str = None, temperature: float = None, enhance_prompt: bool = False, n: int = 1)

Get an answer from the LLM constrained to a set of valid answers or types.

Uses JSON mode to ensure the LLM returns a valid response based on constraints. More flexible than the oa version - works with any model that supports JSON mode via LiteLLM.

This can be seen as a facade for some common structured output use cases, as well as a convenient tool to do response statistics and validation (via n>1).

Parameters:
  • prompt – The question or prompt to ask the LLM

  • valid_answers – Can be:
      - list[str]: List of valid string options
      - list[int]: List of valid integer options
      - list[float]: List of valid float options
      - bool: Constrains answer to True or False
      - int: Any integer
      - float: Any number
      - tuple[float, float]: Numerical range (min, max) inclusive

  • model – The model to use for the LLM (default: uses DFLT_CHAT_MODEL)

  • temperature – Temperature for sampling (default: None, uses model’s default). Higher values (e.g., 1.0) give more random/varied results. Lower values (e.g., 0.0) give more deterministic results.

  • enhance_prompt – If True, adds explicit instructions to the prompt about JSON formatting and constraints. If False (default), relies on response_format alone. Default is False to match oa behavior.

  • n – Number of times to call the LLM (default: 1)

Returns:

One of the valid answers, respecting the type constraint. If n > 1, returns a list of answers.

Examples

>>> # String options
>>> answer = constrained_answer(
...     "Is Python compiled or interpreted?",
...     ["compiled", "interpreted", "both"]
... )
>>> answer in ["compiled", "interpreted", "both"]
True
>>> # Boolean
>>> answer = constrained_answer(
...     "Is Python dynamically typed?",
...     bool
... )
>>> isinstance(answer, bool)
True
>>> # Integer options
>>> answer = constrained_answer(
...     "How many wheels does a car have?",
...     [2, 3, 4, 6, 8]
... )
>>> answer in [2, 3, 4, 6, 8]
True
>>> # Numerical range
>>> answer = constrained_answer(
...     "What is a reasonable hourly rate for a senior Python developer? (USD)",
...     (50.0, 300.0)
... )
>>> 50.0 <= answer <= 300.0
True
>>> # Multiple samples for statistics
>>> answers = constrained_answer(
...     "Which is better: cats or dogs?",
...     ["cats", "dogs"],
...     n=10
... )
>>> len(answers)
10
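The constraint forms listed under valid_answers can be checked on the client side with a small validator. The function below is a hypothetical sketch (is_valid_answer is not part of aix) showing one reading of those rules, including the inclusive range.

```python
from typing import Any


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_valid_answer(answer: Any, valid_answers: Any) -> bool:
    """Check an answer against one of the documented constraint forms."""
    if valid_answers is bool:
        return isinstance(answer, bool)
    if valid_answers is int:
        return isinstance(answer, int) and not isinstance(answer, bool)
    if valid_answers is float:
        return _is_number(answer)  # "any number", so ints pass too
    if isinstance(valid_answers, tuple):
        low, high = valid_answers  # inclusive numerical range
        return _is_number(answer) and low <= answer <= high
    # Otherwise a list of options; compare types too so True != 1.
    return any(type(answer) is type(option) and answer == option
               for option in valid_answers)


checks = [
    is_valid_answer("both", ["compiled", "interpreted", "both"]),
    is_valid_answer(True, bool),
    is_valid_answer(4, [2, 3, 4, 6, 8]),
    is_valid_answer(120.0, (50.0, 300.0)),
]
```

With n > 1, the same check can be applied to every returned answer before computing statistics over them.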

aix.cosine_similarity(vec1: Sequence[float], vec2: Sequence[float]) -> float

Compute cosine similarity between two vectors.

Parameters:
  • vec1 – First vector

  • vec2 – Second vector

Returns:

Cosine similarity (between -1 and 1)

Examples

>>> from aix.embeddings import embed, cosine_similarity
>>> v1 = embed("cat")
>>> v2 = embed("kitten")
>>> similarity = cosine_similarity(v1, v2)
>>> similarity > 0.8
True
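For reference, cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. A plain-Python sketch follows; the name cosine and the choice to raise ValueError on mismatched lengths or zero vectors are decisions made for this sketch, not documented aix behavior.

```python
import math
from typing import Sequence


def cosine(vec1: Sequence[float], vec2: Sequence[float]) -> float:
    """Cosine similarity: dot(vec1, vec2) / (|vec1| * |vec2|)."""
    if len(vec1) != len(vec2):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm1 * norm2)


same_direction = cosine([1.0, 2.0], [2.0, 4.0])   # parallel vectors
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])       # perpendicular vectors
opposite = cosine([1.0, 2.0], [-1.0, -2.0])       # antiparallel vectors
```

The three cases mark the ends and midpoint of the [-1, 1] range: parallel vectors score close to 1, perpendicular ones 0, and antiparallel ones close to -1.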

aix.create_variation(image_path: str | Path, *, model: str = None, size: str = None, n: int = 1, api_key: str = None, **kwargs) -> GeneratedImage | list[GeneratedImage]

Create variations of an existing image.

Parameters:
  • image_path – Path to the source image

  • model – Model to use (typically ‘dall-e-2’)

  • size – Output image size

  • n – Number of variations to generate

  • **kwargs – Additional provider-specific parameters

Returns:

GeneratedImage or list of GeneratedImage objects

Examples

>>> from aix.image import create_variation
>>> variations = create_variation(
...     "original.png",
...     n=3,
...     size="512x512"
... )
>>> for i, var in enumerate(variations):
...     var.save(f"variation_{i}.png")

aix.describe_image(image: str | Path | bytes | Any, *, prompt: str = 'Describe this image in detail.', model: str | None = None, api_key: str | None = None, max_tokens: int | None = None, temperature: float | None = None, detail: str | None = None, **kwargs: Any) -> str

Describe (or answer a question about) image and return the text.

The image→text primitive. image accepts a URL, a local path, raw bytes, a PIL image, or a data: URI (see to_image_content()). prompt is the instruction (default: a generic “describe this image”); pass a question for VQA or a rubric for a judgement. Everything past image is keyword.

Parameters:
  • image – The image (URL / path / bytes / PIL image / data: URI).

  • prompt – The text instruction accompanying the image.

  • model – Vision-capable model id (e.g. gpt-4o, claude-sonnet-4-6, gemini/gemini-1.5-pro) or an alias. None → the configured default (aix.config.VisionConfig).

  • api_key – Explicit API key; None resolves it from the environment / .env / config store for the model’s provider.

  • max_tokens – Cap on generated tokens (None → provider default).

  • temperature – Sampling temperature (None → provider default).

  • detail – Image detail hint ("low" | "high" | "auto").

  • **kwargs – Extra provider-specific params forwarded to LiteLLM.

Returns:

The model’s text response.

Raises:

ImportError – If LiteLLM is not installed.

>>> from aix import describe_image
>>> describe_image("cat.jpg", prompt="Caption it.")
'A cat on a sofa.'
aix.discover_available_models(source: str = 'openrouter', verbose: bool = True) list[Model][source]

Discover available models from a source.

Convenience function that uses the global models instance.

Parameters:
  • source – Source name (‘openrouter’, ‘ollama’, etc.)

  • verbose – Print progress information

Returns:

List of discovered models

Examples

>>> from aix.models import discover_available_models
>>> models = discover_available_models('openrouter')
>>> len(models) > 100
True
aix.edit_image(image_path: str | Path, prompt: str, *, mask_path: str | Path = None, model: str = None, size: str = None, n: int = 1, api_key: str = None, **kwargs) GeneratedImage | list[GeneratedImage][source]

Edit an existing image based on a prompt.

Parameters:
  • image_path – Path to the image to edit

  • prompt – Description of the desired edit

  • mask_path – Optional path to mask image (transparent areas will be edited)

  • model – Model to use (typically ‘dall-e-2’ for edits)

  • size – Output image size

  • n – Number of edited images to generate

  • **kwargs – Additional provider-specific parameters

Returns:

GeneratedImage or list of GeneratedImage objects

Examples

>>> from aix.image import edit_image
>>> edited = edit_image(
...     "photo.png",
...     "Add a rainbow in the sky",
...     mask_path="sky_mask.png"
... )
>>> edited.save("edited_photo.png")
aix.embed(text: str, *, model: str = None, **kwargs) Sequence[float][source]

Generate embedding for a single text.

Convenience function for embedding a single text string.

Parameters:
  • text – Text string to embed

  • model – Model identifier

  • **kwargs – Additional parameters for embeddings()

Returns:

Vector embedding as sequence of floats

Examples

>>> from aix.embeddings import embed
>>> vec = embed("Hello, world!")
>>> len(vec)
1536
>>> # Compare similarity
>>> import numpy as np
>>> v1 = np.array(embed("cat"))
>>> v2 = np.array(embed("kitten"))
>>> v3 = np.array(embed("computer"))
>>> # Cosine similarity
>>> np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
0.92  # High similarity
>>> np.dot(v1, v3) / (np.linalg.norm(v1) * np.linalg.norm(v3))
0.23  # Low similarity
aix.embeddings(segments: Iterable[str], *, model: str = None, api_key: str = None, **kwargs) Iterable[Sequence[float]][source]

Generate embeddings for multiple text segments.

This is the main embedding interface for AIX. It abstracts away provider-specific details and provides a clean, consistent API across all embedding models.

Parameters:
  • segments – Iterable of text strings to embed

  • model – Model identifier (e.g., ‘text-embedding-3-small’, ‘text-embedding-ada-002’, ‘openrouter/openai/text-embedding-3-small’). If None, uses default.

  • api_key – Explicit API key. If None, resolved from the environment / .env / AIX config store for the model’s provider (see aix.credentials).

  • **kwargs – Additional provider-specific parameters passed to LiteLLM

Yields:

Vector embeddings as sequences of floats. Each embedding corresponds to one input segment in the same order.

Raises:
  • ImportError – If LiteLLM is not installed

  • ValueError – If segments is empty or invalid

Examples

>>> from aix.embeddings import embeddings
>>> texts = ["cat", "dog", "bird"]
>>> vecs = list(embeddings(texts))
>>> len(vecs)
3
>>> # With specific model
>>> vecs = list(embeddings(
...     ["hello", "world"],
...     model="text-embedding-3-large"
... ))
>>> # Process in chunks for large datasets
>>> def chunk_texts(texts, size=100):
...     for i in range(0, len(texts), size):
...         yield texts[i:i+size]
>>> large_dataset = [f"document {i}" for i in range(1000)]  # placeholder corpus
>>> all_vecs = []
>>> for chunk in chunk_texts(large_dataset):
...     all_vecs.extend(embeddings(chunk))
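The chunk_texts helper in the example above relies on len() and slicing, so it only works on sequences. For a generator or other one-shot iterable, a chunker built on itertools.islice is one alternative. This is a sketch; the name chunked is hypothetical and is not part of aix.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(items: Iterable[str], size: int = 100) -> Iterator[List[str]]:
    """Yield lists of up to `size` items from any iterable, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # the source iterable is exhausted
        yield chunk


# Works on a generator, which has no len() and cannot be sliced.
chunks = list(chunked((f"doc {i}" for i in range(5)), size=2))
print([len(c) for c in chunks])  # [2, 2, 1]
```

Each yielded chunk could then be passed to embeddings() in place of the sliced lists used above.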
aix.extend_video(video_path: str | Path, prompt: str = None, *, extend_duration: float = 2.0, model: str = None, **kwargs) GeneratedVideo[source]

Extend an existing video with additional generated content.

Parameters:
  • video_path – Path to the source video

  • prompt – Optional text prompt to guide the extension

  • extend_duration – How many seconds to add

  • model – Video generation model

  • **kwargs – Additional provider-specific parameters

Returns:

GeneratedVideo object with extended content

Examples

>>> from aix.video import extend_video
>>> extended = extend_video(
...     "original.mp4",
...     prompt="Continue the same scene",
...     extend_duration=3
... )
aix.find_models(query: str) list[Model][source]

Search for models matching a query.

Convenience function that uses the global models instance.

Parameters:

query – Search query

Returns:

List of matching models

Examples

>>> from aix.models import find_models
>>> results = find_models('claude')
>>> any('claude' in m.id.lower() for m in results)
True
aix.find_most_similar(query: str | Sequence[float], candidates: Iterable[str], *, model: str = None, top_k: int = 5, **kwargs) list[tuple[str, float]][source]

Find most similar texts to a query.

Parameters:
  • query – Query text or pre-computed embedding vector

  • candidates – Candidate texts to compare against

  • model – Embedding model to use

  • top_k – Number of top results to return

  • **kwargs – Additional parameters for embeddings()

Returns:

List of (text, similarity_score) tuples, sorted by similarity (highest first)

Examples

>>> from aix import find_most_similar
>>> query = "What is machine learning?"
>>> docs = [
...     "Machine learning is a type of AI",
...     "Python is a programming language",
...     "Neural networks are used in deep learning",
... ]
>>> results = find_most_similar(query, docs, top_k=2)
>>> results[0][0]  # Most similar doc
'Machine learning is a type of AI'
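Conceptually, this kind of lookup scores every candidate against the query vector and keeps the best top_k. The sketch below shows that ranking step on hand-written toy vectors, so it runs without any embedding model. rank_by_similarity and the vectors are hypothetical, and the real function may be implemented differently.

```python
import math
from typing import List, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def rank_by_similarity(
    query_vec: Sequence[float],
    candidates: Sequence[str],
    candidate_vecs: Sequence[Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the top_k (text, score) pairs, highest score first."""
    scored = [
        (text, _cosine(query_vec, vec))
        for text, vec in zip(candidates, candidate_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


docs = ["about ML", "about Python", "about cooking"]
vecs = [[0.9, 0.1], [0.5, 0.5], [0.0, 1.0]]  # toy vectors, not real embeddings
top = rank_by_similarity([1.0, 0.0], docs, vecs, top_k=2)
print([text for text, _ in top])  # ['about ML', 'about Python']
```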
aix.generate_image(prompt: str, *, model: str = None, size: str = None, quality: str = None, style: str = None, response_format: str = 'url', api_key: str = None, **kwargs) GeneratedImage[source]

Generate a single image from a text prompt.

Parameters:
  • prompt – Text description of the image to generate

  • model – Model to use (e.g., ‘dall-e-2’, ‘dall-e-3’, ‘stable-diffusion’)

  • size – Image size (e.g., ‘1024x1024’, ‘512x512’, ‘1792x1024’)

  • quality – Image quality (‘standard’ or ‘hd’ for DALL-E 3)

  • style – Image style (‘vivid’ or ‘natural’ for DALL-E 3)

  • response_format – Format of response (‘url’ or ‘b64_json’)

  • api_key – Explicit API key. If None, resolved from the environment / .env / AIX config store for the model’s provider (see aix.credentials).

  • **kwargs – Additional provider-specific parameters

Returns:

GeneratedImage object

Raises:

ImportError – If LiteLLM is not installed

Examples

>>> from aix.image import generate_image
>>> image = generate_image("A serene mountain landscape")
>>> image.save("landscape.png")
>>> # High quality with DALL-E 3
>>> image = generate_image(
...     "Abstract art with vibrant colors",
...     model="dall-e-3",
...     quality="hd",
...     style="vivid"
... )
>>> # Specific size
>>> image = generate_image(
...     "A futuristic city",
...     size="1792x1024"
... )
aix.generate_images(prompt: str, *, n: int = None, model: str = None, size: str = None, quality: str = None, style: str = None, response_format: str = 'url', api_key: str = None, **kwargs) list[GeneratedImage][source]

Generate multiple images from a text prompt.

Parameters:
  • prompt – Text description of images to generate

  • n – Number of images to generate

  • model – Model to use

  • size – Image size

  • quality – Image quality

  • style – Image style

  • response_format – Format of response

  • **kwargs – Additional provider-specific parameters

Returns:

List of GeneratedImage objects

Examples

>>> from aix.image import generate_images
>>> images = generate_images(
...     "A cute robot",
...     n=3,
...     size="512x512"
... )
>>> for i, img in enumerate(images):
...     img.save(f"robot_{i}.png")
aix.generate_video(prompt: str, *, model: str = None, duration: float = 5.0, resolution: str = '1280x720', fps: int = 24, aspect_ratio: str = None, style: str = None, seed: int = None, api_key: str = None, **kwargs) GeneratedVideo[source]

Generate a video from a text prompt.

Note: This is a high-level interface. Actual implementation depends on available video generation providers (Runway, Pika, etc.) and may require provider-specific API keys.

Parameters:
  • prompt – Text description of the video to generate

  • model – Video generation model to use

  • duration – Video duration in seconds (typically 2-10 seconds)

  • resolution – Video resolution (‘1280x720’, ‘1920x1080’, etc.)

  • fps – Frames per second

  • aspect_ratio – Aspect ratio (‘16:9’, ‘9:16’, ‘1:1’, etc.)

  • style – Video style hint (provider-specific)

  • seed – Random seed for reproducibility

  • api_key – Explicit API key for the video provider. If None, resolved from the environment / .env / AIX config store (see aix.credentials).

  • **kwargs – Additional provider-specific parameters

Returns:

GeneratedVideo object

Raises:
  • NotImplementedError – If no video provider is configured

  • ImportError – If required provider SDK is not installed

Examples

>>> from aix.video import generate_video
>>> video = generate_video(
...     "A serene ocean sunset with gentle waves",
...     duration=5,
...     resolution="1920x1080"
... )
>>> video.save("sunset.mp4")
>>> # Specific style
>>> video = generate_video(
...     "A futuristic city",
...     style="cyberpunk",
...     duration=4
... )
aix.get_config() AixConfig[source]

Return the current active AixConfig.

aix.get_model_info(model_id: str) Model[source]

Get information about a specific model.

Convenience function that uses the global models instance.

Parameters:

model_id – Model identifier

Returns:

Model object

Examples

>>> from aix.models import get_model_info
>>> info = get_model_info('openai/gpt-4o')
>>> info.provider
'openai'
aix.get_video_providers() list[str]

Get list of available video generation providers.

Returns:

List of provider names that are configured

Examples

>>> from aix import get_video_providers
>>> providers = get_video_providers()
>>> print(providers)
['runway', 'pika']
aix.load_config(path: str | None = None, *, environ: Mapping[str, str] | None = None) AixConfig[source]

Resolve an AixConfig from shipped defaults, TOML file, and env.

Precedence (low to high): shipped defaults < TOML file < environment variables. Runtime overrides (configure()/using()) and explicit call arguments sit above this and are applied elsewhere.

Parameters:
  • path – Optional explicit TOML path. Defaults to config_file_path().

  • environ – Optional environment mapping (defaults to os.environ).

Returns:

A fully resolved AixConfig.
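The precedence rule above (shipped defaults, then file values, then environment variables) can be pictured as successive dictionary updates. The sketch below uses flat dictionaries and an EXAMPLE_ variable prefix. Both are assumptions made for illustration; the setting names, the prefix, and the resolve_settings helper are not taken from aix.

```python
from typing import Mapping

# Hypothetical shipped defaults (lowest precedence).
DEFAULTS = {"chat_model": "default-model", "chat_temperature": 0.7}


def resolve_settings(
    file_values: Mapping[str, object],
    environ: Mapping[str, str],
    prefix: str = "EXAMPLE_",
) -> dict:
    """Merge defaults < file values < environment variables."""
    settings = dict(DEFAULTS)
    settings.update(file_values)            # file overrides defaults
    for key in list(settings):
        env_name = prefix + key.upper()
        if env_name in environ:
            # Environment wins; values stay strings in this sketch.
            settings[key] = environ[env_name]
    return settings


resolved = resolve_settings(
    {"chat_temperature": 0.2},
    {"EXAMPLE_CHAT_MODEL": "env-model"},
)
print(resolved)  # {'chat_model': 'env-model', 'chat_temperature': 0.2}
```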

aix.prompt_func(template: str, *, output_schema: dict | type = None, egress: Callable[[Any], Any] = None, model: str = None, temperature: float = None, name: str = None, **chat_kwargs) Callable[source]

Create a callable function from a prompt template.

This is the main function for creating prompt-based functions. It automatically detects parameters from the template and creates a function with those parameters.

  • Without output_schema: returns text.

  • With output_schema: returns structured data (dict, list, etc.).

  • With egress: returns whatever the egress post-processor returns.

Templates use the default-aware {name:default} dialect (matching oa.prompt_function): {name} is a required parameter, {name:default} supplies a default value (the text after the colon), and braces inside backtick-fenced regions are left literal. Plain {name}-only templates behave exactly as before.

>>> greet = prompt_func("Greet {name} in {language:English}")
>>> greet.param_names
['name', 'language']
Parameters:
  • template – Prompt template with {var} placeholders for parameters

  • output_schema – Optional schema for structured output. Can be a dict mapping field names to types ({“name”: str, “age”: int}), a single type for simple outputs (str, int, list, etc.), or None for plain text output (default).

  • egress – Optional post-processor (result) -> Any applied to the output before returning — on both the text and structured paths. Lets a caller keep “prompt → typed Python value” inside the facade (e.g. parse the LLM text into a list of lines/ids) instead of wrapping the returned function. None (default) returns the raw result unchanged.

  • model – Model to use for this function

  • temperature – Temperature for generation

  • name – Optional __name__ for the generated function (for tracing / identity). Defaults to "prompt_based_function".

  • **chat_kwargs – Additional parameters passed to chat()

Returns:

Callable function with parameters derived from template

Examples

>>> # Simple text generation
>>> summarize = prompt_func("Summarize this text: {text}")
>>> summarize(text="Long article...")
'Brief summary...'
>>> # Structured output
>>> extract = prompt_func(
...     "Extract contact info from: {text}",
...     output_schema={"name": str, "email": str, "phone": str}
... )
>>> result = extract(text="Contact John at john@example.com, 555-1234")
>>> result['name']
'John'
>>> # Multiple parameters
>>> compare = prompt_func(
...     "Compare {item1} and {item2} in terms of {aspect}. "
...     "Keep it under {word_limit} words."
... )
>>> compare(
...     item1="Python",
...     item2="Java",
...     aspect="learning curve",
...     word_limit=50
... )
'Python has a gentler learning curve...'
>>> # With specific model
>>> creative_writer = prompt_func(
...     "Write a creative story about {topic}",
...     model="gpt-4o",
...     temperature=1.5
... )
>>> creative_writer(topic="a time-traveling cat")
'Once upon a time, there was a cat named Whiskers...'
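To make the {name} / {name:default} dialect concrete, here is a rough sketch of how such a template could be parsed and rendered with a regular expression. It is not the aix implementation. template_params and render are hypothetical helpers, and the rule about leaving braces inside fenced regions literal is deliberately not handled.

```python
import re

# {name} or {name:default}; the default is everything after the first colon.
_PLACEHOLDER = re.compile(r"\{(\w+)(?::([^{}]*))?\}")
_REQUIRED = object()  # sentinel marking parameters without a default


def template_params(template: str) -> dict:
    """Map each placeholder name to its default (or _REQUIRED), in order."""
    params = {}
    for match in _PLACEHOLDER.finditer(template):
        name, default = match.group(1), match.group(2)
        # The first occurrence of a name decides its default.
        params.setdefault(name, _REQUIRED if default is None else default)
    return params


def render(template: str, **values) -> str:
    """Fill the template, falling back to defaults where available."""
    params = template_params(template)
    missing = [n for n, d in params.items() if d is _REQUIRED and n not in values]
    if missing:
        raise TypeError(f"missing required parameters: {missing}")

    def substitute(match):
        name = match.group(1)
        return str(values.get(name, params[name]))

    return _PLACEHOLDER.sub(substitute, template)


template = "Greet {name} in {language:English}"
print(list(template_params(template)))  # ['name', 'language']
print(render(template, name="Ada"))      # Greet Ada in English
```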
aix.prompt_to_json(template: str, schema: dict | type, **kwargs) Callable[source]

Create a function that returns structured JSON output.

This is an explicit alias for prompt_func with output_schema.

Parameters:
  • template – Prompt template

  • schema – Output schema

  • **kwargs – Additional parameters for prompt_func

Returns:

Function that returns structured data

Examples

>>> extract = prompt_to_json(
...     "Extract name and age from: {text}",
...     schema={"name": str, "age": int}
... )
>>> result = extract(text="Alice is 30")
>>> isinstance(result, dict)
True
aix.prompt_to_text(template: str, **kwargs) Callable[source]

Create a function that returns text output.

This is an explicit alias for prompt_func without output_schema.

Parameters:
  • template – Prompt template

  • **kwargs – Additional parameters for prompt_func

Returns:

Function that returns text

Examples

>>> summarize = prompt_to_text("Summarize: {text}")
>>> result = summarize(text="Long text...")
>>> isinstance(result, str)
True
aix.resolve_api_key(model_or_provider: str | None, *, api_key: str | None = None, prompt_if_missing: bool = False) str | None[source]

Resolve an API key for model_or_provider through the documented layers.

Layers, highest precedence first: explicit api_key argument, provider environment variable (with soft .env discovery), the AIX config store, and – only when prompt_if_missing is true and running interactively – an interactive prompt that persists the answer.

Parameters:
  • model_or_provider – A model id (provider inferred via LiteLLM) or a provider name directly (e.g. "openai").

  • api_key – An explicit key; when given (non-empty) it is returned verbatim.

  • prompt_if_missing – If true, fall back to a REPL prompt + persist when no key is found elsewhere. Off by default so the common path never blocks on input.

Returns:

The resolved key, or None if genuinely absent.

Examples

>>> resolve_api_key("gpt-4o", api_key="sk-explicit")
'sk-explicit'
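The layered lookup described above amounts to a first-match fallthrough. The sketch below covers the first three layers only (explicit argument, environment variable, stored value) and omits .env discovery and the interactive prompt. The resolve_key helper, the ENV_VARS table, and the dictionary-based store are illustrative assumptions, not the aix.credentials internals.

```python
from typing import Mapping, Optional

# Illustrative provider -> environment variable table (an assumption).
ENV_VARS = {"openai": "OPENAI_API_KEY"}


def resolve_key(
    provider: str,
    *,
    api_key: Optional[str] = None,
    environ: Mapping[str, str],
    store: Mapping[str, str],
) -> Optional[str]:
    """Return the first key found: explicit argument, environment, then store."""
    if api_key:
        return api_key                      # explicit argument wins
    env_name = ENV_VARS.get(provider)
    if env_name and environ.get(env_name):
        return environ[env_name]            # provider environment variable
    return store.get(provider) or None      # stored value, or None if absent


print(resolve_key("openai", environ={"OPENAI_API_KEY": "sk-env"}, store={}))
# sk-env
```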
aix.resolve_model(model: str | None, *, config: AixConfig | None = None) str | None[source]

Resolve a semantic alias ("fast", "best", …) to a concrete model id.

Plain substitution: if model is a key in the active aliases table it is replaced (following chains, with cycle protection). Anything that is not an alias – including every literal model id like "gpt-4o" – is returned unchanged. None passes through (callers apply their own default first).

Note: aliases share a namespace with literal model ids, so an unknown name is treated as a literal id, not an error. Inspect available aliases via aix.get_config().aliases.

Examples

>>> from aix import config
>>> config.resolve_model("fast") in config.DEFAULT_ALIASES.values()
True
>>> config.resolve_model("gpt-4o")  # not an alias -> unchanged
'gpt-4o'
>>> config.resolve_model(None) is None
True
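Chain following with cycle protection can be sketched with a set of already-visited names. The helper resolve_alias and the alias table below are hypothetical. In particular, stopping at the repeated name when a cycle is detected is an assumption, since the behaviour of aix on a cycle is not specified here.

```python
from typing import Mapping, Optional


def resolve_alias(model: Optional[str], aliases: Mapping[str, str]) -> Optional[str]:
    """Follow alias chains until a non-alias name (or a cycle) is reached."""
    if model is None:
        return None                 # None passes through unchanged
    seen = set()
    while model in aliases and model not in seen:
        seen.add(model)             # remember visited names to detect cycles
        model = aliases[model]
    return model


aliases = {
    "fast": "small",
    "small": "vendor/small-model",  # made-up model id
    "a": "b",
    "b": "a",                       # deliberate cycle
}
print(resolve_alias("fast", aliases))    # vendor/small-model
print(resolve_alias("gpt-4o", aliases))  # gpt-4o (not an alias, unchanged)
```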
aix.set_config(config: AixConfig) AixConfig[source]

Replace the active config wholesale. Returns the new active config.

aix.text_to_speech(text: str, *, model: str = None, voice: str = None, speed: float = None, response_format: str = 'mp3', api_key: str = None, **kwargs) GeneratedAudio[source]

Convert text to speech audio.

Parameters:
  • text – Text to convert to speech

  • model – TTS model to use (e.g., ‘tts-1’, ‘tts-1-hd’)

  • voice – Voice to use (‘alloy’, ‘echo’, ‘fable’, ‘onyx’, ‘nova’, ‘shimmer’)

  • speed – Playback speed (0.25 to 4.0)

  • response_format – Audio format (‘mp3’, ‘opus’, ‘aac’, ‘flac’)

  • **kwargs – Additional provider-specific parameters

Returns:

GeneratedAudio object

Raises:

ImportError – If LiteLLM is not installed

Examples

>>> from aix.audio import text_to_speech
>>> audio = text_to_speech("Hello, how are you?")
>>> audio.save("greeting.mp3")
>>> # Different voice and speed
>>> audio = text_to_speech(
...     "This is a test.",
...     voice="nova",
...     speed=1.2
... )
>>> # High quality
>>> audio = text_to_speech(
...     "Important announcement",
...     model="tts-1-hd",
...     voice="onyx"
... )
aix.to_image_content(image: str | Path | bytes | Any, *, detail: str | None = None) dict[source]

Build a multimodal image_url content block for image.

image may be:

  • an http(s):// URL or a data: URI — passed through verbatim;

  • a local file path (str or Path) — read and inlined as a base64 data: URI with a guessed MIME type;

  • raw bytes — base64-inlined (MIME sniffed from magic bytes, else JPEG);

  • a PIL Image — encoded to PNG and inlined.

detail ("low" | "high" | "auto") is forwarded when set; the block is the OpenAI/LiteLLM multimodal shape understood across providers.

>>> to_image_content("https://x/y.jpg")
{'type': 'image_url', 'image_url': {'url': 'https://x/y.jpg'}}
>>> to_image_content("data:image/png;base64,AAAA")["image_url"]["url"][:10]
'data:image'
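For the raw-bytes case, the block is a base64 data: URI wrapped in the image_url shape. The following sketch shows one way to build it with the standard library. sniff_mime and bytes_to_image_block are hypothetical names, and the set of magic-byte signatures checked is an assumption (the real function may recognise more formats).

```python
import base64
from typing import Optional


def sniff_mime(data: bytes) -> str:
    """Guess an image MIME type from leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "image/gif"
    return "image/jpeg"  # fallback, mirroring the "else JPEG" rule above


def bytes_to_image_block(data: bytes, detail: Optional[str] = None) -> dict:
    """Inline raw image bytes as a base64 data: URI content block."""
    encoded = base64.b64encode(data).decode("ascii")
    image_url = {"url": f"data:{sniff_mime(data)};base64,{encoded}"}
    if detail is not None:
        image_url["detail"] = detail  # only forwarded when set
    return {"type": "image_url", "image_url": image_url}


png_like = b"\x89PNG\r\n\x1a\n" + b"\x00" * 4  # PNG signature plus filler bytes
block = bytes_to_image_block(png_like, detail="low")
print(block["image_url"]["url"][:22])  # data:image/png;base64,
```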
aix.transcribe(audio: str | Path | BinaryIO | bytes, *, engine: str = None, model: str = None, language: str = None, prompt: str = None, response_format: str = 'text', temperature: float = None, timestamp_granularities: list[str] = None, api_key: str = None, **kwargs) str | TranscriptionResult[source]

Transcribe audio to text.

By default this routes through LiteLLM (OpenAI-style transcription). Pass engine= to instead delegate to a scribed backend — one façade over many ASR engines (local Whisper / faster-whisper / vosk, or cloud Deepgram / AssemblyAI / Groq / ElevenLabs / Google …) with speaker diarization and SRT/VTT output. The return type is unchanged either way, so existing callers are unaffected.

Parameters:
  • audio – Audio file path, file object, or bytes.

  • engine – Optional scribed backend id (e.g. "faster-whisper", "deepgram"). When given, transcription is delegated to scribed (which resolves that engine’s own credentials); the LiteLLM path is bypassed. Requires pip install 'aix[scribed]'. See scribed.list_backends().

  • model – Transcription model (e.g. 'whisper-1'); for a scribed engine, the engine-specific model (e.g. a Whisper size).

  • language – Source language (ISO-639-1 code, e.g. 'en', 'es').

  • prompt – Optional text to guide the model’s style (LiteLLM path).

  • response_format – 'text' (default) → str; 'srt'/'vtt' → subtitle str (scribed path); else → TranscriptionResult.

  • temperature – Sampling temperature (LiteLLM path).

  • timestamp_granularities – Timestamp types (‘word’, ‘segment’) (LiteLLM path).

  • **kwargs – Additional parameters (forwarded to LiteLLM, or to the scribed backend — e.g. diarize=True).

Returns:

str for response_format in {text, srt, vtt}, else a TranscriptionResult.

Examples

>>> from aix.audio import transcribe
>>> text = transcribe("recording.mp3")
>>> # delegate to a scribed engine (local, free, diarized SRT):
>>> srt = transcribe(
...     "meeting.wav", engine="faster-whisper", response_format="srt"
... )
>>> dg = transcribe(
...     "call.mp3", engine="deepgram", diarize=True,
...     response_format="verbose_json",
... )
aix.transcribe_with_timestamps(audio: str | Path | BinaryIO | bytes, *, granularity: str = 'segment', model: str = None, **kwargs) TranscriptionResult[source]

Transcribe audio with detailed timestamps.

Parameters:
  • audio – Audio file path, file object, or bytes

  • granularity – Timestamp granularity (‘word’ or ‘segment’)

  • model – Transcription model

  • **kwargs – Additional parameters for transcribe()

Returns:

TranscriptionResult with detailed segments

Examples

>>> from aix.audio import transcribe_with_timestamps
>>> result = transcribe_with_timestamps("lecture.mp3")
>>> for segment in result.segments:
...     start = segment['start']
...     end = segment['end']
...     text = segment['text']
...     print(f"[{start:.2f}-{end:.2f}] {text}")
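If the segments are plain dicts with start, end, and text keys, as in the example above, they can be turned into SRT subtitle text with a few lines of formatting. This is a standalone sketch: srt_timestamp and segments_to_srt are hypothetical helpers, and the segment shape is assumed from the example rather than from a documented schema.

```python
from typing import Iterable, Mapping


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + ("\n" if blocks else "")


sample = [
    {"start": 0.0, "end": 1.5, "text": " Hello there."},
    {"start": 1.5, "end": 3661.5, "text": "Goodbye."},
]
print(segments_to_srt(sample))
```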
aix.translate_audio(audio: str | Path | BinaryIO | bytes, *, model: str = None, prompt: str = None, api_key: str = None, **kwargs) str[source]

Translate audio from any language to English.

Note: This currently uses Whisper’s translation capability, which only translates into English.

Parameters:
  • audio – Audio file path, file object, or bytes

  • model – Translation model (typically ‘whisper-1’)

  • prompt – Optional text to guide translation

  • **kwargs – Additional provider-specific parameters

Returns:

Translated text in English

Examples

>>> from aix.audio import translate_audio
>>> english_text = translate_audio("spanish_audio.mp3")
>>> print(english_text)
This is the English translation.
aix.using(**overrides: Any) Iterator[AixConfig][source]

Context manager applying scoped overrides, restored on exit.

Examples

>>> from aix import config
>>> with config.using(chat_temperature=0.0) as c:
...     c.chat.temperature
0.0