Warning: This is a tech preview of github.com/simple-repository. The service is unsupported and may be removed at any time.

Project Links

Meta

Author: LiveKit

Requires Python: >=3.10.0

Classifiers

Intended Audience

Developers

License

OSI Approved :: Apache Software License

Programming Language

Python :: 3
Python :: 3 :: Only
Python :: 3.10

Topic

Multimedia :: Sound/Audio
Multimedia :: Video
Scientific/Engineering :: Artificial Intelligence

Baseten plugin for LiveKit Agents

Support for Baseten-hosted models in LiveKit Agents, including STT (Speech-to-Text), TTS (Text-to-Speech), and LLM (Large Language Model) integrations.

Installation

pip install livekit-plugins-baseten

Pre-requisites

You'll need an API key from Baseten. It can be set as an environment variable: BASETEN_API_KEY

You also need to deploy a model to Baseten and will need your model endpoint to configure the plugin.

STT (Speech-to-Text)

The STT plugin connects to Baseten's Whisper Streaming WebSocket endpoint for real-time transcription. It works with both truss and chain deployments.

Recommended model

Whisper v3 Turbo – WebSocket

Endpoint URL formats

Deployment type	URL pattern
Truss	`wss://model-{model_id}.api.baseten.co/environments/production/websocket`
Chain	`wss://chain-{chain_id}.api.baseten.co/environments/production/websocket`

Basic usage

You can specify the endpoint in three ways:

from livekit.plugins import baseten

# 1. Using a truss model ID (recommended for truss deployments)
stt = baseten.STT(
    api_key="your-baseten-api-key",  # or set BASETEN_API_KEY env var
    model_id="your-model-id",
    language="en",
)

# 2. Using a chain ID (recommended for chain deployments)
stt = baseten.STT(
    api_key="your-baseten-api-key",
    chain_id="your-chain-id",
    language="en",
)

# 3. Using a full endpoint URL (for custom routing or deployment URLs)
stt = baseten.STT(
    api_key="your-baseten-api-key",
    model_endpoint="wss://model-{model_id}.api.baseten.co/environments/production/websocket",
    language="en",
)

Configuration options

Parameter	Default	Description
`api_key`	`BASETEN_API_KEY` env var	Baseten API key
`model_endpoint`	`BASETEN_MODEL_ENDPOINT` env var	Full WebSocket URL (takes priority over `model_id`/`chain_id`)
`model_id`	—	Baseten truss model ID; auto-constructs the endpoint URL
`chain_id`	—	Baseten chain ID; auto-constructs the endpoint URL
`language`	`"en"`	BCP-47 language code (use `"auto"` for auto-detection)
`encoding`	`"pcm_s16le"`	Audio encoding (`pcm_s16le` or `pcm_mulaw`)
`sample_rate`	`16000`	Audio sample rate in Hz
`enable_partial_transcripts`	`True`	Emit interim transcripts while the speaker is talking
`partial_transcript_interval_s`	`1.0`	Interval (seconds) between partial transcript updates
`final_transcript_max_duration_s`	`30`	Max seconds of audio before forcing a final transcript
`show_word_timestamps`	`True`	Include word-level timestamps in results
`vad_threshold`	`0.5`	Server-side VAD speech probability threshold (0.0–1.0)
`vad_min_silence_duration_ms`	`300`	Minimum silence (ms) to mark end of speech
`vad_speech_pad_ms`	`30`	Padding (ms) added around detected speech

Full voice pipeline example

import os
from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions, inference
from livekit.plugins import baseten, openai, noise_cancellation
from livekit.agents.inference import TurnDetector

BASETEN_API_KEY = os.getenv("BASETEN_API_KEY")
whisper_model_id = "your-whisper-model-id"  # or use chain_id for chain deployments
orpheus_model_id = "your-orpheus-model-id"


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a helpful voice AI assistant.")


async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(
        stt=baseten.STT(
            api_key=BASETEN_API_KEY,
            model_id=whisper_model_id,  # or chain_id="your-chain-id"
            language="en",
            enable_partial_transcripts=True,
        ),
        llm=openai.LLM(
            api_key=BASETEN_API_KEY,
            base_url="https://inference.baseten.co/v1",
            model="openai/gpt-oss-120b",
        ),
        tts=baseten.TTS(
            api_key=BASETEN_API_KEY,
            model_endpoint=(
                f"https://model-{orpheus_model_id}"
                ".api.baseten.co/environments/production/predict"
            ),
        ),
        vad=inference.VAD(),
        turn_detection=TurnDetector(),
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

TTS (Text-to-Speech)

The TTS plugin calls Baseten-hosted TTS models (e.g. Orpheus 3B) over HTTP.

tts = baseten.TTS(
    api_key="your-baseten-api-key",
    model_endpoint="https://model-{model_id}.api.baseten.co/environments/production/predict",
    voice="tara",
    language="en",
)

LLM (Large Language Model)

The LLM plugin wraps Baseten's OpenAI-compatible inference endpoint.

llm = baseten.LLM(
    api_key="your-baseten-api-key",
    model="openai/gpt-oss-120b",
)

Documentation

1.6.7 Jul 25, 2026

1.6.6 Jul 18, 2026

1.6.5 Jul 09, 2026

1.6.4 Jun 24, 2026

1.6.3 Jun 22, 2026

1.6.2 Jun 19, 2026

1.6.1 Jun 17, 2026

1.6.0 Jun 11, 2026

1.6.0rc2 May 29, 2026

1.6.0rc1 May 27, 2026

1.5.19rc1 Jun 08, 2026

1.5.18 Jun 05, 2026

1.5.17 Jun 03, 2026

1.5.16 Jun 01, 2026

1.5.15 May 29, 2026

1.5.14 May 27, 2026

1.5.13 May 25, 2026

1.5.12 May 21, 2026

1.5.11 May 19, 2026

1.5.10 May 18, 2026

1.5.9 May 13, 2026

1.5.8 May 05, 2026

1.5.7 Apr 30, 2026

1.5.6 Apr 22, 2026

1.5.5 Apr 20, 2026

1.5.4 Apr 16, 2026

1.5.3 Apr 15, 2026

1.5.2 Apr 08, 2026

1.5.1 Mar 23, 2026

1.5.0 Mar 19, 2026

1.5.0rc2 Mar 06, 2026

1.5.0rc1 Feb 13, 2026

1.4.6 Mar 16, 2026

1.4.5 Mar 11, 2026

1.4.4 Mar 03, 2026

1.4.3 Feb 23, 2026

1.4.2 Feb 17, 2026

1.4.1 Feb 06, 2026

1.4.0 Feb 06, 2026

1.4.0rc2 Jan 23, 2026

1.4.0rc1 Dec 23, 2025

1.3.12 Jan 21, 2026

1.3.11 Jan 14, 2026

1.3.10 Dec 23, 2025

1.3.9 Dec 19, 2025

1.3.8 Dec 17, 2025

1.3.7 Dec 16, 2025

1.3.6 Dec 03, 2025

1.3.5 Nov 25, 2025

1.3.4 Nov 24, 2025

1.3.3 Nov 19, 2025

1.3.2 Nov 17, 2025

1.3.1 Nov 17, 2025

1.3.0rc2 Nov 15, 2025

1.3.0rc1 Nov 06, 2025

1.2.18 Nov 05, 2025

1.2.17 Oct 29, 2025

1.2.16 Oct 27, 2025

1.2.15 Oct 15, 2025

1.2.14 Oct 01, 2025

1.2.13 Oct 01, 2025

1.2.12 Sep 29, 2025

1.2.11 Sep 18, 2025

1.2.9 Sep 15, 2025

1.2.8 Sep 02, 2025

1.2.7 Aug 28, 2025

1.2.6 Aug 18, 2025

1.2.5 Aug 10, 2025

1.2.4 Aug 07, 2025

1.2.3 Aug 04, 2025

1.2.2 Jul 24, 2025

1.2.1 Jul 17, 2025

1.2.0 Jul 17, 2025

1.1.7 Jul 15, 2025

1.1.6 Jul 10, 2025

1.1.5 Jun 30, 2025

1.1.4 Jun 25, 2025

1.1.3 Jun 21, 2025

1.1.2 Jun 20, 2025

1.1.1 Jun 10, 2025

1.1.0 Jun 10, 2025

Wheel compatibility matrix

Platform	Python 3
any

Files in release

livekit_plugins_baseten-1.6.7-py3-none-any.whl (14.8KiB)

livekit_plugins_baseten-1.6.7.tar.gz (12.2KiB)

Extras: None

Dependencies:

livekit-agents (>=1.6.7)