Agent Framework plugin for Baseten
Project Links
Meta
Author: LiveKit
Requires Python: >=3.10.0
Classifiers
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Programming Language
- Python :: 3
- Python :: 3 :: Only
- Python :: 3.10
Topic
- Multimedia :: Sound/Audio
- Multimedia :: Video
- Scientific/Engineering :: Artificial Intelligence
Baseten plugin for LiveKit Agents
Support for Baseten-hosted models in LiveKit Agents, including STT (Speech-to-Text), TTS (Text-to-Speech), and LLM (Large Language Model) integrations.
Installation
pip install livekit-plugins-baseten
Pre-requisites
You'll need an API key from Baseten. It can be set as an environment variable: BASETEN_API_KEY
You also need to deploy a model to Baseten and will need your model endpoint to configure the plugin.
STT (Speech-to-Text)
The STT plugin connects to Baseten's Whisper Streaming WebSocket endpoint for real-time transcription. It works with both truss and chain deployments.
Recommended model
Endpoint URL formats
| Deployment type | URL pattern |
|---|---|
| Truss | wss://model-{model_id}.api.baseten.co/environments/production/websocket |
| Chain | wss://chain-{chain_id}.api.baseten.co/environments/production/websocket |
Basic usage
You can specify the endpoint in three ways:
from livekit.plugins import baseten
# 1. Using a truss model ID (recommended for truss deployments)
stt = baseten.STT(
api_key="your-baseten-api-key", # or set BASETEN_API_KEY env var
model_id="your-model-id",
language="en",
)
# 2. Using a chain ID (recommended for chain deployments)
stt = baseten.STT(
api_key="your-baseten-api-key",
chain_id="your-chain-id",
language="en",
)
# 3. Using a full endpoint URL (for custom routing or deployment URLs)
stt = baseten.STT(
api_key="your-baseten-api-key",
model_endpoint="wss://model-{model_id}.api.baseten.co/environments/production/websocket",
language="en",
)
Configuration options
| Parameter | Default | Description |
|---|---|---|
api_key |
BASETEN_API_KEY env var |
Baseten API key |
model_endpoint |
BASETEN_MODEL_ENDPOINT env var |
Full WebSocket URL (takes priority over model_id/chain_id) |
model_id |
— | Baseten truss model ID; auto-constructs the endpoint URL |
chain_id |
— | Baseten chain ID; auto-constructs the endpoint URL |
language |
"en" |
BCP-47 language code (use "auto" for auto-detection) |
encoding |
"pcm_s16le" |
Audio encoding (pcm_s16le or pcm_mulaw) |
sample_rate |
16000 |
Audio sample rate in Hz |
enable_partial_transcripts |
True |
Emit interim transcripts while the speaker is talking |
partial_transcript_interval_s |
1.0 |
Interval (seconds) between partial transcript updates |
final_transcript_max_duration_s |
30 |
Max seconds of audio before forcing a final transcript |
show_word_timestamps |
True |
Include word-level timestamps in results |
vad_threshold |
0.5 |
Server-side VAD speech probability threshold (0.0–1.0) |
vad_min_silence_duration_ms |
300 |
Minimum silence (ms) to mark end of speech |
vad_speech_pad_ms |
30 |
Padding (ms) added around detected speech |
Full voice pipeline example
import os
from livekit import agents
from livekit.agents import AgentSession, Agent, RoomInputOptions
from livekit.plugins import baseten, openai, noise_cancellation, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel
BASETEN_API_KEY = os.getenv("BASETEN_API_KEY")
whisper_model_id = "your-whisper-model-id" # or use chain_id for chain deployments
orpheus_model_id = "your-orpheus-model-id"
class Assistant(Agent):
def __init__(self) -> None:
super().__init__(instructions="You are a helpful voice AI assistant.")
async def entrypoint(ctx: agents.JobContext):
session = AgentSession(
stt=baseten.STT(
api_key=BASETEN_API_KEY,
model_id=whisper_model_id, # or chain_id="your-chain-id"
language="en",
enable_partial_transcripts=True,
),
llm=openai.LLM(
api_key=BASETEN_API_KEY,
base_url="https://inference.baseten.co/v1",
model="openai/gpt-oss-120b",
),
tts=baseten.TTS(
api_key=BASETEN_API_KEY,
model_endpoint=(
f"https://model-{orpheus_model_id}"
".api.baseten.co/environments/production/predict"
),
),
vad=silero.VAD.load(),
turn_detection=MultilingualModel(),
)
await session.start(
room=ctx.room,
agent=Assistant(),
room_input_options=RoomInputOptions(
noise_cancellation=noise_cancellation.BVC(),
),
)
await session.generate_reply(
instructions="Greet the user and offer your assistance."
)
if __name__ == "__main__":
agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
TTS (Text-to-Speech)
The TTS plugin calls Baseten-hosted TTS models (e.g. Orpheus 3B) over HTTP.
tts = baseten.TTS(
api_key="your-baseten-api-key",
model_endpoint="https://model-{model_id}.api.baseten.co/environments/production/predict",
voice="tara",
language="en",
)
LLM (Large Language Model)
The LLM plugin wraps Baseten's OpenAI-compatible inference endpoint.
llm = baseten.LLM(
api_key="your-baseten-api-key",
model="openai/gpt-oss-120b",
)
Documentation
1.5.12
May 21, 2026
1.5.11
May 19, 2026
1.5.10
May 18, 2026
1.5.9
May 13, 2026
1.5.8
May 05, 2026
1.5.7
Apr 30, 2026
1.5.6
Apr 22, 2026
1.5.5
Apr 20, 2026
1.5.4
Apr 16, 2026
1.5.3
Apr 15, 2026
1.5.2
Apr 08, 2026
1.5.1
Mar 23, 2026
1.5.0
Mar 19, 2026
1.5.0rc2
Mar 06, 2026
1.5.0rc1
Feb 13, 2026
1.4.6
Mar 16, 2026
1.4.5
Mar 11, 2026
1.4.4
Mar 03, 2026
1.4.3
Feb 23, 2026
1.4.2
Feb 17, 2026
1.4.1
Feb 06, 2026
1.4.0
Feb 06, 2026
1.4.0rc2
Jan 23, 2026
1.4.0rc1
Dec 23, 2025
1.3.12
Jan 21, 2026
1.3.11
Jan 14, 2026
1.3.10
Dec 23, 2025
1.3.9
Dec 19, 2025
1.3.8
Dec 17, 2025
1.3.7
Dec 16, 2025
1.3.6
Dec 03, 2025
1.3.5
Nov 25, 2025
1.3.4
Nov 24, 2025
1.3.3
Nov 19, 2025
1.3.2
Nov 17, 2025
1.3.1
Nov 17, 2025
1.3.0rc2
Nov 15, 2025
1.3.0rc1
Nov 06, 2025
1.2.18
Nov 05, 2025
1.2.17
Oct 29, 2025
1.2.16
Oct 27, 2025
1.2.15
Oct 15, 2025
1.2.14
Oct 01, 2025
1.2.13
Oct 01, 2025
1.2.12
Sep 29, 2025
1.2.11
Sep 18, 2025
1.2.9
Sep 15, 2025
1.2.8
Sep 02, 2025
1.2.7
Aug 28, 2025
1.2.6
Aug 18, 2025
1.2.5
Aug 10, 2025
1.2.4
Aug 07, 2025
1.2.3
Aug 04, 2025
1.2.2
Jul 24, 2025
1.2.1
Jul 17, 2025
1.2.0
Jul 17, 2025
1.1.7
Jul 15, 2025
1.1.6
Jul 10, 2025
1.1.5
Jun 30, 2025
1.1.4
Jun 25, 2025
1.1.3
Jun 21, 2025
1.1.2
Jun 20, 2025
1.1.1
Jun 10, 2025
1.1.0
Jun 10, 2025