Requires Python: >=3.10
LlamaAgents Server
HTTP server for deploying LlamaIndex Workflows as web services. Built on Starlette and Uvicorn.
Installation
```shell
pip install llama-agents-server
```
Quick Start
Create a server file (e.g., my_server.py):
```python
import asyncio

from workflows import Workflow, step
from workflows.context import Context
from workflows.events import Event, StartEvent, StopEvent

from llama_agents.server import WorkflowServer


class StreamEvent(Event):
    sequence: int


class GreetingWorkflow(Workflow):
    @step
    async def greet(self, ctx: Context, ev: StartEvent) -> StopEvent:
        for i in range(3):
            ctx.write_event_to_stream(StreamEvent(sequence=i))
        name = ev.get("name", "World")
        return StopEvent(result=f"Hello, {name}!")


server = WorkflowServer()
server.add_workflow("greet", GreetingWorkflow())

if __name__ == "__main__":
    asyncio.run(server.serve("0.0.0.0", 8080))
```
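Conceptually, `ctx.write_event_to_stream` hands events to a queue that the server drains to connected clients while the step is still running, with the final `StopEvent` result arriving last. A stripped-down sketch of that producer/consumer pattern using only the standard library (the queue, sentinel, and field names here are illustrative, not the package's actual internals):

```python
import asyncio


async def greet(queue: asyncio.Queue, name: str = "World") -> str:
    # Emit intermediate events while the "step" is still running,
    # mirroring what ctx.write_event_to_stream does in the workflow above.
    for i in range(3):
        await queue.put({"sequence": i})
    await queue.put(None)  # sentinel: no more stream events
    return f"Hello, {name}!"


async def main() -> tuple[list, str]:
    queue: asyncio.Queue = asyncio.Queue()
    # Run the step as a task so the consumer can read events concurrently.
    task = asyncio.create_task(greet(queue, "Ada"))
    events = []
    while (ev := await queue.get()) is not None:
        events.append(ev)
    # The final result becomes available once the step completes.
    return events, await task


events, result = asyncio.run(main())
assert result == "Hello, Ada!"
assert [e["sequence"] for e in events] == [0, 1, 2]
```

The point of the pattern is that clients see intermediate events as they happen instead of waiting for the workflow's final result.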
Or run it with the CLI:
```shell
llama-agents-server my_server.py
```
Features
- REST API for running, streaming, and managing workflows
- Debugger UI automatically mounted at `/` for visualizing and debugging workflows
- Event streaming via newline-delimited JSON or Server-Sent Events
- Human-in-the-loop support for interactive workflows
- Persistence with built-in SQLite store (or bring your own via `AbstractWorkflowStore`)
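The two streaming formats carry the same events with different framing: NDJSON is one JSON object per line, while SSE separates events with blank lines and prefixes the payload with `data:`. A minimal client-side parser for each in plain Python (the `sequence` payload field is illustrative, not the server's actual wire schema):

```python
import json


def parse_ndjson(stream: str) -> list[dict]:
    # Newline-delimited JSON: each non-empty line is a complete JSON object.
    return [json.loads(line) for line in stream.splitlines() if line.strip()]


def parse_sse(stream: str) -> list[dict]:
    # Server-Sent Events: events are separated by blank lines, and the
    # payload follows a "data:" field prefix.
    events = []
    for block in stream.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(json.loads(line[len("data:"):].strip()))
    return events


ndjson = '{"sequence": 0}\n{"sequence": 1}\n'
sse = 'data: {"sequence": 0}\n\ndata: {"sequence": 1}\n\n'
assert parse_ndjson(ndjson) == parse_sse(sse) == [{"sequence": 0}, {"sequence": 1}]
```

NDJSON is the simpler format to consume from scripts; SSE is convenient in browsers, where `EventSource` handles the framing for you.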
Client
Use llama-agents-client to interact with deployed servers programmatically.
Documentation
See the full deployment guide for API details, persistence configuration, and more.