catsu 0.1.8


pip install catsu


Released: Feb 20, 2026


Meta
Author: Bhavnick Minhas
Requires Python: >=3.11

Classifiers

Development Status
  • 3 - Alpha

Intended Audience
  • Developers

License
  • OSI Approved :: Apache Software License

Programming Language
  • Rust
  • Python :: Implementation :: CPython
  • Python :: 3
  • Python :: 3.11
  • Python :: 3.12
  • Python :: 3.13

Topic
  • Scientific/Engineering :: Artificial Intelligence


🐱 catsu


A unified, batteries-included client for embedding APIs that actually works.

The world of embedding API clients is broken.

  • Everyone defaults to OpenAI's client for embeddings, even though it wasn't designed for that purpose
  • Provider-specific libraries (VoyageAI, Cohere, etc.) are inconsistent, poorly maintained, or outright broken
  • Universal clients like LiteLLM don't focus on embeddings; they rely on native client libraries and inherit all their problems
  • Every provider has different capabilities (some support changing output dimensions, others don't), with no standardized way to discover what's available
  • Most clients lack basic features like retry logic, proper error handling, and usage tracking

Catsu fixes this. It's a high-performance, unified client built specifically for embeddings with:

🎯 A clean, consistent API across all providers
🔄 Built-in retry logic with exponential backoff
💰 Automatic usage and cost tracking
📚 Rich model metadata and capability discovery
⚡ Rust core with Python bindings for maximum performance

Installation

pip install catsu

Quick Start

from catsu import Client

# Create client (reads API keys from environment)
client = Client()

# Generate embeddings
response = client.embed(
    "openai:text-embedding-3-small",
    ["Hello, world!", "How are you?"]
)

print(f"Dimensions: {response.dimensions}")
print(f"Tokens used: {response.usage.tokens}")
print(f"Embedding: {response.embeddings[0][:5]}")

Async Support

import asyncio
from catsu import Client

async def main():
    client = Client()
    response = await client.aembed(
        "openai:text-embedding-3-small",
        "Hello, async world!"
    )
    print(response.embeddings[0][:5])

asyncio.run(main())
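
The async method is most useful when several requests are in flight at once. The following is a minimal sketch of that pattern using `asyncio.gather`. So that it runs offline, it uses a hypothetical `StubClient` that only mimics the `aembed(model, inputs)` call shape shown above and returns plain lists rather than catsu's response object. `embed_batches` is likewise an illustrative helper, not part of catsu.

```python
import asyncio


class StubClient:
    """Stand-in for catsu.Client so the sketch runs offline (hypothetical)."""

    async def aembed(self, model, inputs):
        await asyncio.sleep(0)  # pretend to wait on the network
        # Return one fake 3-dimensional vector per input text.
        return [[float(len(text)), 0.0, 0.0] for text in inputs]


async def embed_batches(client, model, batches):
    """Issue one aembed call per batch and await them concurrently."""
    tasks = [client.aembed(model, batch) for batch in batches]
    # gather preserves the order of the batches in its result list.
    return await asyncio.gather(*tasks)


results = asyncio.run(
    embed_batches(
        StubClient(),
        "openai:text-embedding-3-small",
        [["a", "bb"], ["ccc"]],
    )
)
print(len(results))  # one result per batch
```

With a real client, the same `embed_batches` shape should apply, assuming `aembed` accepts a list of strings as in the Quick Start.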

With Options

response = client.embed(
    "openai:text-embedding-3-small",
    ["Search query"],
    input_type="query",  # "query" or "document"
    dimensions=256,      # output dimensions (if supported)
)

Model Catalog

# List all available models
models = client.list_models()

# Filter by provider
openai_models = client.list_models("openai")
for m in openai_models:
    print(f"{m.name}: {m.dimensions} dims, ${m.cost_per_million_tokens}/M tokens")
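
The per-million-token price in the catalog can be combined with a token count to estimate spend. The sketch below shows only the arithmetic. `estimate_cost` is a hypothetical helper, not a catsu function, and the numbers are made up rather than real pricing.

```python
def estimate_cost(tokens: int, cost_per_million_tokens: float) -> float:
    """Return the estimated cost in dollars for `tokens` tokens."""
    return tokens / 1_000_000 * cost_per_million_tokens


# Illustrative values only: 250,000 tokens at a made-up $0.02 per million.
cost = estimate_cost(250_000, 0.02)
print(f"Estimated cost: ${cost:.4f}")
```

In practice the two arguments would presumably come from `response.usage.tokens` and a model's `cost_per_million_tokens`, as used in the snippets above.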

Configuration

client = Client(
    max_retries=5,   # Default: 3
    timeout=60,      # Default: 30 seconds
)
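
For readers unfamiliar with what `max_retries` implies, here is a generic sketch of retry with exponential backoff. It is not catsu's implementation, which lives in the Rust core. `retry_with_backoff`, `base_delay`, and the plain doubling schedule are assumptions chosen for illustration.

```python
import time


def retry_with_backoff(func, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on failure with exponentially growing delays.

    Hypothetical helper for illustration; not part of catsu.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            # A real client would retry only transient errors
            # (timeouts, rate limits), not every exception.
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
```

The `sleep` parameter exists only so the delay schedule can be inspected without actually waiting.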

NumPy Integration

# Convert embeddings to numpy array
arr = response.to_numpy()
print(arr.shape)  # (2, 1536) for the Quick Start response: 2 inputs, 1536 dimensions
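
Once embeddings are in a 2-D array, a common next step is comparing rows. The sketch below computes pairwise cosine similarity with plain NumPy. `cosine_similarity_matrix` is a hypothetical helper, and the small hand-written array stands in for the output of `response.to_numpy()` so the example runs without an API key.

```python
import numpy as np


def cosine_similarity_matrix(arr: np.ndarray) -> np.ndarray:
    """Return pairwise cosine similarities between the rows of `arr`."""
    norms = np.linalg.norm(arr, axis=1, keepdims=True)
    unit = arr / norms  # assumes no all-zero rows
    return unit @ unit.T


# Stand-in for response.to_numpy(): two tiny made-up "embeddings".
arr = np.array([[1.0, 0.0], [1.0, 1.0]])
sims = cosine_similarity_matrix(arr)
print(sims.shape)
```

Entry `sims[i, j]` is the similarity between input `i` and input `j`, so the diagonal is 1 up to floating-point error.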

Context Manager

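import asyncio
from catsu import Client

# Sync
with Client() as client:
    response = client.embed("openai:text-embedding-3-small", "Hello!")

# Async: `async with` is only valid inside a coroutine
async def main():
    async with Client() as client:
        response = await client.aembed("openai:text-embedding-3-small", "Hello!")

asyncio.run(main())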

If you found this helpful, consider giving it a ⭐!

made with ❤️ by chonkie, inc.

No dependencies