langchain-litellm 0.6.6


pip install langchain-litellm

  Latest version

Released: May 21, 2026


Meta
Author: Akshay Dongare
Requires Python: <4.0,>=3.10

Classifiers

Development Status
  • 4 - Beta

Intended Audience
  • Developers

License
  • OSI Approved :: MIT License

Natural Language
  • English

Operating System
  • OS Independent

Programming Language
  • Python :: 3
  • Python :: 3 :: Only
  • Python :: 3.10
  • Python :: 3.11
  • Python :: 3.12
  • Python :: 3.13
  • Python :: 3.14

Topic
  • Scientific/Engineering :: Artificial Intelligence
  • Software Development :: Libraries :: Python Modules

Typing
  • Typed

langchain-litellm

๐Ÿ“ฆ Distribution ๐Ÿ”ง Project ๐Ÿš€ Activity
PyPI Package Version
PyPI Downloads
License: MIT
Platform
Python
uv
GitHub Issues Closed
GitHub Issues Open
GitHub PRs Open
GitHub PRs Closed

๐Ÿค” What is this?

This package contains the LangChain integration with LiteLLM. LiteLLM is a library that simplifies calling Anthropic, Azure, Huggingface, Replicate, etc.

๐Ÿ“– Documentation

For conceptual guides, tutorials, and examples on using these classes, see the LangChain Docs.

Advanced Features

Embeddings

Supported in v0.6.1+

Use LiteLLMEmbeddings to embed text across 100+ providers with a single, consistent interface. All configuration is explicit -- no environment variables required.

from langchain_litellm import LiteLLMEmbeddings

embeddings = LiteLLMEmbeddings(
    model="openai/text-embedding-3-small",
    api_key="sk-...",
)

vectors = embeddings.embed_documents(["hello", "world"])
query_vector = embeddings.embed_query("hello")

Switch providers by changing model -- the interface stays the same:

# Cohere
embeddings = LiteLLMEmbeddings(
    model="cohere/embed-english-v3.0",
    api_key="...",
    document_input_type="search_document",
    query_input_type="search_query",
)

# Azure OpenAI
embeddings = LiteLLMEmbeddings(
    model="azure/my-embedding-deployment",
    api_key="...",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-02-01",
)

# Bedrock
embeddings = LiteLLMEmbeddings(
    model="bedrock/amazon.titan-embed-text-v1",
)

For load-balancing across multiple deployments of the same model, use LiteLLMEmbeddingsRouter:

from litellm import Router
from langchain_litellm import LiteLLMEmbeddingsRouter

router = Router(model_list=[
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {
            "model": "openai/text-embedding-3-small",
            "api_key": "sk-key1",
        },
    },
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {
            "model": "openai/text-embedding-3-small",
            "api_key": "sk-key2",
        },
    },
])

embeddings = LiteLLMEmbeddingsRouter(router=router)
Vertex AI Grounding (Google Search)

Supported in v0.3.5+

You can use Google Search grounding with Vertex AI models (e.g., gemini-2.5-flash). Citations and metadata are returned in response_metadata (Batch) or additional_kwargs (Streaming).

Setup

import os
from langchain_litellm import ChatLiteLLM

os.environ["VERTEX_PROJECT"] = "your-project-id"
os.environ["VERTEX_LOCATION"] = "us-central1"

llm = ChatLiteLLM(model="vertex_ai/gemini-2.5-flash", temperature=0)

Batch Usage

# Invoke with Google Search tool enabled
response = llm.invoke(
    "What is the current stock price of Google?",
    tools=[{"googleSearch": {}}]
)

# Access Citations & Metadata
provider_fields = response.response_metadata.get("provider_specific_fields")
if provider_fields:
    # Vertex returns a list; the first item contains the grounding info
    print(provider_fields[0])

Streaming Usage

stream = llm.stream(
    "What is the current stock price of Google?",
    tools=[{"googleSearch": {}}]
)

for chunk in stream:
    print(chunk.content, end="", flush=True)
    # Metadata is injected into the chunk where it arrives
    if "provider_specific_fields" in chunk.additional_kwargs:
        print("\n[Metadata Found]:", chunk.additional_kwargs["provider_specific_fields"])

๐Ÿ“• Releases & Versioning

See our Releases and Versioning policies.

๐Ÿ’ Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see the Contributing Guide.

Wheel compatibility matrix

Platform Python 3
any

Files in release

Extras: None
Dependencies:
cryptography (<49.0.0,>=46.0.5)
httpx (<0.29.0,>=0.28.1)
langchain-core (<2.0.0,>=1.2.21)
litellm (!=1.82.7,!=1.82.8,<2.0.0,>=1.83.14)