GraphRAG SDK

A Graph RAG framework built on FalkorDB, ranked #1 on GraphRAG-Bench Novel (63.73 ACC).

GraphRAG SDK builds knowledge graphs from documents and answers questions over them using retrieval-augmented generation. Every algorithmic concern (loading, chunking, extraction, resolution, retrieval, reranking) is a swappable strategy behind an abstract interface. The default pipeline scores ~85% accuracy on a 100-question benchmark using GPT-4.1.

Quick Start

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        result = await rag.ingest("my_document.txt")
        print(f"Created {result.nodes_created} nodes, {result.relationships_created} edges")

        answer = await rag.completion("What is the main theme?")
        print(answer.answer)

asyncio.run(main())

Installation

pip install "graphrag-sdk[litellm]"       # OpenAI, Azure, Anthropic, 100+ models
pip install "graphrag-sdk[openrouter]"    # OpenRouter models
pip install "graphrag-sdk[pdf]"           # PDF ingestion
pip install "graphrag-sdk[all]"           # Everything (quotes keep zsh from globbing the brackets)

Prerequisites

Python >= 3.10
FalkorDB: docker run -p 6379:6379 falkordb/falkordb
An LLM API key (OpenAI, Azure OpenAI, OpenRouter, etc.)

Usage

Ingest & Query

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-4o"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-small"),
    ) as rag:
        await rag.ingest("report.pdf")                              # PDF
        await rag.ingest("source_id", text="Alice works at Acme.")  # Raw text
        await rag.finalize()                                         # Dedup + index

        # Retrieve context only
        context = await rag.retrieve("Where does Alice work?")

        # Full RAG: retrieve + generate answer
        result = await rag.completion("Where does Alice work?")
        print(result.answer)

asyncio.run(main())

Multi-Turn Conversations

completion() supports multi-turn conversations. With the built-in providers (LiteLLM, OpenRouterLLM), messages are passed natively to the LLM's chat API. Custom providers that only implement invoke() get automatic fallback via message concatenation.

from graphrag_sdk import ChatMessage

answer = await rag.completion(
    "What happened next?",
    history=[
        ChatMessage(role="user", content="Who is Alice?"),
        ChatMessage(role="assistant", content="Alice is an engineer at Acme Corp."),
    ],
)

Supported roles: "system", "user", "assistant". Invalid roles raise ValueError.
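
The role check and the concatenation fallback described above can be illustrated with a small standalone sketch. It does not import the SDK: `Message`, `ALLOWED_ROLES`, and `concatenate` are hypothetical stand-ins, and the `role: content` prompt layout is an assumption, not necessarily the format the SDK produces.

```python
from dataclasses import dataclass

# Roles listed in this README; anything else is rejected.
ALLOWED_ROLES = {"system", "user", "assistant"}


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for the SDK's ChatMessage."""
    role: str
    content: str

    def __post_init__(self) -> None:
        if self.role not in ALLOWED_ROLES:
            raise ValueError(f"Invalid role: {self.role!r}")


def concatenate(history: list[Message], question: str) -> str:
    """Flatten a conversation into one prompt string for an invoke()-only LLM."""
    lines = [f"{m.role}: {m.content}" for m in history]
    lines.append(f"user: {question}")
    return "\n".join(lines)


prompt = concatenate(
    [
        Message("user", "Who is Alice?"),
        Message("assistant", "Alice is an engineer at Acme Corp."),
    ],
    "What happened next?",
)
print(prompt)
```

A provider that only accepts a single string can then receive `prompt` directly, at the cost of losing the structured turn boundaries a native chat API would see.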

Schema Definition

from graphrag_sdk import GraphSchema, EntityType, RelationType

schema = GraphSchema(
    entities=[
        EntityType(label="Person", description="A human being"),
        EntityType(label="Organization", description="A company or institution"),
    ],
    relations=[
        RelationType(
            label="WORKS_AT",
            description="Is employed by",
            patterns=[("Person", "Organization")],
        ),
    ],
)

rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, schema=schema)  # conn, llm, embedder built as in Quick Start
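
One way to read `patterns` is as a whitelist of (source label, target label) pairs for each relation. The sketch below shows that reading with a hypothetical `Relation` stand-in and an `is_allowed` helper; how the SDK actually applies patterns during extraction is not described here, so treat this as an illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """Hypothetical stand-in for RelationType (label plus allowed endpoint pairs)."""
    label: str
    patterns: list[tuple[str, str]] = field(default_factory=list)


def is_allowed(relations: list[Relation], source: str, label: str, target: str) -> bool:
    """Return True if (source)-[label]->(target) matches a declared pattern."""
    return any(
        r.label == label and (source, target) in r.patterns
        for r in relations
    )


relations = [Relation("WORKS_AT", [("Person", "Organization")])]

print(is_allowed(relations, "Person", "WORKS_AT", "Organization"))  # True
print(is_allowed(relations, "Organization", "WORKS_AT", "Person"))  # False
```

Patterns are directional in this sketch: reversing source and target does not match unless the reversed pair is declared as well.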

Strategy Customization

Override any pipeline step by passing a strategy:

from graphrag_sdk.ingestion.chunking_strategies.fixed_size import FixedSizeChunking
from graphrag_sdk import GraphExtraction, LLMExtractor
from graphrag_sdk.ingestion.resolution_strategies import SemanticResolution

# Custom chunking
await rag.ingest("doc.txt", chunker=FixedSizeChunking(chunk_size=1500, chunk_overlap=200))

# LLM-based entity extraction instead of the default GLiNER2
await rag.ingest("doc.txt", extractor=GraphExtraction(llm=llm, entity_extractor=LLMExtractor(llm)))

Strategy Reference

Every algorithmic concern is a swappable strategy behind an abstract base class:

Concern	ABC	Built-in Options	Default
Loading	`LoaderStrategy`	`TextLoader`, `PdfLoader`	Auto-detect by extension
Chunking	`ChunkingStrategy`	`FixedSizeChunking`, `SentenceTokenCapChunking`, `ContextualChunking`, `CallableChunking`	`FixedSizeChunking`
Extraction	`ExtractionStrategy`	`GraphExtraction` (GLiNER2 + LLM)	`GraphExtraction`
Resolution	`ResolutionStrategy`	`ExactMatchResolution`, `DescriptionMergeResolution`, `SemanticResolution`, `LLMVerifiedResolution`	`ExactMatchResolution`
Retrieval	`RetrievalStrategy`	`LocalRetrieval`, `MultiPathRetrieval`	`MultiPathRetrieval` (5-path)
Reranking	`RerankingStrategy`	`CosineReranker`	`CosineReranker`
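
The strategy pattern behind this table can be sketched without the SDK. In the example below, `Chunker`, `FixedSize`, `Paragraphs`, and `count_chunks` are hypothetical names, and the window is measured in characters as an assumption; the SDK's `ChunkingStrategy` may use a different method name, signature, or unit.

```python
from abc import ABC, abstractmethod


class Chunker(ABC):
    """Hypothetical stand-in for an abstract chunking interface."""

    @abstractmethod
    def chunk(self, text: str) -> list[str]:
        ...


class FixedSize(Chunker):
    """Character windows of `size`, each overlapping the previous by `overlap`."""

    def __init__(self, size: int, overlap: int = 0) -> None:
        if size <= 0 or not 0 <= overlap < size:
            raise ValueError("require size > 0 and 0 <= overlap < size")
        self.size = size
        self.overlap = overlap

    def chunk(self, text: str) -> list[str]:
        if not text:
            return []
        step = self.size - self.overlap
        # Stop before a window that would only repeat the previous overlap.
        last_start = max(len(text) - self.overlap, 1)
        return [text[i:i + self.size] for i in range(0, last_start, step)]


class Paragraphs(Chunker):
    """Split on blank lines, dropping empty pieces."""

    def chunk(self, text: str) -> list[str]:
        return [p.strip() for p in text.split("\n\n") if p.strip()]


def count_chunks(text: str, chunker: Chunker) -> int:
    """Toy pipeline step: depends only on the abstract interface."""
    return len(chunker.chunk(text))


print(FixedSize(size=4, overlap=1).chunk("abcdefghij"))
print(count_chunks("First.\n\nSecond.", Paragraphs()))
```

Because `count_chunks` only calls `chunk()`, either implementation can be passed in without changing the caller, which is the property the table relies on for every concern.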

LLM & Embedding Providers

Provider	LLM Class	Embedder Class	Models
LiteLLM	`LiteLLM`	`LiteLLMEmbedder`	OpenAI, Azure, Anthropic, Cohere, 100+
OpenRouter	`OpenRouterLLM`	`OpenRouterEmbedder`	All OpenRouter models
Custom	Subclass `LLMInterface`	Subclass `Embedder`	Anything
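
The shape of a custom provider can be sketched as follows. `BaseLLM` and `CannedLLM` are hypothetical stand-ins that only mirror the `invoke()` method mentioned under Multi-Turn Conversations; the real `LLMInterface` may require other methods or an async signature, so check `04_custom_provider.py` for the actual contract.

```python
from abc import ABC, abstractmethod


class BaseLLM(ABC):
    """Hypothetical stand-in; not the SDK's LLMInterface."""

    @abstractmethod
    def invoke(self, prompt: str) -> str:
        ...


class CannedLLM(BaseLLM):
    """Offline provider that returns fixed answers, useful in tests."""

    def __init__(self, answers: dict[str, str], default: str = "I don't know.") -> None:
        self.answers = answers
        self.default = default

    def invoke(self, prompt: str) -> str:
        # Return the first canned answer whose key appears in the prompt.
        for key, answer in self.answers.items():
            if key.lower() in prompt.lower():
                return answer
        return self.default


llm = CannedLLM({"alice": "Alice works at Acme."})
print(llm.invoke("Where does Alice work?"))  # Alice works at Acme.
```

A deterministic provider like this makes pipeline tests repeatable and avoids API calls; a real subclass would forward the prompt to a model endpoint instead.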

Benchmark

#1 on GraphRAG-Bench Novel: 63.73 ACC, ahead of MS-GraphRAG (50.93) and LightRAG (45.09).

Metric	Value
Novel ACC	63.73 (#1)
Fact retrieval	65.22
Complex reasoning	58.63
Contextual summarization	69.54
Creative generation	57.08
Questions	2,010 across 20 novels

See docs/benchmark.md for methodology and reproduction.

Examples

#	Example	Description
1	`01_quickstart.py`	Minimal ingest & query
2	`02_pdf_with_schema.py`	PDF with custom schema
3	`03_custom_strategies.py`	Benchmark-winning pipeline
4	`04_custom_provider.py`	Custom LLM/Embedder
5	`05_notebook_demo.ipynb`	Interactive notebook walkthrough