sql-redis 0.3.0


pip install sql-redis

  Latest version

Released: Mar 16, 2026


Meta
Author: Redis Inc.
Requires Python: >=3.9, <3.14

Classifiers

License
  • OSI Approved :: MIT License

Programming Language
  • Python :: 3.9
  • Python :: 3.10
  • Python :: 3.11
  • Python :: 3.12
  • Python :: 3.13

sql-redis

A proof-of-concept SQL-to-Redis translator that converts SQL SELECT statements into Redis FT.SEARCH and FT.AGGREGATE commands.

Status

This is an early POC demonstrating feasibility, not a production-ready library. The goal is to explore design decisions and validate the approach before committing to a full implementation.

Quick Example

from redis import Redis
from sql_redis import Translator
from sql_redis.schema import SchemaRegistry
from sql_redis.executor import Executor

client = Redis()
registry = SchemaRegistry(client)
registry.load_all()  # Loads index schemas from Redis

executor = Executor(client, registry)

# Simple query
result = executor.execute("""
    SELECT title, price
    FROM products
    WHERE category = 'electronics' AND price < 500
    ORDER BY price ASC
    LIMIT 10
""")

for row in result.rows:
    print(row["title"], row["price"])

# Vector search with params
result = executor.execute("""
    SELECT title, vector_distance(embedding, :vec) AS score
    FROM products
    LIMIT 5
""", params={"vec": vector_bytes})

Design Decisions

Why SQL instead of a pandas-like Python DSL?

We considered several interface options:

Approach | Example | Trade-offs
SQL | SELECT * FROM products WHERE price > 100 | Universal, well-understood, existing tooling
Pandas-like | df[df.price > 100] | Pythonic, but Python-only and non-standard
Builder pattern | query.select("*").where(price__gt=100) | Type-safe, but verbose with a learning curve

We chose SQL because:

  1. Universality — SQL is the lingua franca of data. Developers, analysts, and tools all speak it.
  2. No new DSL to learn — Users already know SQL. A pandas-like API requires learning our specific dialect.
  3. Tooling compatibility — SQL strings can be generated by ORMs, query builders, or AI assistants.
  4. Clear mapping — SQL semantics map reasonably well to RediSearch operations (SELECT→LOAD, WHERE→filter, GROUP BY→GROUPBY).

The downside is losing Python's type checking and IDE support, but for a query interface, the universality trade-off is worth it.

Why sqlglot instead of writing a custom parser?

Options considered:

  • Custom parser (regex, hand-rolled recursive descent)
  • PLY/Lark (parser generators)
  • sqlglot (production SQL parser)
  • sqlparse (tokenizer, not a full parser)

We chose sqlglot because:

  1. Battle-tested — Used in production by companies like Tobiko (SQLMesh). Handles edge cases we'd miss.
  2. Full AST — Provides a complete abstract syntax tree, not just tokens. We can traverse and analyze queries properly.
  3. Dialect support — Handles SQL variations. Users can write MySQL-style or PostgreSQL-style queries.
  4. Active maintenance — Regular releases, responsive maintainers, good documentation.

The alternative was writing a custom parser, which would be error-prone and time-consuming for a POC. sqlglot lets us focus on the translation logic rather than parsing edge cases.

Why schema-aware translation?

Redis field types determine query syntax:

Field Type | Redis Syntax | Example
TEXT | @field:term | @title:laptop
NUMERIC | @field:[min max] | @price:[100 500]
TAG | @field:{value} | @category:{books}

Without schema knowledge, we can't translate category = 'books' correctly — it could be @category:books (TEXT search) or @category:{books} (TAG exact match).

Our approach: The SchemaRegistry fetches index schemas via FT.INFO at startup. The translator uses this to generate correct syntax per field type.

This adds a Redis round-trip at initialization but ensures correct query generation.
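
As a toy illustration of that type dispatch (a sketch of the idea, not the library's actual QueryBuilder):

```python
# Illustrative only: generate RediSearch filter syntax based on the field
# type that SchemaRegistry learned from FT.INFO.
def build_equality_filter(field: str, value: str, field_type: str) -> str:
    if field_type == "TAG":
        return f"@{field}:{{{value}}}"        # exact match
    if field_type == "TEXT":
        return f"@{field}:{value}"            # full-text term match
    if field_type == "NUMERIC":
        return f"@{field}:[{value} {value}]"  # equality as a degenerate range
    raise ValueError(f"unsupported field type: {field_type}")

build_equality_filter("category", "books", "TAG")  # '@category:{books}'
```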

Architecture: Why this layered design?

SQL String
    ↓
┌─────────────────┐
│   SQLParser     │  Parse SQL → ParsedQuery dataclass
└────────┬────────┘
         ↓
┌─────────────────┐
│ SchemaRegistry  │  Load field types from Redis
└────────┬────────┘
         ↓
┌─────────────────┐
│    Analyzer     │  Classify conditions by field type
└────────┬────────┘
         ↓
┌─────────────────┐
│  QueryBuilder   │  Generate RediSearch syntax per type
└────────┬────────┘
         ↓
┌─────────────────┐
│   Translator    │  Orchestrate pipeline, build command
└────────┬────────┘
         ↓
┌─────────────────┐
│    Executor     │  Execute command, parse results
└────────┬────────┘
         ↓
QueryResult(rows, count)

Why separate components?

  1. Testability — Each layer has focused unit tests. 100% coverage is achievable because responsibilities are clear.
  2. Single responsibility — Parser doesn't know about Redis. QueryBuilder doesn't know about SQL. Changes are localized.
  3. Extensibility — Adding a new field type (e.g., GEO) means updating Analyzer and QueryBuilder, not rewriting everything.

Why not a single monolithic translator?

Early prototypes combined parsing and translation. This led to:

  • Tests that required Redis connections for simple SQL parsing tests
  • Difficulty testing edge cases in isolation
  • Tangled code that was hard to modify

The layered approach emerged from TDD — writing tests first revealed natural boundaries.

What's Implemented

  • Basic SELECT with field selection
  • WHERE with TEXT, NUMERIC, TAG field types
  • Comparison operators: =, !=, <, <=, >, >=, BETWEEN, IN
  • Boolean operators: AND, OR
  • Aggregations: COUNT, SUM, AVG, MIN, MAX
  • GROUP BY with multiple aggregations
  • ORDER BY with ASC/DESC
  • LIMIT and OFFSET pagination
  • Computed fields: price * 0.9 AS discounted
  • Vector KNN search: vector_distance(field, :param)
  • Hybrid search (filters + vector)
  • Full-text search: LIKE 'prefix%' (prefix), fulltext(field, 'terms') function
  • GEO field queries with full operator support (see below)
  • Date functions: YEAR(), MONTH(), DAY(), DATE_FORMAT(), etc. (see below)
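
For instance, the prefix LIKE translation can be sketched as follows (hypothetical helper, not the library's API; RediSearch uses a trailing * for prefix matching):

```python
def like_to_redisearch(pattern: str) -> str:
    # 'lap%' -> 'lap*'; only simple prefix patterns translate cleanly.
    if not pattern.endswith("%") or "%" in pattern[:-1] or "_" in pattern:
        raise ValueError("only prefix LIKE patterns are supported")
    return pattern[:-1] + "*"

like_to_redisearch("lap%")  # 'lap*'
```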

What's Not Implemented (Yet...)

  • JOINs (Redis doesn't support cross-index joins)
  • Subqueries
  • HAVING clause
  • DISTINCT
  • Index creation from SQL (CREATE INDEX)

DATE/DATETIME Handling

Redis does not have a native DATE field type. Dates are stored as NUMERIC fields with Unix timestamps.

sql-redis automatically converts ISO 8601 date literals to Unix timestamps:

-- Date literal (automatically converted to timestamp 1704067200)
SELECT * FROM events WHERE created_at > '2024-01-01'

-- Datetime literal with time
SELECT * FROM events WHERE created_at > '2024-01-01T12:00:00'

-- Date range with BETWEEN
SELECT * FROM events WHERE created_at BETWEEN '2024-01-01' AND '2024-01-31'

-- Multiple date conditions
SELECT * FROM events WHERE created_at > '2024-01-01' AND created_at < '2024-12-31'

Supported date formats:

  • Date: '2024-01-01' (interpreted as midnight UTC)
  • Datetime: '2024-01-01T12:00:00' or '2024-01-01 12:00:00'
  • Datetime with timezone: '2024-01-01T12:00:00Z', '2024-01-01T12:00:00+00:00'

Note: All dates without timezone are interpreted as UTC. You can also use raw Unix timestamps if preferred:

SELECT * FROM events WHERE created_at > 1704067200
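
The conversion itself is straightforward with the standard library (a sketch of the idea, not the library's actual code):

```python
from datetime import datetime, timezone

def date_literal_to_timestamp(literal: str) -> int:
    """Convert an ISO 8601 date/datetime literal to a Unix timestamp."""
    # fromisoformat() on Python < 3.11 does not accept a trailing 'Z'
    dt = datetime.fromisoformat(literal.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # naive literals are UTC
    return int(dt.timestamp())

date_literal_to_timestamp("2024-01-01")  # 1704067200
```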

Date Functions

Extract date parts using SQL functions that map to Redis APPLY expressions:

SQL Function | Redis Function | Description
YEAR(field) | year(@field) | Extract year (e.g., 2024)
MONTH(field) | monthofyear(@field) | Extract month (0-11)
DAY(field) | dayofmonth(@field) | Extract day of month (1-31)
HOUR(field) | hour(@field) | Round down to the hour
MINUTE(field) | minute(@field) | Round down to the minute
DAYOFWEEK(field) | dayofweek(@field) | Day of week (0 = Sunday)
DAYOFYEAR(field) | dayofyear(@field) | Day of year (0-365)
DATE_FORMAT(field, fmt) | timefmt(@field, fmt) | Format timestamp as a string

Examples:

-- Extract year and month
SELECT name, YEAR(created_at) AS year, MONTH(created_at) AS month FROM events

-- Filter by year
SELECT name FROM events WHERE YEAR(created_at) = 2024

-- Group by date parts
SELECT YEAR(created_at) AS year, COUNT(*) FROM events GROUP BY year

-- Format dates
SELECT name, DATE_FORMAT(created_at, '%Y-%m-%d') AS date FROM events

Note: Redis's monthofyear() returns 0-11 (not 1-12), and dayofweek() returns 0 for Sunday.

Limitations

  • NOT YEAR(field) = 2024 is not supported (raises ValueError)
  • DATE_FORMAT() is only supported in SELECT, not in WHERE (raises ValueError)
  • Date functions combined with OR are not supported (raises ValueError)

GEO Field Support

GEO fields are fully implemented with standard SQL-like syntax:

Feature | Status
Coordinate order | ✅ POINT(lon, lat), matches Redis native format
Default unit | ✅ Meters (m), SQL standard
All operators | ✅ <, <=, >, >=, BETWEEN
Distance calculation | ✅ geo_distance() in SELECT clause
Combined filters | ✅ GEO + TEXT/TAG/NUMERIC

Coordinate Order: POINT(lon, lat)

Use longitude first, matching Redis's native GEO format:

-- San Francisco coordinates: lon=-122.4194, lat=37.7749
SELECT name FROM stores WHERE geo_distance(location, POINT(-122.4194, 37.7749)) < 5000

Units

Unit | Code | Example
Meters | m | geo_distance(location, POINT(-122.4194, 37.7749)) < 5000
Kilometers | km | geo_distance(location, POINT(-122.4194, 37.7749), 'km') < 5
Miles | mi | geo_distance(location, POINT(-122.4194, 37.7749), 'mi') < 3
Feet | ft | geo_distance(location, POINT(-122.4194, 37.7749), 'ft') < 16400

Default is meters when no unit is specified.

Operators

All comparison operators are supported:

-- Less than (uses optimized GEOFILTER)
SELECT name FROM stores WHERE geo_distance(location, POINT(-122.4194, 37.7749)) < 5000

-- Less than or equal (uses optimized GEOFILTER)
SELECT name FROM stores WHERE geo_distance(location, POINT(-122.4194, 37.7749)) <= 5000

-- Greater than (uses FT.AGGREGATE with FILTER)
SELECT name FROM stores WHERE geo_distance(location, POINT(-122.4194, 37.7749)) > 100000

-- Greater than or equal (uses FT.AGGREGATE with FILTER)
SELECT name FROM stores WHERE geo_distance(location, POINT(-122.4194, 37.7749)) >= 100000

-- Between (uses FT.AGGREGATE with FILTER)
SELECT name FROM stores WHERE geo_distance(location, POINT(-122.4194, 37.7749), 'km') BETWEEN 10 AND 100
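
For the optimized < and <= cases, RediSearch provides a radius query of the form @field:[lon lat radius unit]. A minimal sketch of generating it (our helper, not the library's internals):

```python
def geo_radius_filter(field: str, lon: float, lat: float,
                      radius: float, unit: str = "m") -> str:
    # RediSearch radius query: @field:[lon lat radius unit]
    if unit not in ("m", "km", "mi", "ft"):
        raise ValueError(f"unsupported unit: {unit}")
    return f"@{field}:[{lon} {lat} {radius} {unit}]"

geo_radius_filter("location", -122.4194, 37.7749, 5000)
# '@location:[-122.4194 37.7749 5000 m]'
```

The > / >= / BETWEEN cases cannot use this syntax, which is why they fall back to FT.AGGREGATE with a computed distance and FILTER.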

Distance Calculation in SELECT

Calculate distances for all results using geo_distance() in the SELECT clause:

-- Get distance to each store (returns meters)
SELECT name, geo_distance(location, POINT(-122.4194, 37.7749)) AS distance
FROM stores

-- With explicit unit
SELECT name, geo_distance(location, POINT(-122.4194, 37.7749), 'km') AS distance_km
FROM stores

Combined Filters

Combine GEO filters with other field types:

-- GEO + TAG filter
SELECT name FROM stores
WHERE category = 'retail' AND geo_distance(location, POINT(-122.4194, 37.7749)) < 5000

-- GEO + NUMERIC filter
SELECT name FROM stores
WHERE rating >= 4.0 AND geo_distance(location, POINT(-122.4194, 37.7749), 'mi') < 10

-- GEO + TEXT filter
SELECT name FROM stores
WHERE name = 'Downtown' AND geo_distance(location, POINT(-122.4194, 37.7749)) < 10000

Development

# Install dependencies
uv sync --all-extras

# Run tests (requires Docker for testcontainers)
uv run pytest

# Run with coverage
uv run pytest --cov=sql_redis --cov-report=html

Testing Philosophy

This project uses strict TDD with 100% test coverage as a hard requirement. The approach:

  1. Write failing tests first — Define expected behavior before implementation
  2. One test at a time — Implement just enough to pass each test
  3. No untestable code — If we can't test it, we don't write it
  4. Integration tests mirror raw Redis — test_sql_queries.py verifies that SQL queries produce the same results as the equivalent FT.AGGREGATE commands in test_redis_queries.py

Coverage is enforced in CI. Pragmas (# pragma: no cover) are forbidden — if code can't be tested, it shouldn't exist.

Wheel compatibility matrix

Platform | Python 3
any | ✓

Files in release

Extras: None
Dependencies:
  • redis (>=5.0.0)
  • sqlglot (>=26.0.0)