indexed-zstd 1.7.1


pip install indexed-zstd

Released: Mar 18, 2026

Meta
Author: Marco Martinelli with the help of Maximilian Knespel

Classifiers

License
  • OSI Approved :: MIT License

Development Status
  • 5 - Production/Stable

Intended Audience
  • Developers

Operating System
  • MacOS
  • POSIX
  • Unix
  • Microsoft :: Windows

Programming Language
  • Python :: 3
  • Python :: 3.10
  • Python :: 3.11
  • Python :: 3.12
  • Python :: 3.13
  • Python :: 3.14
  • C++
  • C

Topic
  • Software Development :: Libraries
  • Software Development :: Libraries :: Python Modules
  • System :: Archiving

indexed_zstd

A Python module for fast random access to zstd-compressed files without full decompression.

IndexedZstdFile implements Python's io.BufferedReader interface, so it can stand in for open(path, "rb") on .zst files: it supports seek(), read(), readline(), tell(), and context managers, and it returns bytes.

Under the hood it uses libzstd-seek to build a jump table of frame boundaries, so a seek in a multi-frame archive can jump straight to the frame that contains the target position without decompressing the frames before it.

This project is based on indexed_bzip2, adapted to target zstd.

How it works

Zstd files can contain multiple independently compressed frames. indexed_zstd scans frame boundaries on first access and builds an in-memory jump table that maps uncompressed offsets to compressed positions. When you seek(), only the relevant frame is decompressed.
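
To make the jump-table idea concrete, here is a minimal sketch of the lookup step. It is not the library's implementation: the table layout, the offsets in JUMP_TABLE, and the locate() helper are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical jump table: one (uncompressed_start, compressed_start) pair
# per frame, sorted by uncompressed_start. The values are made up.
JUMP_TABLE = [
    (0, 0),
    (1_000_000, 312_450),
    (2_000_000, 640_118),
    (3_000_000, 955_702),
]


def locate(jump_table, target):
    """Return (compressed_start, skip) for an uncompressed offset.

    compressed_start is where the containing frame begins in the file;
    skip is how many decompressed bytes to discard before reaching target.
    """
    if target < 0:
        raise ValueError("offset must be non-negative")
    starts = [uncompressed for uncompressed, _ in jump_table]
    # Index of the last frame whose start is <= target.
    index = bisect_right(starts, target) - 1
    uncompressed_start, compressed_start = jump_table[index]
    return compressed_start, target - uncompressed_start


print(locate(JUMP_TABLE, 2_500_000))  # (640118, 500000)
```

In this sketch a seek to offset 2,500,000 starts decompressing at compressed position 640,118 and discards the first 500,000 bytes of that frame.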

Seeking within a frame is emulated by decompressing from the frame start, so the more frames your archive has, the faster random access will be.
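
The effect of frame count on seek cost can be estimated with a little arithmetic. The sketch below assumes equally sized frames, which real archives need not have; the helper names are hypothetical.

```python
MIB = 1024 ** 2


def bytes_to_discard(target, frame_size):
    """Decompressed bytes thrown away to reach target, assuming equal frames."""
    return target % frame_size


def average_discard(frame_size):
    """Mean discard over all offsets inside one frame: about half a frame."""
    return (frame_size - 1) / 2


target = 901 * MIB  # seek to uncompressed offset 901 MiB

# One single 1 GiB frame: everything before the target is decompressed.
print(bytes_to_discard(target, 1024 * MIB))  # 944766976

# 4 MiB frames: only the start of the containing frame is decompressed.
print(bytes_to_discard(target, 4 * MIB))  # 1048576
```

Smaller frames reduce the discarded work per seek, at the price of a larger jump table and usually a slightly worse compression ratio.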

To create multi-frame archives, use t2sz or any other tool that writes several independent frames into one file.
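
Since zstd frames are independent and may simply be concatenated, a multi-frame archive can also be produced by compressing fixed-size chunks one at a time. The sketch below shows that loop; write_multiframe() and its compress_frame parameter are hypothetical names, and the compressor is passed in so that the sketch does not depend on a particular zstd binding.

```python
from typing import BinaryIO, Callable


def write_multiframe(src: BinaryIO, dst: BinaryIO,
                     compress_frame: Callable[[bytes], bytes],
                     frame_size: int = 4 * 1024 * 1024) -> int:
    """Compress src into dst as a sequence of independent frames.

    compress_frame must turn one chunk of input into one complete zstd frame.
    Returns the number of frames written.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    frames = 0
    while True:
        chunk = src.read(frame_size)
        if not chunk:
            break  # end of input
        dst.write(compress_frame(chunk))
        frames += 1
    return frames
```

With the third-party zstandard package, zstandard.ZstdCompressor().compress is one suitable compress_frame, because each call returns a complete frame. Open both files in binary mode.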

Installation

pip (recommended)

Pre-built wheels are available for Linux, macOS, and Windows:

pip install indexed-zstd

If no wheel is available for your platform, pip will build from source automatically. In that case you need zstd development headers and a C++17 compiler:

# Debian/Ubuntu
sudo apt install libzstd-dev

# macOS
brew install zstd

conda

conda install -c conda-forge indexed_zstd

Arch Linux (AUR)

yay -S python-indexed-zstd

Usage

Basic random access

from indexed_zstd import IndexedZstdFile

with IndexedZstdFile("example.zst") as f:
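    f.seek(1024)          # jump to uncompressed offset 1024
    data = f.read(256)    # read up to 256 bytes from there
    print(f.tell())       # 1280, provided the file holds at least 1280 bytes
    print(f.seekable())   # True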
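
Because the class follows the io.BufferedReader interface, code written against ordinary binary files can accept compressed and uncompressed input alike. The following is a minimal sketch of that pattern; open_binary() and read_slice() are hypothetical helpers, and selecting by the .zst suffix is an assumption made for the example.

```python
from pathlib import Path


def open_binary(path):
    """Open path for binary reading, using IndexedZstdFile for .zst files."""
    path = Path(path)
    if path.suffix == ".zst":
        # Imported lazily so plain files work without the package installed.
        from indexed_zstd import IndexedZstdFile
        return IndexedZstdFile(str(path))
    return open(path, "rb")


def read_slice(path, offset, length):
    """Return up to length bytes starting at uncompressed offset."""
    with open_binary(path) as f:
        f.seek(offset)
        return f.read(length)
```

A call such as read_slice("example.zst", 1024, 256) then behaves like the snippet above, and the same call works on an uncompressed copy of the file.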

Reading line by line

from indexed_zstd import IndexedZstdFile

with IndexedZstdFile("logfile.zst") as f:
    for line in f:
        if b"ERROR" in line:
            print(line.decode())
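
Iterating the file yields bytes. If decoded text is more convenient, a binary reader can usually be wrapped in io.TextIOWrapper from the standard library. The sketch below assumes IndexedZstdFile can be wrapped like any other io.BufferedReader; iter_text_lines() is a hypothetical helper.

```python
import io


def iter_text_lines(binary_file, encoding="utf-8"):
    """Yield decoded lines, without the trailing newline, from a binary file.

    Note: the wrapper takes ownership of binary_file and closes it when the
    wrapper itself is closed or garbage-collected.
    """
    text = io.TextIOWrapper(binary_file, encoding=encoding, errors="replace")
    for line in text:
        yield line.rstrip("\n")
```

For example, `for line in iter_text_lines(IndexedZstdFile("logfile.zst")):` yields str objects, so the test becomes `"ERROR" in line` and no decode() call is needed.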

Opening by file descriptor

import os
from indexed_zstd import IndexedZstdFile

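# O_BINARY exists only on Windows, where it prevents newline translation.
fd = os.open("example.zst", os.O_RDONLY | getattr(os, "O_BINARY", 0))
with IndexedZstdFile(fd) as f:
    data = f.read()   # decompress the whole file into memory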

Inspecting frame structure

from indexed_zstd import IndexedZstdFile

with IndexedZstdFile("example.zst") as f:
    print(f.size())              # uncompressed size in bytes
    print(f.number_of_frames())  # number of zstd frames
    print(f.is_multiframe())     # True if more than one frame
    print(f.block_offsets())     # {compressed_offset: uncompressed_offset, ...}
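
The offsets dict can be turned into per-frame sizes, which helps when judging how well an archive is laid out for random access. This is a sketch under assumptions: frame_lengths() is a hypothetical helper, the sample dict is made up, and it is not known here whether the real table ends with an entry at the total size, so the code accepts both layouts.

```python
def frame_lengths(block_offsets, total_size):
    """Uncompressed length of each frame.

    block_offsets: dict of {compressed_offset: uncompressed_offset}
    total_size:    uncompressed size of the whole file, as from size()
    """
    starts = sorted(set(block_offsets.values()))
    # Drop a possible end-of-file entry so both table layouts are handled.
    starts = [start for start in starts if start < total_size]
    ends = starts[1:] + [total_size]
    return [end - start for start, end in zip(starts, ends)]


sample = {0: 0, 310: 1000, 655: 2000, 900: 2600}  # made-up offsets
print(frame_lengths(sample, 2600))  # [1000, 1000, 600]
```

With a real file the call would be frame_lengths(f.block_offsets(), f.size()). A few very large entries in the result point to frames where seeking will be slow.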

API reference

IndexedZstdFile inherits from io.BufferedReader and adds:

Method                       Description
size()                       Uncompressed file size in bytes
number_of_frames()           Total number of zstd frames
is_multiframe()              True if the file contains more than one frame
block_offsets()              dict mapping compressed offsets to uncompressed offsets
available_block_offsets()    Same as block_offsets(), but returns only the offsets discovered so far
set_block_offsets(offsets)   Manually set the jump table from a dict
block_offsets_complete()     True if the jump table has been fully built
tell_compressed()            Current position in the compressed stream

All standard io.BufferedReader methods are available: read(), readline(), readlines(), seek(), tell(), seekable(), readable(), fileno(), close(), etc.
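
One use for block_offsets() together with set_block_offsets() is to store the jump table next to the archive, so a later run can skip the scan. The sketch below assumes that set_block_offsets() accepts the same dict shape that block_offsets() returns; save_offsets(), load_offsets(), and the JSON layout are hypothetical.

```python
import json


def save_offsets(zstd_file, index_path):
    """Write the file's jump table to index_path as JSON."""
    offsets = zstd_file.block_offsets()
    with open(index_path, "w", encoding="utf-8") as out:
        # JSON object keys must be strings, so store [key, value] pairs.
        json.dump(sorted(offsets.items()), out)


def load_offsets(zstd_file, index_path):
    """Restore a jump table previously written by save_offsets()."""
    with open(index_path, encoding="utf-8") as src:
        pairs = json.load(src)
    zstd_file.set_block_offsets({int(c): int(u) for c, u in pairs})
```

A stored table is only valid for the exact archive it was built from, so it is worth keeping the archive's size or modification time alongside it and rebuilding the table when they differ.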

Testing

The test suite requires gen_seekable (built from the bundled submodule) and covers API, error paths, round-trip, reference, and heavy-data scenarios.

# Build gen_seekable from the submodule
cmake -S indexed_zstd/libzstd-seek -B indexed_zstd/libzstd-seek/build -DBUILD_TESTS=ON
cmake --build indexed_zstd/libzstd-seek/build --target gen_seekable

# Add it to PATH
export PATH="$PWD/indexed_zstd/libzstd-seek/build/tests:$PATH"

# Run the standard test suite (111 tests)
python -m pytest tests/ -v -m "not heavy and not reference"

Additional test categories (optional):

Marker      Requirements       Description
reference   zstd CLI in PATH   Compares library output against zstd -d
heavy       t2sz in PATH       Large realistic data tests (10-50 MB)

# Run everything including reference and heavy tests
python -m pytest tests/ -v
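
When it is unclear which optional tools are installed, the marker expression can be derived from what is on PATH. This is a small sketch, not part of the project: marker_expression() and MARKER_TOOLS are hypothetical, and the executable names zstd and t2sz are taken from the requirements listed above.

```python
import shutil

# Executable that each optional marker needs.
MARKER_TOOLS = {"reference": "zstd", "heavy": "t2sz"}


def marker_expression(which=shutil.which):
    """Build a pytest -m expression that deselects markers whose tool is missing.

    Returns an empty string when every tool is available.
    """
    missing = [marker for marker, tool in MARKER_TOOLS.items()
               if which(tool) is None]
    return " and ".join(f"not {marker}" for marker in sorted(missing))
```

If the result is non-empty, pass it to pytest with -m. An empty result means the full suite can run without a marker filter.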

Building from source

Requires a C++17 compiler, Cython, and platform-specific zstd libraries.

# Clone with submodules (includes libzstd-seek)
git clone --recurse-submodules https://github.com/martinellimarco/indexed_zstd.git
cd indexed_zstd
pip install cython setuptools

Linux

sudo apt install libzstd-dev    # Debian/Ubuntu
pip install .

macOS

brew install zstd
pip install .

Windows

Requires Visual Studio Build Tools with the C++ workload.

python libzstd/_get_zstd.py    # downloads zstd headers and DLL
pip install .

License

MIT

Wheel compatibility matrix

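Columns are CPython versions; 3.14t is the free-threaded build. Each cell reflects the wheels listed under Files in release. The two manylinux tags are carried by the same wheel.

Platform                3.8   3.9   3.10  3.11  3.12  3.13  3.14  3.14t
macosx_15_0_arm64       yes   yes   yes   yes   yes   yes   yes   yes
manylinux_2_24_x86_64   yes   yes   yes   yes   yes   yes   yes   yes
manylinux_2_28_x86_64   yes   yes   yes   yes   yes   yes   yes   yes
musllinux_1_2_x86_64    yes   yes   yes   yes   yes   yes   yes   yes
win_amd64               yes   yes   yes   yes   yes   yes   yes   yes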

Files in release

indexed_zstd-1.7.1-cp310-cp310-macosx_15_0_arm64.whl (300.4KiB)
indexed_zstd-1.7.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (651.6KiB)
indexed_zstd-1.7.1-cp310-cp310-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp310-cp310-win_amd64.whl (641.0KiB)
indexed_zstd-1.7.1-cp311-cp311-macosx_15_0_arm64.whl (300.5KiB)
indexed_zstd-1.7.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (668.7KiB)
indexed_zstd-1.7.1-cp311-cp311-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp311-cp311-win_amd64.whl (641.1KiB)
indexed_zstd-1.7.1-cp312-cp312-macosx_15_0_arm64.whl (300.5KiB)
indexed_zstd-1.7.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (674.5KiB)
indexed_zstd-1.7.1-cp312-cp312-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp312-cp312-win_amd64.whl (641.5KiB)
indexed_zstd-1.7.1-cp313-cp313-macosx_15_0_arm64.whl (299.8KiB)
indexed_zstd-1.7.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (666.5KiB)
indexed_zstd-1.7.1-cp313-cp313-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp313-cp313-win_amd64.whl (641.1KiB)
indexed_zstd-1.7.1-cp314-cp314-macosx_15_0_arm64.whl (300.4KiB)
indexed_zstd-1.7.1-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (663.8KiB)
indexed_zstd-1.7.1-cp314-cp314-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp314-cp314-win_amd64.whl (659.7KiB)
indexed_zstd-1.7.1-cp314-cp314t-macosx_15_0_arm64.whl (303.1KiB)
indexed_zstd-1.7.1-cp314-cp314t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (668.3KiB)
indexed_zstd-1.7.1-cp314-cp314t-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp314-cp314t-win_amd64.whl (666.8KiB)
indexed_zstd-1.7.1-cp38-cp38-macosx_15_0_arm64.whl (304.5KiB)
indexed_zstd-1.7.1-cp38-cp38-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (655.1KiB)
indexed_zstd-1.7.1-cp38-cp38-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp38-cp38-win_amd64.whl (658.3KiB)
indexed_zstd-1.7.1-cp39-cp39-macosx_15_0_arm64.whl (300.9KiB)
indexed_zstd-1.7.1-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (650.8KiB)
indexed_zstd-1.7.1-cp39-cp39-musllinux_1_2_x86_64.whl (1.7MiB)
indexed_zstd-1.7.1-cp39-cp39-win_amd64.whl (641.7KiB)
indexed_zstd-1.7.1.tar.gz (148.8KiB)
No dependencies