xgrammar 0.1.26


pip install xgrammar

Released: Oct 20, 2025

Meta
Author: MLC Team
Requires Python: <4,>=3.8

Classifiers

License
  • OSI Approved :: Apache Software License

Development Status
  • 4 - Beta

Intended Audience
  • Developers
  • Education
  • Science/Research

Efficient, Flexible and Portable Structured Generation

News

  • [2025/09] XGrammar has been officially integrated into OpenVINO GenAI.
  • [2025/02] XGrammar has been officially integrated into Modular's MAX.
  • [2025/01] XGrammar has been officially integrated into TensorRT-LLM.
  • [2024/12] XGrammar has been officially integrated into vLLM.
  • [2024/12] We presented research talks on XGrammar at CMU, UC Berkeley, MIT, THU, SJTU, Ant Group, LMSys, Qingke AI, Camel AI. The slides can be found here.
  • [2024/11] XGrammar has been officially integrated into SGLang.
  • [2024/11] XGrammar has been officially integrated into MLC-LLM.
  • [2024/11] We officially released XGrammar v0.1.0!

Overview

XGrammar is an open-source library for efficient, flexible, and portable structured generation.

It leverages constrained decoding to ensure 100% structural correctness of the output. It supports general context-free grammars, which cover a broad range of structures, including JSON, regular expressions, and custom context-free grammars.
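
To illustrate the idea of constrained decoding, here is a minimal, self-contained sketch in plain Python. It does not use XGrammar and is not how XGrammar is implemented. The vocabulary, the toy list grammar, and the names `allowed_next`, `constrained_greedy`, and `biased_model` are all assumptions made up for this example. At each step, tokens the grammar forbids are masked to negative infinity before the next token is picked, so the output is structurally valid regardless of what the model prefers.

```python
import math

# Hypothetical five-token vocabulary, used only for this illustration.
VOCAB = ["[", "]", "1", ",", "<eos>"]


def allowed_next(tokens):
    """Tokens permitted after `tokens` by a toy grammar for nested lists of 1s.

    Grammar (assumed for this sketch):
        list ::= "[" (item ("," item)*)? "]"
        item ::= "1" | list
    """
    if not tokens:
        return {"["}                      # output must start with a list
    depth = tokens.count("[") - tokens.count("]")
    if depth == 0:
        return {"<eos>"}                  # top-level list is closed: only stop
    last = tokens[-1]
    if last == "[":
        return {"1", "[", "]"}            # first item, or an empty list
    if last == ",":
        return {"1", "["}                 # an item must follow a comma
    return {",", "]"}                     # after an item: continue or close


def constrained_greedy(logits_fn, max_steps=32):
    """Greedy decoding where grammar-forbidden tokens are masked out."""
    tokens = []
    for _ in range(max_steps):
        logits = logits_fn(tokens)
        allowed = allowed_next(tokens)
        masked = [
            score if tok in allowed else -math.inf
            for tok, score in zip(VOCAB, logits)
        ]
        best = VOCAB[max(range(len(VOCAB)), key=masked.__getitem__)]
        if best == "<eos>":
            break
        tokens.append(best)
    return "".join(tokens)


def biased_model(tokens):
    """Stand-in for an LLM: it would rather stop immediately than emit a list."""
    scores = {"[": 0.5, "]": 0.3 * len(tokens), "1": 2.0, ",": 1.0, "<eos>": 3.0}
    return [scores[tok] for tok in VOCAB]


print(constrained_greedy(biased_model))
```

Without the mask, `biased_model` would choose `<eos>` on the first step and produce an empty string. With the mask, every step can only pick a token the grammar allows.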

XGrammar uses careful optimizations to achieve extremely low overhead in structured generation. It has achieved near-zero overhead in JSON generation, making it one of the fastest structured generation engines available.

XGrammar features universal deployment. It supports:

  • Platforms: Linux, macOS, Windows
  • Hardware: CPU, NVIDIA GPU, AMD GPU, Apple Silicon, TPU, etc.
  • Languages: Python, C++, and JavaScript APIs
  • Models: Qwen, Llama, DeepSeek, Phi, Gemma, etc.

XGrammar is very easy to integrate with LLM inference engines. It is the default structured generation backend for most LLM inference engines, including vLLM, SGLang, TensorRT-LLM, and MLC-LLM, and it is also used by many companies. You can also try out the structured generation modes of these engines!

Get Started

Install XGrammar:

pip install xgrammar

Import XGrammar:

import xgrammar as xgr

Please visit our documentation to get started with XGrammar.

Collaborators

XGrammar has been widely adopted in industry, open-source projects, and academia. Our collaborators include:

WebLLM

Citation

If you find XGrammar useful in your research, please consider citing our paper:

@article{dong2024xgrammar,
  title={Xgrammar: Flexible and efficient structured generation engine for large language models},
  author={Dong, Yixin and Ruan, Charlie F and Cai, Yaxing and Lai, Ruihang and Xu, Ziyi and Zhao, Yilong and Chen, Tianqi},
  journal={Proceedings of Machine Learning and Systems 7},
  year={2024}
}

Wheel compatibility matrix

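Platform                 CPython 3.9  CPython 3.10  CPython 3.11  CPython 3.12  CPython 3.13
macosx_10_14_x86_64      yes          yes           yes           yes           no
macosx_11_0_arm64        yes          yes           yes           yes           yes
manylinux2014_aarch64    yes          yes           yes           yes           no
manylinux2014_x86_64     yes          yes           yes           yes           yes
manylinux_2_17_aarch64   yes          yes           yes           yes           no
manylinux_2_17_x86_64    yes          yes           yes           yes           yes
win_amd64                yes          yes           yes           yes           yes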

Files in release

xgrammar-0.1.26-cp310-cp310-macosx_10_14_x86_64.whl (648.7KiB)
xgrammar-0.1.26-cp310-cp310-macosx_11_0_arm64.whl (622.4KiB)
xgrammar-0.1.26-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (8.3MiB)
xgrammar-0.1.26-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.5MiB)
xgrammar-0.1.26-cp310-cp310-win_amd64.whl (692.6KiB)
xgrammar-0.1.26-cp311-cp311-macosx_10_14_x86_64.whl (648.5KiB)
xgrammar-0.1.26-cp311-cp311-macosx_11_0_arm64.whl (622.2KiB)
xgrammar-0.1.26-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (8.3MiB)
xgrammar-0.1.26-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.5MiB)
xgrammar-0.1.26-cp311-cp311-win_amd64.whl (692.5KiB)
xgrammar-0.1.26-cp312-cp312-macosx_10_14_x86_64.whl (647.8KiB)
xgrammar-0.1.26-cp312-cp312-macosx_11_0_arm64.whl (621.2KiB)
xgrammar-0.1.26-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (8.3MiB)
xgrammar-0.1.26-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.5MiB)
xgrammar-0.1.26-cp312-cp312-win_amd64.whl (692.2KiB)
xgrammar-0.1.26-cp313-cp313-macosx_11_0_arm64.whl (621.2KiB)
xgrammar-0.1.26-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.5MiB)
xgrammar-0.1.26-cp313-cp313-win_amd64.whl (692.1KiB)
xgrammar-0.1.26-cp39-cp39-macosx_10_14_x86_64.whl (648.9KiB)
xgrammar-0.1.26-cp39-cp39-macosx_11_0_arm64.whl (622.6KiB)
xgrammar-0.1.26-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (8.3MiB)
xgrammar-0.1.26-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.5MiB)
xgrammar-0.1.26-cp39-cp39-win_amd64.whl (692.8KiB)
xgrammar-0.1.26.tar.gz (2.2MiB)
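
The wheel file names above follow the standard wheel naming convention, `{name}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl`, where a dot inside a tag field separates several compatible tags. The sketch below splits a file name into those parts. `WheelTags` and `parse_wheel_name` are hypothetical helpers written for this example; for real tooling, the `packaging` library provides `packaging.utils.parse_wheel_filename`.

```python
from typing import List, NamedTuple


class WheelTags(NamedTuple):
    """Parts of a wheel file name (illustrative helper, not a library type)."""
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: List[str]


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform = parts
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, _build, python_tag, abi_tag, platform = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    # A compressed tag set such as "manylinux_2_17_x86_64.manylinux2014_x86_64"
    # lists every platform tag the wheel is compatible with.
    return WheelTags(name, version, python_tag, abi_tag, platform.split("."))


tags = parse_wheel_name(
    "xgrammar-0.1.26-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(tags.python_tag, tags.platform_tags)
```

This is why each Linux wheel appears under both a `manylinux_2_17_*` and a `manylinux2014_*` row in the compatibility matrix: one file carries both platform tags.
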
Extras:
Dependencies:
pydantic
torch (>=1.10.0)
transformers (>=4.38.0)
triton and
mlx-lm and
ninja
numpy
typing-extensions (>=4.9.0)