A spaCy package for the Rust tokenizations library
Meta
Author: Explosion
Requires Python: <3.14,>=3.9
Classifiers
Development Status
- 4 - Beta
Environment
- Console
Intended Audience
- Developers
- Science/Research
Topic
- Scientific/Engineering
- Scientific/Engineering :: Artificial Intelligence
License
- OSI Approved :: MIT License
Operating System
- POSIX :: Linux
- MacOS :: MacOS X
- Microsoft :: Windows
Programming Language
- Rust
- Python
- Python :: 3
- Python :: 3.9
- Python :: 3.10
- Python :: 3.11
- Python :: 3.12
spacy-alignments: Align tokenizations for spaCy + transformers
A spaCy package for Yohei Tamura's Rust tokenizations library with Python bindings.
Installation
pip install -U pip setuptools wheel
pip install spacy-alignments
If no binary wheel is available for your platform, you will need to install Rust in order to build spacy-alignments from source.
spacy-alignments vs. pytokenizations
The spacy_alignments module is a drop-in replacement for tokenizations:
import spacy_alignments as tokenizations
a2b, b2a = tokenizations.get_alignments(["å", "BC"], ["abc"])
assert a2b == [[0], [0]]
assert b2a == [[0, 1]]
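A common use of these alignment lists is carrying per-token annotations from one tokenization to the other. The following is a minimal sketch and not part of the package: the helper name project_labels, the label strings, and the "take the first aligned token" rule are assumptions made for illustration. It hard-codes the a2b and b2a values from the example above so that it runs without the package installed.

```python
from typing import List, Sequence


def project_labels(
    labels_a: Sequence[str],
    b2a: List[List[int]],
    default: str = "O",
) -> List[str]:
    """Give each token in B the label of its first aligned token in A.

    Hypothetical helper: tokens in B with no aligned token in A
    receive `default`.
    """
    projected = []
    for a_indices in b2a:
        if a_indices:
            projected.append(labels_a[a_indices[0]])
        else:
            projected.append(default)
    return projected


# Alignment lists as returned in the example above for
# ["å", "BC"] versus ["abc"].
a2b = [[0], [0]]
b2a = [[0, 1]]

# Illustrative labels for the two tokens of the first tokenization.
labels_a = ["B-X", "I-X"]
labels_b = project_labels(labels_a, b2a)
print(labels_b)  # ['B-X']
```

Taking the first aligned token is only one possible policy. Depending on the task, a majority vote or a merge over all aligned tokens may be more appropriate.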
The only difference between this package and the original pytokenizations is that it switches the build system to setuptools-rust to make it easier for us at Explosion to build source and binary packages for a wider range of platforms.
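Because the two packages expose the same module interface, code can in principle use whichever one is installed. The sketch below shows one way to do that, assuming the original package is importable as tokenizations. The helper name find_alignment_module and the lookup order are illustrative choices, not something either package provides.

```python
import importlib
import importlib.util
from typing import Optional

# Module names to try, in order of preference (order is an assumption).
CANDIDATES = ("spacy_alignments", "tokenizations")


def find_alignment_module() -> Optional[str]:
    """Return the name of the first installed candidate module, or None."""
    for name in CANDIDATES:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


module_name = find_alignment_module()
if module_name is None:
    print("Neither spacy_alignments nor tokenizations is installed")
else:
    tokenizations = importlib.import_module(module_name)
    a2b, b2a = tokenizations.get_alignments(["å", "BC"], ["abc"])
    print(module_name, a2b, b2a)
```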
Bug reports and other issues
Please use spaCy's issue tracker to report a bug, or open a new thread on the discussion board for any other issue.
Release history
- 0.9.2 (Jun 03, 2025)
- 0.9.1 (Sep 25, 2023)
- 0.9.0 (Dec 19, 2022)
- 0.8.6 (Oct 17, 2022)
- 0.8.5 (Apr 06, 2022)
- 0.8.4 (Nov 08, 2021)
- 0.8.3 (Apr 09, 2021)
- 0.7.2 (Dec 08, 2020)
No dependencies