fastsafetensors
fastsafetensors is an efficient safetensors loader. If you write your own code to load large safetensors files, you can use the fastsafetensors APIs (see docs). For example, vLLM and SGLang accept the --load-format fastsafetensors command-line argument to speed up their initialization.
This library supports Linux/CUDA, ROCm without GDS, Windows, 3FS, and unified-memory systems such as DGX Spark, among others. We welcome further platform- or storage-specific optimizations, which can be contributed as new copier backends. Our CI tests Python 3.10-3.14 with PyTorch 2.11.0.
Performance Highlights
Performance highlights from the CLOUD 2025 paper and benchmark docs:
- Standalone model loading was 4.8x-7.5x faster than the default safetensors deserializer on Llama, Falcon, and Bloom models, and reached 26.4 GB/s NVMe read throughput for Llama-70B on four GPUs with GDS.
- In the paper's vLLM integration experiment, startup time dropped from 12.39s to 4.74s for Llama-2-13B on 4x L40S GPUs, and from 16.04s to 6.88s on 1x A100.
- On AMD ROCm without GDS, the documented nogds path reached 6.02 GB/s for GPT-2 Medium versus 1.28 GB/s with mmap (4.7x throughput), and 2.62 GB/s for GPT-2 versus 1.01 GB/s with mmap (2.6x throughput). See the report for more details.
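The ratios behind these highlights can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the figures quoted above, and the helper names (`speedup_from_times`, `speedup_from_throughput`) are made up for this example and are not part of fastsafetensors.

```python
# Figures copied from the highlights above; nothing here is measured by this script.
# Startup times in seconds: (baseline, with fastsafetensors).
startup_seconds = {
    "vLLM, Llama-2-13B, 4x L40S": (12.39, 4.74),
    "vLLM, 1x A100": (16.04, 6.88),
}
# Read throughput in GB/s on ROCm without GDS: (mmap, nogds path).
throughput_gb_per_s = {
    "GPT-2 Medium": (1.28, 6.02),
    "GPT-2": (1.01, 2.62),
}


def speedup_from_times(baseline_s: float, new_s: float) -> float:
    """Speedup when lower is better (elapsed time)."""
    return baseline_s / new_s


def speedup_from_throughput(baseline: float, new: float) -> float:
    """Speedup when higher is better (throughput)."""
    return new / baseline


if __name__ == "__main__":
    for label, (before, after) in startup_seconds.items():
        print(f"{label}: {speedup_from_times(before, after):.2f}x faster startup")
    for label, (mmap_rate, nogds_rate) in throughput_gb_per_s.items():
        ratio = speedup_from_throughput(mmap_rate, nogds_rate)
        print(f"{label}: {ratio:.1f}x throughput")
```

The startup figures work out to roughly 2.6x and 2.3x, and the throughput ratios round to the 4.7x and 2.6x quoted above.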
Quick Start
pip install fastsafetensors
pip install vllm # for quick demo
vllm serve Qwen/Qwen3-0.6B --load-format fastsafetensors
...
Loading safetensors using Fastsafetensor loader: 0% Completed | 0/1 [00:00<?, ?it/s]
Loading safetensors using Fastsafetensor loader: 100% Completed | 1/1 [00:00<00:00, 1.23it/s]
Design Details
See Overview for features, basic API usage, and configuration.
Code of Conduct
Please refer to the Foundation Model Stack Community Code of Conduct.
Development
See Development.
Publication
Takeshi Yoshimura, Tatsuhiro Chiba, Manish Sethi, Daniel Waddington, Swaminathan Sundararaman. (2025). Speeding up Model Loading with fastsafetensors. arXiv:2505.23072 and IEEE CLOUD 2025.
Wheel compatibility matrix
| Platform | CPython 3.10 | CPython 3.11 | CPython 3.12 | CPython 3.13 | CPython 3.14 |
|---|---|---|---|---|---|
| manylinux_2_27_x86_64 | |||||
| manylinux_2_28_x86_64 |