mlserver 1.7.1


Released: Jun 06, 2025

Meta
Author: Seldon Technologies Ltd.
Requires Python: >=3.9,<3.13

Classifiers

License
  • OSI Approved :: Apache Software License

Operating System
  • MacOS
  • POSIX

Programming Language
  • Python :: 3
  • Python :: 3.9
  • Python :: 3.10
  • Python :: 3.11
  • Python :: 3.12

MLServer

An open source inference server for your machine learning models.

Overview

MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec. Watch a quick video introducing the project here.

  • Multi-model serving, letting users run multiple models within the same process.
  • Ability to run inference in parallel for vertical scaling across multiple models through a pool of inference workers.
  • Support for adaptive batching, to group inference requests together on the fly.
  • Scalability with deployment in Kubernetes native frameworks, including Seldon Core and KServe (formerly known as KFServing), where MLServer is the core Python inference server used to serve machine learning models.
  • Support for the standard V2 Inference Protocol on both the gRPC and REST flavours, which has been standardised and adopted by various model serving frameworks.

You can read more about the goals of this project on the initial design document.
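
For example, once MLServer is running and a model is loaded, sending a prediction over the REST flavour of the V2 protocol looks like the sketch below (assuming a model named "my-model" served on MLServer's default HTTP port, 8080; the input name and shape are illustrative):

    import requests

    # V2 Inference Protocol payload; "data" holds the flattened
    # contents of a tensor of shape [1, 2]
    inference_request = {
        "inputs": [
            {
                "name": "input-0",
                "shape": [1, 2],
                "datatype": "FP32",
                "data": [1.0, 2.0],
            }
        ]
    }

    response = requests.post(
        "http://localhost:8080/v2/models/my-model/infer",
        json=inference_request,
    )
    print(response.json())

The same request can also be sent over the gRPC flavour (default port 8081) through the protocol's ModelInfer RPC.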

Usage

You can install the mlserver package by running:

pip install mlserver

Note that to use any of the optional inference runtimes, you'll need to install the relevant package. For example, to serve a scikit-learn model, you would need to install the mlserver-sklearn package:

pip install mlserver-sklearn

For further information on how to use MLServer, you can check any of the available examples.
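
As a sketch of what this looks like end to end, MLServer discovers models through a model-settings.json file placed alongside the model artefact. The example below assumes a scikit-learn model serialised to ./model.joblib; the two max_batch_* fields are the optional settings that enable adaptive batching:

    {
        "name": "my-sklearn-model",
        "implementation": "mlserver_sklearn.SKLearnModel",
        "parameters": {
            "uri": "./model.joblib"
        },
        "max_batch_size": 8,
        "max_batch_time": 0.1
    }

You can then serve the model from that folder by running:

mlserver start .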

Inference Runtimes

Inference runtimes allow you to define how your model should be used within MLServer. You can think of them as the backend glue between MLServer and your machine learning framework of choice. You can read more about inference runtimes on their documentation page.

Out of the box, MLServer comes with a set of pre-packaged runtimes which let you interact with a subset of common frameworks. This allows you to start serving models saved in these frameworks straight away. However, it's also possible to write custom runtimes.
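
As a rough sketch (the class name here is illustrative), a custom runtime extends the mlserver.MLModel base class and implements its load() and predict() hooks:

    from mlserver import MLModel
    from mlserver.types import InferenceRequest, InferenceResponse, ResponseOutput

    class EchoRuntime(MLModel):
        async def load(self) -> bool:
            # Load your model artefact here, e.g. from
            # self.settings.parameters.uri
            self._model = lambda x: x  # placeholder model
            return True

        async def predict(self, payload: InferenceRequest) -> InferenceResponse:
            # Echo the first input back as the output (illustrative only)
            first_input = payload.inputs[0]
            return InferenceResponse(
                model_name=self.name,
                outputs=[
                    ResponseOutput(
                        name="output-0",
                        shape=first_input.shape,
                        datatype=first_input.datatype,
                        data=first_input.data,
                    )
                ],
            )

Pointing the implementation field of model-settings.json at such a class is then enough for MLServer to load it.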

The pre-packaged runtimes provide support for:

Framework      Documentation
Scikit-Learn   MLServer SKLearn
XGBoost        MLServer XGBoost
Spark MLlib    MLServer MLlib
LightGBM       MLServer LightGBM
CatBoost       MLServer CatBoost
Tempo          github.com/SeldonIO/tempo
MLflow         MLServer MLflow
Alibi-Detect   MLServer Alibi Detect
Alibi-Explain  MLServer Alibi Explain
HuggingFace    MLServer HuggingFace

MLServer is licensed under the Apache License, Version 2.0. However, please note that software used in conjunction with, or alongside, MLServer may be licensed under different terms. For example, Alibi Detect and Alibi Explain are both licensed under the Business Source License 1.1. For more information about the legal terms of products used in conjunction with or alongside MLServer, please refer to their respective documentation.

Supported Python Versions

🔴 Unsupported

🟠 Deprecated: To be removed in a future version

🟢 Supported

🔵 Untested

Python Version  Status
3.7             🔴
3.8             🔴
3.9             🟢
3.10            🟢
3.11            🟢
3.12            🟢
3.13            🔴

Examples

To see MLServer in action, check out our full list of examples, which showcase how you can leverage MLServer to start serving your machine learning models.

Developer Guide

Versioning

Both the main mlserver package and the inference runtime packages try to follow the same versioning schema. To bump the version across all of them, you can use the ./hack/update-version.sh script.

Between releases, we generally keep the version set to a pre-release placeholder (e.g. a .dev suffix) for the upcoming version.

For example:

./hack/update-version.sh 0.2.0.dev1

Testing

To run all of the tests for MLServer and the runtimes, use:

make test

To run tests for a single file, use something like:

tox -e py3 -- tests/batch_processing/test_rest.py

Wheel compatibility matrix

The release ships a single pure-Python wheel, compatible with any platform on Python 3.

Files in release

Extras: None
Dependencies:
aiofiles
aiokafka
click
fastapi (!=0.89.0,<0.116.0,>=0.88.0)
gevent
geventhttpclient
grpcio (>=1.67.1)
importlib-resources (<7.0,>=5.12)
numpy
opentelemetry-exporter-otlp-proto-grpc (<2.0.0,>=1.22.0)
opentelemetry-instrumentation-fastapi (>=0.43b0)
opentelemetry-instrumentation-grpc (>=0.43b0)
opentelemetry-sdk (<2.0.0,>=1.22.0)
orjson
pandas
protobuf (<7.0.0,>=5.27.2)
py-grpc-prometheus
pydantic (<3.0.0,>=2.7.1)
pydantic-settings (<3.0.0,>=2.3.0)
python-dotenv
python-multipart
starlette-exporter
tritonclient[http] (>=2.42)
uvicorn
uvloop