pyspark 4.1.2


pip install pyspark

Released: May 21, 2026

Meta
Author: Spark Developers
Requires Python: >=3.10

Classifiers

Development Status
  • 5 - Production/Stable

Programming Language
  • Python :: 3.10
  • Python :: 3.11
  • Python :: 3.12
  • Python :: 3.13
  • Python :: 3.14
  • Python :: Implementation :: CPython
  • Python :: Implementation :: PyPy

Typing
  • Typed

Apache Spark

Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.

https://spark.apache.org/

Online Documentation

You can find the latest Spark documentation, including a programming guide, on the project web page.

Python Packaging

This README file only contains basic information related to pip-installed PySpark. This packaging is currently experimental and may change in future versions (although we will do our best to keep compatibility). Using PySpark requires the Spark JARs; if you are building this from source, please see the build instructions at "Building Spark".

The Python packaging for Spark is not intended to replace all of the other use cases. This Python-packaged version of Spark is suitable for interacting with an existing cluster (be it Spark standalone or YARN), but it does not contain the tools required to set up your own standalone Spark cluster. You can download the full version of Spark from the Apache Spark downloads page.
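As a minimal sketch of pointing a pip-installed PySpark at an existing standalone cluster: the helper names `standalone_master_url` and `connect`, the application name, and the host used in the usage note are hypothetical and not part of PySpark. The only PySpark calls used are `SparkSession.builder.master(...)`, `.appName(...)`, and `.getOrCreate()`; port 7077 is assumed as the standalone master port.

```python
def standalone_master_url(host: str, port: int = 7077) -> str:
    """Build a spark:// master URL for a standalone cluster.

    The default port of 7077 is an assumption; use whatever port
    your cluster's master actually listens on.
    """
    return f"spark://{host}:{port}"


def connect(master_url: str, app_name: str = "pip-pyspark-example"):
    """Create (or reuse) a SparkSession attached to an existing cluster."""
    # Imported lazily so this module still loads where PySpark is absent.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(master_url)
        .appName(app_name)
        .getOrCreate()
    )
```

With a reachable master, something like `connect(standalone_master_url("spark-master.example.com"))` would return a session; the host name here is a placeholder.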

NOTE: If you are using this with a Spark standalone cluster, you must ensure that the version (including the minor version) matches, or you may experience odd errors.
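One way to guard against a mismatch is to compare the major and minor components of the two version strings before submitting work. The sketch below is illustrative only: `major_minor` and `versions_match` are hypothetical helpers, and it assumes version strings that start with numeric `major.minor` components.

```python
def major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '4.1.2'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def versions_match(client_version: str, cluster_version: str) -> bool:
    """True when client and cluster agree on major and minor version."""
    return major_minor(client_version) == major_minor(cluster_version)
```

In practice the client side can be read from `pyspark.__version__` and the session's Spark version from `spark.version`; how you obtain the cluster's version depends on your deployment.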

Python Requirements

At its core, PySpark depends on Py4J, but some additional sub-packages have their own extra requirements for some features (including numpy, pandas, and pyarrow). See also Dependencies for production, and dev/requirements.txt for development.
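To see which of these packages are importable in the current environment, a small check with the standard library's `importlib.util.find_spec` can help. This is a sketch, not something PySpark provides: `available_packages` and `OPTIONAL_PACKAGES` are hypothetical names, and the check only reports whether a package can be found, not whether its version is sufficient.

```python
from importlib.util import find_spec

# Optional packages named in the requirements paragraph above.
OPTIONAL_PACKAGES = ("numpy", "pandas", "pyarrow")


def available_packages(names=("py4j",) + OPTIONAL_PACKAGES) -> dict[str, bool]:
    """Map each top-level package name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in names}
```

Calling `available_packages()` returns a dictionary keyed by package name, which can be printed or logged before enabling features that need the optional packages.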

Release history

4.2.0.dev5 May 02, 2026
4.2.0.dev4 Apr 10, 2026
4.2.0.dev3 Mar 12, 2026
4.2.0.dev2 Feb 08, 2026
4.2.0.dev1 Jan 12, 2026
4.1.2 May 21, 2026
4.1.1 Jan 09, 2026
4.1.0 Dec 16, 2025
4.1.0.dev4 Nov 20, 2025
4.1.0.dev3 Oct 30, 2025
4.1.0.dev2 Sep 28, 2025
4.1.0.dev1 Jul 14, 2025
4.0.2 Feb 05, 2026
4.0.1 Sep 06, 2025
4.0.0 May 23, 2025
4.0.0.dev2 Sep 27, 2024
4.0.0.dev1 Jun 03, 2024
3.5.8 Jan 15, 2026
3.5.7 Sep 23, 2025
3.5.6 May 27, 2025
3.5.5 Feb 27, 2025
3.5.4 Dec 20, 2024
3.5.3 Sep 24, 2024
3.5.2 Aug 12, 2024
3.5.1 Feb 26, 2024
3.5.0 Sep 26, 2023
3.4.4 Oct 25, 2024
3.4.3 Apr 18, 2024
3.4.2 Nov 30, 2023
3.4.1 Jun 23, 2023
3.4.0 Apr 13, 2023
3.3.4 Dec 16, 2023
3.3.3 Aug 21, 2023
3.3.2 Feb 15, 2023
3.3.1 Oct 25, 2022
3.3.0 Jun 15, 2022
3.2.4 Apr 13, 2023
3.2.3 Nov 28, 2022
3.2.2 Jul 15, 2022
3.2.1 Jan 26, 2022
3.2.0 Oct 18, 2021
3.1.3 Feb 18, 2022
3.1.2 May 27, 2021
3.1.1 Mar 02, 2021
3.0.3 Jun 23, 2021
3.0.2 Feb 19, 2021
3.0.1 Sep 07, 2020
3.0.0 Jun 16, 2020
2.4.8 May 15, 2021
2.4.7 Sep 12, 2020
2.4.6 Jun 06, 2020
2.4.5 Feb 06, 2020
2.4.4 Aug 31, 2019
2.4.3 May 07, 2019
2.4.2 Apr 24, 2019
2.4.1 Apr 01, 2019
2.4.0 Nov 05, 2018
2.3.4 Sep 09, 2019
2.3.3 Feb 15, 2019
2.3.2 Sep 25, 2018
2.3.1 Jun 08, 2018
2.3.0 Feb 28, 2018
2.2.3 Jan 13, 2019
2.2.2 Jul 03, 2018
2.2.1 Jan 06, 2018
2.2.0.post0 Jul 12, 2017
2.2.0
2.1.3 Jun 29, 2018
2.1.2 Oct 25, 2017
Extras:
Dependencies:
py4j (<0.10.9.10,>=0.10.9.7)
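The constraint above reads as "at least 0.10.9.7 and below 0.10.9.10". A rough sketch of that check using plain tuple comparison follows; `parse_release` and `py4j_version_ok` are hypothetical helpers, and the code assumes purely numeric, dot-separated versions (no pre-release or local suffixes). For anything beyond that, `packaging.specifiers.SpecifierSet` is the more robust tool.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Turn '0.10.9.7' into (0, 10, 9, 7); numeric components only."""
    return tuple(int(part) for part in version.split("."))


PY4J_MIN = parse_release("0.10.9.7")   # inclusive lower bound
PY4J_MAX = parse_release("0.10.9.10")  # exclusive upper bound


def py4j_version_ok(version: str) -> bool:
    """True when version satisfies >=0.10.9.7,<0.10.9.10."""
    return PY4J_MIN <= parse_release(version) < PY4J_MAX
```

The installed version string can be obtained with `importlib.metadata.version("py4j")` and passed to `py4j_version_ok`.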