awkward 0.6.0


pip install awkward==0.6.0

Meta
Author: Jim Pivarski (DIANA-HEP)
Maintainer: Jim Pivarski (DIANA-HEP)

Classifiers

Development Status
  • 4 - Beta

Intended Audience
  • Developers
  • Information Technology
  • Science/Research

License
  • OSI Approved :: BSD License

Operating System
  • MacOS
  • POSIX
  • Unix

Programming Language
  • Python
  • Python :: 2.7
  • Python :: 3.4
  • Python :: 3.5
  • Python :: 3.6
  • Python :: 3.7

Topic
  • Scientific/Engineering
  • Scientific/Engineering :: Information Analysis
  • Scientific/Engineering :: Mathematics
  • Scientific/Engineering :: Physics
  • Software Development
  • Utilities

awkward-array is a pure Python+Numpy library for manipulating complex data structures as you would Numpy arrays. Even if your data structures

  • contain variable-length lists (jagged or ragged),

  • are deeply nested (records or structs),

  • have different data types in the same list (heterogeneous),

  • are masked, bit-masked, or index-mapped (nullable),

  • contain cross-references or even cyclic references,

  • need to be Python class instances on demand,

  • are not defined at every point (sparse),

  • are not contiguous in memory,

  • should not be loaded into memory all at once (lazy),

this library can access them with the efficiency of Numpy arrays. They may be converted from JSON or Python data, loaded from “awkd” files, HDF5, Parquet, or ROOT files, or they may be views into memory buffers like Arrow.

Consider this monstrosity:

import awkward
array = awkward.fromiter([[1.1, 2.2, None, 3.3, None],
                          [4.4, [5.5]],
                          [{"x": 6, "y": {"z": 7}}, None, {"x": 8, "y": {"z": 9}}]
                         ])

It’s a list of lists; the first contains numbers and None, the second contains a sub-sub-list, and the third defines nested records. If we print this out, we see that it is called a JaggedArray:

array
# returns <JaggedArray [[1.1 2.2 None 3.3 None] [4.4 [5.5]] [<Row 0> None <Row 1>]] at 79093e598f98>

and we get the full Python structure back by calling array.tolist():

array.tolist()
# returns [[1.1, 2.2, None, 3.3, None],
#          [4.4, [5.5]],
#          [{'x': 6, 'y': {'z': 7}}, None, {'x': 8, 'y': {'z': 9}}]]

But we can also manipulate it as though it were a Numpy array. We can, for instance, take the first two elements of each sub-list (slicing the second dimension):

array[:, :2]
# returns <JaggedArray [[1.1 2.2] [4.4 [5.5]] [<Row 0> None]] at 79093e5ab080>

or the last two:

array[:, -2:]
# returns <JaggedArray [[3.3 None] [4.4 [5.5]] [None <Row 1>]] at 79093e5ab3c8>

Internally, the data has been rearranged into a columnar form, with all values at a given level of hierarchy in the same array. Numpy-like slicing, masking, and fancy indexing are translated into Numpy operations on these internal arrays: they are not implemented with Python for loops!
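As a rough illustration of what "columnar form" means, the sketch below stores a jagged list of numbers as one flat Numpy array plus an offsets array, then takes the first two elements of each sub-list using only Numpy operations. It is a simplified stand-in, not awkward-array's actual implementation; the names content, offsets, and first_n are assumptions made for this example.

```python
import numpy

# Jagged data [[1.1, 2.2, 3.3], [4.4], [5.5, 6.6, 7.7, 8.8]] in columnar form:
# every value lives in one flat array, and offsets mark sub-list boundaries.
content = numpy.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8])
offsets = numpy.array([0, 3, 4, 8])

def first_n(content, offsets, n):
    """Hypothetical helper: keep at most n leading items of each sub-list."""
    starts, stops = offsets[:-1], offsets[1:]
    counts = numpy.minimum(stops - starts, n)        # new length of each sub-list
    new_offsets = numpy.concatenate(([0], numpy.cumsum(counts)))
    # Position of each kept item within its own sub-list: 0, 1, ..., count - 1.
    within = numpy.arange(new_offsets[-1]) - numpy.repeat(new_offsets[:-1], counts)
    picked = content[numpy.repeat(starts, counts) + within]
    return picked, new_offsets

picked, new_offsets = first_n(content, offsets, 2)
```

Here picked holds the values of [[1.1, 2.2], [4.4], [5.5, 6.6]] laid end to end, and new_offsets records where each shortened sub-list begins and ends. No Python loop runs over the sub-lists.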

To see some of this structure, ask for the content of the array:

array.content
# returns <IndexedMaskedArray [1.1 2.2 None ... <Row 0> None <Row 1>] at 79093e598ef0>

Notice that the boundaries between sub-lists are gone: they exist only at the JaggedArray level. This IndexedMaskedArray level handles the None values in the data. If we dig further, we’ll find a UnionArray to handle the mixture of sub-lists and sub-sub-lists and record structures. If we dig deeply enough, we’ll find the numerical data:

array.content.content.contents[0]
# returns array([1.1, 2.2, 3.3, 4.4])
array.content.content.contents[1].content
# returns array([5.5])
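The role of the masking layer can be sketched with plain Numpy. This assumes a simplified layout in which an index array points into a dense array of non-missing values and a negative index marks None; the names dense, index, and to_list are illustrative, and the real IndexedMaskedArray may differ in detail.

```python
import numpy

# [1.1, 2.2, None, 3.3, None] split into dense values plus an index.
dense = numpy.array([1.1, 2.2, 3.3])
index = numpy.array([0, 1, -1, 2, -1])   # -1 marks a missing value (assumed convention)

is_valid = index >= 0                     # boolean mask computed in one vectorized step

def to_list(dense, index):
    """Hypothetical helper: rebuild the Python view, restoring None where masked."""
    return [dense[i].item() if i >= 0 else None for i in index]

# A vectorized operation touches only the dense values; the index is reused as-is.
shifted = dense + 100
```

Because the missing entries never occupy a slot in dense, arithmetic on the values does not need to special-case None.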

Perhaps most importantly, Numpy’s universal functions (operations that apply to every element in an array) can be used on our array. This, too, goes straight to the columnar data and preserves structure.

array + 100
# returns <JaggedArray [[101.1 102.2 None 103.3 None]
#                       [104.4 [105.5]]
#                       [<Row 0> None <Row 1>]] at 724509ffe2e8>

(array + 100).tolist()
# returns [[101.1, 102.2, None, 103.3, None],
#          [104.4, [105.5]],
#          [{'x': 106, 'y': {'z': 107}}, None, {'x': 108, 'y': {'z': 109}}]]

import numpy
numpy.sin(array)
# returns <JaggedArray [[0.8912073600614354 0.8084964038195901 None -0.1577456941432482 None]
#                       [-0.951602073889516 [-0.70554033]]
#                       [<Row 0> None <Row 1>]] at 70a40c3a61d0>

This does more than match the speed of compiled code: it can exceed the speed of compiled code that operates on non-columnar data, because the operation may be vectorized over awkward-array’s underlying columnar arrays.
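To make the structure-preserving behavior concrete, here is a minimal sketch that assumes a jagged array is held as a flat values array plus offsets. The names content, offsets, and nested are hypothetical, and this is not the library's own code. The ufunc runs once over the flat array, and the unchanged offsets are reused to rebuild the nested view.

```python
import numpy

content = numpy.array([0.0, 1.0, 2.0, 3.0, 4.0])
offsets = numpy.array([0, 2, 2, 5])       # sub-lists: [0, 1], [], [2, 3, 4]

def nested(content, offsets):
    """Hypothetical helper: rebuild nested Python lists for display only."""
    return [content[a:b].tolist() for a, b in zip(offsets[:-1], offsets[1:])]

squared = numpy.square(content)           # one vectorized call, no per-list loop
result = nested(squared, offsets)         # same offsets, so same jagged shape
```

The only loop is in the display helper; the computation itself is a single Numpy call over contiguous memory, which is where the potential speed advantage comes from.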

Installation

Install awkward-array like any other Python package:

pip install awkward

or similar (use sudo, --user, virtualenv, or pip-in-conda if you wish).

Strict dependencies:

  • numpy (>=1.13.1)

Release history

2.8.3 May 15, 2025
2.8.2 May 03, 2025
2.8.1 Mar 24, 2025
2.8.0 Mar 19, 2025
2.7.4 Jan 31, 2025
2.7.3 Jan 25, 2025
2.7.2 Dec 05, 2024
2.7.1 Nov 19, 2024
2.7.0 Nov 08, 2024
2.6.10 Nov 07, 2024
2.6.9 Oct 07, 2024
2.6.8 Sep 12, 2024
2.6.7 Aug 02, 2024
2.6.6 Jun 26, 2024
2.6.5 May 28, 2024
2.6.4 May 03, 2024
2.6.3 Apr 01, 2024
2.6.3rc2 Mar 22, 2024
2.6.2 Mar 05, 2024
2.6.1 Feb 05, 2024
2.6.0 Feb 02, 2024
2.5.2 Jan 12, 2024
2.5.1 Dec 12, 2023
2.5.1rc1 Dec 06, 2023
2.5.0 Nov 16, 2023
2.5.0rc0 Oct 27, 2023
2.4.10 Nov 09, 2023
2.4.9 Nov 06, 2023
2.4.8 Nov 05, 2023
2.4.7 Oct 27, 2023
2.4.6 Oct 12, 2023
2.4.5 Oct 06, 2023
2.4.4 Sep 29, 2023
2.4.3 Sep 19, 2023
2.4.2 Sep 06, 2023
2.4.1 Sep 04, 2023
2.4.0 Sep 04, 2023
2.3.3 Aug 17, 2023
2.3.2 Aug 11, 2023
2.3.1 Jul 05, 2023
2.3.0 Jul 04, 2023
2.2.4 Jun 22, 2023
2.2.3 Jun 15, 2023
2.2.2 Jun 09, 2023
2.2.1 May 19, 2023
2.2.0 May 10, 2023
2.1.4 Apr 25, 2023
2.1.3 Apr 13, 2023
2.1.2 Apr 08, 2023
2.1.1 Mar 18, 2023
2.1.0 Mar 07, 2023
2.0.10 Mar 07, 2023
2.0.9 Mar 03, 2023
2.0.8 Feb 16, 2023
2.0.7 Feb 04, 2023
2.0.6 Jan 13, 2023
2.0.5 Jan 01, 2023
2.0.4 Dec 23, 2022
2.0.3 Dec 23, 2022
2.0.2 Dec 16, 2022
2.0.1 Dec 15, 2022
2.0.0 Dec 10, 2022
2.0.0rc8 Dec 09, 2022
2.0.0rc7 Dec 08, 2022
2.0.0rc6 Dec 06, 2022
2.0.0rc5 Dec 06, 2022
2.0.0rc4 Nov 19, 2022
1.10.5 Oct 05, 2023
1.10.4 Jul 19, 2023
1.10.3 Mar 14, 2023
1.10.2 Nov 08, 2022
1.10.1 Sep 22, 2022
1.10.0 Sep 19, 2022
1.9.0 Sep 02, 2022
1.8.0 Mar 02, 2022
1.7.0 Dec 02, 2021
1.5.1 Oct 14, 2021
1.5.0 Sep 12, 2021
1.4.0 Jul 02, 2021
1.3.0 Jun 01, 2021
1.2.3 May 10, 2021
1.2.2 Apr 12, 2021
1.2.1 Apr 07, 2021
1.2.0 Apr 01, 2021
1.1.2 Feb 11, 2021
1.1.1 Feb 09, 2021
1.1.0 Feb 09, 2021
1.0.2 Jan 06, 2021
1.0.1 Dec 14, 2020
1.0.0 Dec 05, 2020
0.14.0 Nov 03, 2020
0.13.0 Jul 20, 2020
0.12.22 Jul 06, 2020
0.12.21 May 08, 2020
0.12.20 Jan 30, 2020
0.12.19 Jan 04, 2020
0.12.18 Dec 16, 2019
0.12.17 Nov 22, 2019
0.12.16 Nov 15, 2019
0.12.15 Nov 12, 2019
0.12.14 Oct 18, 2019
0.12.13 Oct 08, 2019
0.12.12 Sep 30, 2019
0.12.11 Sep 27, 2019
0.12.10 Sep 16, 2019
0.12.9 Sep 10, 2019
0.12.8 Aug 31, 2019
0.12.7 Aug 27, 2019
0.12.6 Aug 06, 2019
0.12.5 Aug 01, 2019
0.12.4 Jul 29, 2019
0.12.3 Jul 17, 2019
0.12.2 Jul 14, 2019
0.12.1 Jul 10, 2019
0.12.0 Jul 10, 2019
0.12.0rc2 Jul 09, 2019
0.12.0rc1 Jul 09, 2019
0.11.1 Jun 17, 2019
0.11.0 Jun 17, 2019
0.11.0rc8 Jun 14, 2019
0.11.0rc7 Jun 14, 2019
0.11.0rc4 Jun 14, 2019
0.11.0rc3 Jun 14, 2019
0.10.3 May 27, 2019
0.10.2 May 23, 2019
0.10.1 May 20, 2019
0.10.0 May 20, 2019
0.9.0 Apr 12, 2019
0.9.0rc3 Apr 12, 2019
0.9.0rc2 Apr 12, 2019
0.9.0rc1 Apr 11, 2019
0.8.16 Jul 29, 2019
0.8.15 Apr 11, 2019
0.8.14 Mar 29, 2019
0.8.13 Mar 29, 2019
0.8.12 Mar 25, 2019
0.8.11 Mar 11, 2019
0.8.10 Mar 10, 2019
0.8.9 Mar 09, 2019
0.8.8 Mar 09, 2019
0.8.7 Mar 08, 2019
0.8.6 Feb 27, 2019
0.8.5 Feb 27, 2019
0.8.4 Feb 07, 2019
0.8.3 Feb 05, 2019
0.8.2 Feb 01, 2019
0.8.1 Jan 30, 2019
0.8.0 Jan 29, 2019
0.8.0rc13 Jan 29, 2019
0.8.0rc12 Jan 27, 2019
0.8.0rc11 Jan 25, 2019
0.8.0rc10 Jan 25, 2019
0.8.0rc9 Jan 25, 2019
0.8.0rc8 Jan 25, 2019
0.8.0rc7 Jan 25, 2019
0.8.0rc6 Jan 25, 2019
0.8.0rc5 Jan 25, 2019
0.8.0rc4 Jan 25, 2019
0.8.0rc3 Jan 25, 2019
0.8.0rc2 Jan 25, 2019
0.7.3 Jan 27, 2019
0.7.2 Jan 17, 2019
0.7.1 Jan 04, 2019
0.7.0 Dec 13, 2018
0.6.2 Dec 13, 2018
0.6.1 Dec 12, 2018
0.6.0 Dec 08, 2018
0.5.6 Dec 07, 2018
0.5.5 Dec 06, 2018
0.5.4 Dec 03, 2018
0.5.3 Dec 03, 2018
0.5.2 Nov 30, 2018
0.5.1 Nov 29, 2018
0.5.0 Nov 28, 2018
0.4.5 Nov 28, 2018
0.4.4 Nov 19, 2018
0.4.3 Nov 01, 2018
0.4.2 Oct 29, 2018
0.4.1 Oct 26, 2018
0.4.0 Oct 26, 2018
0.3.0 Oct 24, 2018
0.2.1 Oct 16, 2018
0.2.0 Oct 12, 2018
0.1.0 Oct 04, 2018
0.0.10 Sep 28, 2018
0.0.9 Aug 31, 2018
0.0.8 Aug 30, 2018
0.0.7 Aug 25, 2018
0.0.6 Aug 24, 2018
0.0.5 Aug 10, 2018
0.0.4 Aug 08, 2018
0.0.3 Jun 22, 2018
0.0.2 Jun 17, 2018
0.0.1 Jun 17, 2018
0.0rc0 Jan 25, 2019

Wheel compatibility matrix

Platform Python 2 Python 3
any

Extras: None
Dependencies:
numpy (>=1.13.1)