API Reference¶

Reference documentation is generated from Python docstrings in the grumpy package. For narrative tutorials, start at Home and follow the section links at the bottom of each page.

Top-level API¶

Grumpy: high-performance numerical computing on ragged and nested data.

Grumpy provides Awkward-like layout semantics with strong typing, explicit nullability, mutable arrays, Zarr-backed I/O, and optional compilation of streaming transforms.

Layouts

Arrays use either list-chains (ListOffset -> … -> Leaf) or UnionScalarList (mixed scalar/list rows at one axis). Both are constructed with `array`, persisted to Zarr, streamed, and used in dataframes.
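The list-chain idea can be illustrated with a plain-Python sketch of a single ListOffset -> Leaf level: a flat leaf buffer plus an offsets list that marks row boundaries. The helper names `to_offsets` and `from_offsets` are hypothetical and not part of grumpy, and the library's actual in-memory layout may differ.

```python
def to_offsets(rows):
    """Flatten ragged rows into (offsets, leaf) for one ListOffset level."""
    offsets = [0]
    leaf = []
    for row in rows:
        leaf.extend(row)
        offsets.append(len(leaf))  # end position of this row in the leaf buffer
    return offsets, leaf


def from_offsets(offsets, leaf):
    """Rebuild the ragged rows from consecutive offset pairs."""
    return [leaf[start:stop] for start, stop in zip(offsets, offsets[1:])]


rows = [[1, 2, 3], [], [4, 5]]
offsets, leaf = to_offsets(rows)
print(offsets)  # [0, 3, 3, 5]
print(leaf)     # [1, 2, 3, 4, 5]
```

An empty row shows up as two equal consecutive offsets, which is why ragged data needs no padding in this representation.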

Notes

Streaming supports axis-0 and batch_on batching, shuffle, DDP, and I/O prefetch on both layout paths.
gr.compile accepts a restricted subset of Python (see `compile`); scalar elementwise opcodes fuse on union batches as well as list-chains.

bool_ `module-attribute` ¶

bool_ = bool_()

char `module-attribute` ¶

char = char()

float16 `module-attribute` ¶

float16 = float16()

float32 `module-attribute` ¶

float32 = float32()

float64 `module-attribute` ¶

float64 = float64()

int16 `module-attribute` ¶

int16 = int16()

int32 `module-attribute` ¶

int32 = int32()

int64 `module-attribute` ¶

int64 = int64()

int8 `module-attribute` ¶

int8 = int8()

string `module-attribute` ¶

string = string()

uint16 `module-attribute` ¶

uint16 = uint16()

uint32 `module-attribute` ¶

uint32 = uint32()

uint64 `module-attribute` ¶

uint64 = uint64()

uint8 `module-attribute` ¶

uint8 = uint8()

abs ¶

abs(x: GrumpyArray) -> GrumpyArray

add ¶

add(a: GrumpyArray, b: GrumpyArray, out: GrumpyArray | None = None) -> GrumpyArray

Elementwise add with optional pre-allocated out.

angle ¶

angle(x: GrumpyArray) -> GrumpyArray

argwhere ¶

argwhere(cond: GrumpyArray) -> GrumpyArray

array ¶

array(obj, dtype: DType | None = None) -> GrumpyArray

Construct a GrumpyArray from Python scalars / nested lists or tuples.

Parameters:

Name	Type	Description	Default
`obj`		Python scalar or nested Python sequences (lists/tuples) of arbitrary depth.	required
`dtype`	`DType \| None`	Optional explicit dtype. If omitted, dtype is inferred from non-null leaves.	`None`

bincount ¶

bincount(x: GrumpyArray, weights: GrumpyArray | None = None, minlength: int = 0) -> GrumpyArray

can_cast ¶

can_cast(from_dtype: DType, to_dtype: DType, casting: str = 'safe') -> bool

Return whether from_dtype can be cast to to_dtype under casting.

cat ¶

cat(arrays: list[GrumpyArray], dim: int = 0) -> GrumpyArray

Concatenate arrays along a ragged dimension.

ceil ¶

ceil(x: GrumpyArray) -> GrumpyArray

cos ¶

cos(x: GrumpyArray) -> GrumpyArray

cross ¶

cross(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

dataframe ¶

dataframe(mapping: dict, schema=None)

det ¶

det(a: GrumpyArray)

digitize ¶

digitize(x: GrumpyArray, bins: GrumpyArray, right: bool = False) -> GrumpyArray

dot ¶

dot(a: GrumpyArray, b: GrumpyArray)

einsum ¶

einsum(subscripts: str, *operands)

equal ¶

equal(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

exp ¶

exp(x: GrumpyArray) -> GrumpyArray

floor ¶

floor(x: GrumpyArray) -> GrumpyArray

full_like ¶

full_like(x: GrumpyArray, fill_value, dtype: DType | None = None) -> GrumpyArray

Create an array with the same ragged structure as x, filled with fill_value.

gpu_available ¶

gpu_available() -> bool

Return True when a GPU backend (Metal or CUDA) is available.

gpu_backend ¶

gpu_backend() -> str | None

Return 'metal', 'cuda', or None if no GPU backend is active.

greater ¶

greater(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

greater_equal ¶

greater_equal(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

grid_pool ¶

grid_pool(x: GrumpyArray, grid_size: tuple[int, int, int], *, origin: tuple[float, float, float] | None = None, voxel_size: tuple[float, float, float] | None = None, dim: int = 1) -> GrumpyArray

Voxelize point clouds by counting points per grid cell (occupancy pooling).

Returns (n_groups, nx*ny*nz) occupancy grids per group.
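As a rough illustration of occupancy pooling for a single group, the sketch below counts points per voxel in plain Python. It assumes row-major flattening of the (nx, ny, nz) grid and silently drops points that fall outside it; both choices are assumptions made for the example, not documented behavior of `grid_pool`. The function name `occupancy_grid` is hypothetical.

```python
def occupancy_grid(points, grid_size, origin, voxel_size):
    """Count points per voxel; returns a flat list of length nx*ny*nz."""
    nx, ny, nz = grid_size
    counts = [0] * (nx * ny * nz)
    for point in points:
        # Integer voxel coordinates along each axis (floor division).
        ix, iy, iz = (
            int((c - o) // v) for c, o, v in zip(point, origin, voxel_size)
        )
        if 0 <= ix < nx and 0 <= iy < ny and 0 <= iz < nz:
            # Assumed row-major flattening: x varies slowest, z fastest.
            counts[(ix * ny + iy) * nz + iz] += 1
    return counts


points = [(0.5, 0.5, 0.5), (0.2, 0.1, 0.9), (1.5, 0.5, 0.5), (3.0, 0.0, 0.0)]
grid = occupancy_grid(points, (2, 2, 2), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
print(grid)  # [2, 0, 0, 0, 1, 0, 0, 0]
```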

histogram ¶

histogram(x: GrumpyArray, bins: int = 10, range: tuple[float, float] | None = None, density: bool = False, weights: GrumpyArray | None = None) -> tuple[GrumpyArray, GrumpyArray]

inner ¶

inner(a: GrumpyArray, b: GrumpyArray)

inv ¶

inv(a: GrumpyArray) -> GrumpyArray

isfinite ¶

isfinite(x: GrumpyArray) -> GrumpyArray

isin ¶

isin(x: GrumpyArray, test_elements: GrumpyArray) -> GrumpyArray

isinf ¶

isinf(x: GrumpyArray) -> GrumpyArray

isnan ¶

isnan(x: GrumpyArray) -> GrumpyArray

less ¶

less(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

less_equal ¶

less_equal(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

load ¶

load(path: str)

log ¶

log(x: GrumpyArray) -> GrumpyArray

log10 ¶

log10(x: GrumpyArray) -> GrumpyArray

log2 ¶

log2(x: GrumpyArray) -> GrumpyArray

logical_and ¶

logical_and(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

logical_not ¶

logical_not(a: GrumpyArray) -> GrumpyArray

logical_or ¶

logical_or(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

logical_xor ¶

logical_xor(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

median ¶

median(x: GrumpyArray, dim: int = 0) -> GrumpyArray

multiply ¶

multiply(a: GrumpyArray, b: GrumpyArray, out: GrumpyArray | None = None) -> GrumpyArray

Elementwise multiply with optional pre-allocated out (NumPy out= style).

nanmedian ¶

nanmedian(x: GrumpyArray, dim: int = 0) -> GrumpyArray

nanpercentile ¶

nanpercentile(x: GrumpyArray, q: float, dim: int = 0) -> GrumpyArray

nanquantile ¶

nanquantile(x: GrumpyArray, q: float, dim: int = 0) -> GrumpyArray

nanstd ¶

nanstd(x: GrumpyArray, dim: int = 0, ddof: int = 0) -> GrumpyArray

nanvar ¶

nanvar(x: GrumpyArray, dim: int = 0, ddof: int = 0) -> GrumpyArray

neighbors ¶

neighbors(query: GrumpyArray, data: GrumpyArray, k: int | None = None, radius: float | None = None, dim: int = 0, loop: bool = True, return_distances: bool = False, gpu: bool | str = False)

Compute neighbors and return an edge index (and optionally distances).

Parameters:

Name	Type	Description	Default
`query`	`GrumpyArray`	Coordinate arrays. For `dim=1`, shape is `(n_groups, n_points, d)`.	required
`data`	`GrumpyArray`	Coordinate arrays. For `dim=1`, shape is `(n_groups, n_points, d)`.	required
`k`	`int \| None`	Number of nearest neighbors (mutually exclusive with `radius`).	`None`
`radius`	`float \| None`	Include all neighbors within this distance (mutually exclusive with `k`).	`None`
`dim`	`int`	Axis along which groups of points live (`0` or `1` for grouped clouds).	`0`
`loop`	`bool`	If `True`, include self-matches where query and data share the same point index.	`True`
`return_distances`	`bool`	If `True`, also return distances aligned with the neighbor axis.	`False`
`gpu`	`bool \| str`	`False` (default) uses CPU. `True` always uses GPU when available (Metal on macOS, CUDA on Linux with `--features cuda`). `'auto'` selects GPU only for large enough batches. Use `gpu_available` to check runtime support.	`False`

Returns:

Name	Type	Description
`edge_index`		Ragged edge index with last axis length 2: `[src, dst]`.
`distances`		Optional; returned when `return_distances=True`.
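The effect of `k` and `loop` on the edge index can be sketched with a brute-force k-nearest-neighbor search in plain Python. The example assumes `src` is the query index and `dst` is the data index, which the reference above does not state; `knn_edges` is a hypothetical helper, not the grumpy implementation.

```python
import math


def knn_edges(query, data, k, loop=True):
    """Brute-force kNN returning [src, dst] pairs (assumed: src=query, dst=data)."""
    edges = []
    for qi, q in enumerate(query):
        # With loop=False, skip the data point that shares the query's index.
        candidates = [
            (math.dist(q, d), di)
            for di, d in enumerate(data)
            if loop or di != qi
        ]
        candidates.sort()
        edges.extend([qi, di] for _, di in candidates[:k])
    return edges


pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
print(knn_edges(pts, pts, k=1, loop=True))   # [[0, 0], [1, 1], [2, 2]]
print(knn_edges(pts, pts, k=1, loop=False))  # [[0, 1], [1, 0], [2, 1]]
```

With `loop=True` and identical query and data sets, each point's nearest neighbor is itself, so `loop=False` is usually what graph construction wants.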

nonzero ¶

nonzero(x: GrumpyArray) -> GrumpyArray

norm ¶

norm(a: GrumpyArray)

not_equal ¶

not_equal(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

ones_like ¶

ones_like(x: GrumpyArray, dtype: DType | None = None) -> GrumpyArray

Create an array with the same ragged structure as x, filled with ones.

outer ¶

outer(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

pairwise_distances ¶

pairwise_distances(x: GrumpyArray, *, dim: int = 1) -> GrumpyArray

All-pairs Euclidean distances within each point cloud (group).

For dim=1, input shape is (n_groups, n_points, d); output is (n_groups, n_points, n_points) distance matrices.
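The shape relationship can be checked with a small plain-Python sketch that computes one distance matrix per group. It uses `math.dist` from the standard library rather than grumpy, and the helper name `pairwise` is hypothetical.

```python
import math


def pairwise(cloud):
    """All-pairs Euclidean distances within one point cloud."""
    return [[math.dist(p, q) for q in cloud] for p in cloud]


# Two groups with different point counts (ragged along the point axis).
clouds = [[(0.0, 0.0), (3.0, 4.0)], [(1.0, 1.0)]]
dist = [pairwise(cloud) for cloud in clouds]
print(dist[0])  # [[0.0, 5.0], [5.0, 0.0]]
print(dist[1])  # [[0.0]]
```

Each group yields a square matrix whose side equals that group's point count, so the result stays ragged when groups differ in size.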

percentile ¶

percentile(x: GrumpyArray, q: float, dim: int = 0) -> GrumpyArray

promote_types ¶

promote_types(a: DType, b: DType) -> DType

NumPy-style binary result dtype for two dtypes.

quantile ¶

quantile(x: GrumpyArray, q: float, dim: int = 0) -> GrumpyArray

reciprocal ¶

reciprocal(x: GrumpyArray) -> GrumpyArray

rng ¶

rng(seed: int = 0) -> Generator

Create a reproducible random `grumpy.Generator`.

round ¶

round(x: GrumpyArray) -> GrumpyArray

save ¶

save(obj, path: str, chunk_size: int = 1024, chunk_dim=None)

Save a GrumpyArray/DataFrame, or incrementally write batches from a generator.

search_sorted ¶

search_sorted(x: GrumpyArray, v: GrumpyArray, right: bool = False) -> GrumpyArray

setdiff ¶

setdiff(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

setunion ¶

setunion(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

setxor ¶

setxor(a: GrumpyArray, b: GrumpyArray) -> GrumpyArray

sign ¶

sign(x: GrumpyArray) -> GrumpyArray

sin ¶

sin(x: GrumpyArray) -> GrumpyArray

sqrt ¶

sqrt(x: GrumpyArray) -> GrumpyArray

std ¶

std(x: GrumpyArray, dim: int = 0, ddof: int = 0) -> GrumpyArray

subtract ¶

subtract(a: GrumpyArray, b: GrumpyArray, out: GrumpyArray | None = None) -> GrumpyArray

Elementwise subtract with optional pre-allocated out.

tan ¶

tan(x: GrumpyArray) -> GrumpyArray

tensordot ¶

tensordot(a: GrumpyArray, b: GrumpyArray, axes: int = 2)

trace ¶

trace(a: GrumpyArray)

unique ¶

unique(x: GrumpyArray) -> GrumpyArray

var ¶

var(x: GrumpyArray, dim: int = 0, ddof: int = 0) -> GrumpyArray

where ¶

where(cond: GrumpyArray, x: GrumpyArray | None = None, y: GrumpyArray | None = None)

zeros_like ¶

zeros_like(x: GrumpyArray, dtype: DType | None = None) -> GrumpyArray

Create an array with the same ragged structure as x, filled with zeros.

Core types¶

Streaming¶

Streaming iterators and parallel batch transforms for saved Grumpy datasets.

This module provides `Stream` and `StreamApply` for batching over Zarr-backed stores written by `grumpy.save`.

Features

Axis-0 batching with optional batch_on schema-level packing (list-chain and union layouts)
Reproducible batch-order shuffle and within-batch shuffle on a schema level
DDP sharding via world_size / rank
I/O prefetch via workers (distinct from StreamApply transform parallelism)
Partial batch reads (leaf ranges only) via the Rust StreamBatchesIter
Compact union partial I/O: slice tags/index and referenced scalar/list pools only
Subset iteration via st[index] (int, slice, or sequence of batch indices)
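A reproducible batch-order shuffle, as listed above, can be sketched by permuting batch indices with a seeded generator. This uses Python's `random.Random` purely for illustration; the generator and algorithm that grumpy actually uses are not documented here, and `shuffled_batch_order` is a hypothetical name.

```python
import random


def shuffled_batch_order(num_batches, seed):
    """Return a permutation of batch indices that depends only on the seed."""
    order = list(range(num_batches))
    random.Random(seed).shuffle(order)  # private RNG, global state untouched
    return order


first = shuffled_batch_order(8, seed=42)
second = shuffled_batch_order(8, seed=42)
print(first == second)  # True
```

Because only the order of batch indices changes, every batch is still visited exactly once per pass.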

Notes

Indexed layouts are not yet supported for streaming slice loads.
Compiled Rust scheduling supports a restricted opcode set (see compiler.py); scalar elementwise opcodes work on UnionScalarList batches.

Stream `dataclass` ¶

Stream(path: str, batch_size: int, drop_last: bool = False, batch_on: Optional[str] = None, shuffle: Optional[Union[str, bool]] = None, seed: Optional[int] = None, workers: int = 0, in_memory: bool = False, gpu: Union[bool, str] = False, world_size: int = 1, rank: int = 0, batch_indices: Optional[tuple[int, ...]] = None)

Iterator over batches of a saved `grumpy.GrumpyArray` or dataframe.

Parameters:

Name	Type	Description	Default
`path`	`str`	Path passed to `grumpy.save` (Zarr directory store).	required
`batch_size`	`int`	Maximum number of axis-0 elements (or `batch_on` entities) per batch.	required
`drop_last`	`bool`	If `True`, drop the final partial batch.	`False`
`batch_on`	`Optional[str]`	Optional schema level name (e.g. `'molecule'`) to pack batches by entity count at that nesting depth instead of axis 0.	`None`
`shuffle`	`Optional[Union[str, bool]]`	If set (e.g. `'molecule'`), shuffle batch order with `seed` and optionally shuffle within each batch on that schema axis after loading.	`None`
`seed`	`Optional[int]`	Random seed for `shuffle` (required for reproducible training).	`None`
`workers`	`int`	Number of background I/O prefetch slots and parallel loader threads (`0` = synchronous loads).	`0`
`in_memory`	`bool`	If `True`, load the entire dataset into RAM once at stream open; batches are zero-copy slices.	`False`
`world_size`	`int`	DDP world size; batches are partitioned as `index % world_size == rank`.	`1`
`rank`	`int`	DDP rank in `[0, world_size)`.	`0`

Examples:

>>> import grumpy as gr
>>> gr.save(gr.array(list(range(100))), 'data.gr')
>>> st = gr.stream('data.gr', batch_size=32)
>>> len(st)
4
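The batch count of 4 in the example follows from simple arithmetic: 100 elements at 32 per batch give three full batches plus one partial batch of 4. A minimal sketch of that calculation, including the effect of `drop_last`, is shown here; `num_batches` is a hypothetical helper, not a grumpy function.

```python
import math


def num_batches(n_items, batch_size, drop_last=False):
    """Number of batches for n_items axis-0 elements."""
    if drop_last:
        return n_items // batch_size  # the partial final batch is discarded
    return math.ceil(n_items / batch_size)


print(num_batches(100, 32))                  # 4
print(num_batches(100, 32, drop_last=True))  # 3
```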

batch_indices `class-attribute` `instance-attribute` ¶

batch_indices: Optional[tuple[int, ...]] = None

batch_on `class-attribute` `instance-attribute` ¶

batch_on: Optional[str] = None

batch_size `instance-attribute` ¶

batch_size: int

drop_last `class-attribute` `instance-attribute` ¶

drop_last: bool = False

gpu `class-attribute` `instance-attribute` ¶

gpu: Union[bool, str] = False

in_memory `class-attribute` `instance-attribute` ¶

in_memory: bool = False

path `instance-attribute` ¶

path: str

rank `class-attribute` `instance-attribute` ¶

rank: int = 0

seed `class-attribute` `instance-attribute` ¶

seed: Optional[int] = None

shuffle `class-attribute` `instance-attribute` ¶

shuffle: Optional[Union[str, bool]] = None

workers `class-attribute` `instance-attribute` ¶

workers: int = 0

world_size `class-attribute` `instance-attribute` ¶

world_size: int = 1

__getitem__ ¶

__getitem__(index: Union[int, slice, Sequence[int]]) -> 'Stream'

Return a stream over a subset of batches (after DDP sharding).

__iter__ ¶

__iter__() -> Iterator

Yield consecutive batches loaded from disk.

__len__ ¶

__len__() -> int

Return the number of batches (after DDP sharding, before shuffle).
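The DDP partitioning rule stated for `world_size` (`index % world_size == rank`) can be sketched as follows, which also shows why the length differs per rank. `shard` is a hypothetical helper written for this illustration.

```python
def shard(num_batches, world_size, rank):
    """Batch indices owned by one rank under index % world_size == rank."""
    return [i for i in range(num_batches) if i % world_size == rank]


print(shard(10, world_size=4, rank=1))  # [1, 5, 9]

# Every batch belongs to exactly one rank.
all_indices = sorted(i for r in range(4) for i in shard(10, 4, r))
print(all_indices == list(range(10)))   # True
```

With 10 batches and 4 ranks, ranks 0 and 1 see three batches while ranks 2 and 3 see two, so per-rank lengths can differ by one.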

__post_init__ ¶

__post_init__() -> None

apply ¶

apply(fns: Union[Callable[[T], T], Sequence[Callable[[T], T]]], cpu: int = 1, prefetch: Optional[int] = None, compile: Union[bool, str] = 'auto', scheduler: str = 'auto') -> 'StreamApply[T]'

Apply one or more batch transforms, optionally compiled and parallelized.

Parameters:

Name	Type	Description	Default
`fns`	`Union[Callable[[T], T], Sequence[Callable[[T], T]]]`	Callable or sequence of callables `fn(batch) -> batch`.	required
`cpu`	`int`	Worker count for parallel apply (`1` = serial).	`1`
`prefetch`	`Optional[int]`	Max in-flight batches for threaded scheduling (default `2 * cpu`).	`None`
`compile`	`Union[bool, str]`	`True`/`'force'`, `False`/`'never'`, or `'auto'`.	`'auto'`
`scheduler`	`str`	`'auto'`, `'python'`, or `'rust'` (Rayon for fully compiled ops).	`'auto'`

Returns:

Type	Description
`StreamApply`	Lazy iterable of transformed batches.
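The lazy, chained behavior described above can be modeled in plain Python with a generator that applies each callable in order to every batch. This is only a serial sketch (the equivalent of `cpu=1` with no compilation), using lists in place of grumpy batches; `apply_all` is a hypothetical name.

```python
def apply_all(batches, fns):
    """Lazily apply one callable, or a sequence of callables, to each batch."""
    if callable(fns):
        fns = [fns]
    for batch in batches:
        for fn in fns:
            batch = fn(batch)  # each transform feeds the next
        yield batch


def double(batch):
    return [x * 2 for x in batch]


def increment(batch):
    return [x + 1 for x in batch]


result = list(apply_all([[1, 2], [3]], [double, increment]))
print(result)  # [[3, 5], [7]]
```

Nothing runs until the generator is iterated, which mirrors the "lazy iterable" return value.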

StreamApply `dataclass` ¶

StreamApply(base: Stream, fns: list[Callable[[T], T]], cpu: int = 1, prefetch: Optional[int] = None, compile: Union[bool, str] = 'auto', scheduler: str = 'auto', gpu: Union[bool, str] = False)

Bases: Iterable[T]

Lazy iterable of transformed batches produced from a `Stream`.

base `instance-attribute` ¶

base: Stream

compile `class-attribute` `instance-attribute` ¶

compile: Union[bool, str] = 'auto'

cpu `class-attribute` `instance-attribute` ¶

cpu: int = 1

fns `instance-attribute` ¶

fns: list[Callable[[T], T]]

gpu `class-attribute` `instance-attribute` ¶

gpu: Union[bool, str] = False

prefetch `class-attribute` `instance-attribute` ¶

prefetch: Optional[int] = None

scheduler `class-attribute` `instance-attribute` ¶

scheduler: str = 'auto'

__iter__ ¶

__iter__() -> Iterator[T]

Compilation¶

Compile restricted batch transforms into fused Rust execution plans.

The compiler analyzes straight-line Python functions (typically def f(batch): ...) and builds `grumpy._core.CompiledPlan` opcode lists for use with `grumpy.stream.Stream.apply` or the `compile` decorator.

Supported inputs

List-chain and UnionScalarList batches for scalar elementwise opcodes (batch * 2, batch + 1, …).
Reduction and neighbor opcodes when the underlying Rust kernel supports the layout.

Known limitations

No control flow (if/for/try), no imports, single batch parameter.
Rust scheduling supports only a fixed opcode set (see stream.py).
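The limitations above (no control flow, no imports, a single batch parameter) can be approximated with a static check built on the standard `ast` module. This is a sketch of the kind of analysis involved, not grumpy's compiler; `check_straight_line` and its messages are hypothetical, and the real opcode checks are stricter.

```python
import ast

# Constructs that the limitations above rule out.
FORBIDDEN = (ast.If, ast.For, ast.While, ast.Try, ast.Import, ast.ImportFrom)


def check_straight_line(source):
    """Return None if the source looks compilable, else a reason string."""
    tree = ast.parse(source)
    funcs = [node for node in tree.body if isinstance(node, ast.FunctionDef)]
    if len(funcs) != 1:
        return "expected exactly one function definition"
    fn = funcs[0]
    # Counts only ordinary positional parameters.
    if len(fn.args.args) != 1:
        return "expected a single batch parameter"
    for node in ast.walk(fn):
        if isinstance(node, FORBIDDEN):
            return f"unsupported construct: {type(node).__name__}"
    return None


ok_src = "def f(batch):\n    return batch * 2\n"
bad_src = "def f(batch):\n    if batch:\n        return batch\n    return batch\n"
print(check_straight_line(ok_src))   # None
print(check_straight_line(bad_src))  # unsupported construct: If
```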

CompiledTransform ¶

CompiledTransform(fn: Callable[[Any], Any], result: _CompileResult)

Callable wrapper that runs a compiled Rust plan when possible.

Instances are returned by `compile` and used internally by `grumpy.stream.Stream.apply`.

Attributes:

Name	Type	Description
`is_compiled`	`bool`	Whether a Rust `grumpy._core.CompiledPlan` was built.
`compile_error`	`str or None`	Compilation failure message when `is_compiled` is `False`.

Examples:

>>> import grumpy as gr
>>> @gr.compile
... def scale(batch):
...     return batch * 2
...
>>> scale.is_compiled
True
>>> scale(gr.array([1, 2])).to_list()
[2, 4]

__doc__ `instance-attribute` ¶

__doc__ = getattr(fn, '__doc__', None)

__name__ `instance-attribute` ¶

__name__ = getattr(fn, '__name__', 'compiled_transform')

__qualname__ `instance-attribute` ¶

__qualname__ = getattr(fn, '__qualname__', __name__)

compile_error `property` ¶

compile_error: Optional[str]

Return the compilation error message, or None on success.

Returns:

Type	Description
`str or None`	Error text when compilation failed.

Examples:

>>> import grumpy as gr
>>> @gr.compile
... def ok(b): return b * 2
...
>>> ok.compile_error is None
True

is_compiled `property` ¶

is_compiled: bool

Return True when a Rust `grumpy._core.CompiledPlan` was built.

Returns:

Type	Description
`bool`	Compilation success flag.

Examples:

>>> import grumpy as gr
>>> @gr.compile
... def f(b): return b
...
>>> f.is_compiled
True

__call__ ¶

__call__(batch)

Run the compiled plan or fall back to the original Python function.

Parameters:

Name	Type	Description	Default
`batch`	`GrumpyArray or GrumpyDataFrame`	Input batch.	required

Returns:

Type	Description
`GrumpyArray or GrumpyDataFrame`	Transformed batch.

Examples:

>>> import grumpy as gr
>>> @gr.compile
... def double(batch):
...     return batch * 2
...
>>> double(gr.array([1, 2])).to_list()
[2, 4]
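The "run the plan or fall back" behavior can be pictured with a small wrapper class. This is a conceptual sketch only: `FallbackTransform` and its `plan` argument are hypothetical stand-ins, and the internals of `CompiledTransform` are not documented here.

```python
class FallbackTransform:
    """Call a fast plan when one exists, otherwise the original function."""

    def __init__(self, fn, plan=None, error=None):
        self.fn = fn
        self.plan = plan            # stand-in for a compiled plan (assumption)
        self.compile_error = error  # message explaining why no plan was built

    @property
    def is_compiled(self):
        return self.plan is not None

    def __call__(self, batch):
        if self.plan is not None:
            return self.plan(batch)
        return self.fn(batch)       # fallback: plain Python path


def double(batch):
    return [x * 2 for x in batch]


compiled = FallbackTransform(double, plan=double)
fallback = FallbackTransform(double, error="unsupported construct")
print(compiled([1, 2]), compiled.is_compiled)  # [2, 4] True
print(fallback([1, 2]), fallback.is_compiled)  # [2, 4] False
```

Both paths return the same result, so callers only need `is_compiled` and `compile_error` when diagnosing performance.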

compile ¶

compile(fn: Callable[[Any], Any]) -> CompiledTransform

Compile a restricted batch transform into a Rust execution plan.

Parameters:

Name	Type	Description	Default
`fn`	`callable`	Function `fn(batch) -> batch` with straight-line Python only.	required

Returns:

Type	Description
`CompiledTransform`	Callable wrapper that executes the plan when compilation succeeds.

Examples:

>>> import grumpy as gr
>>> @gr.compile
... def scale(batch):
...     batch = batch * 2
...     return batch
...
>>> scale(gr.array([1, 2])).to_list()
[2, 4]

Next: Developer — repository layout, implementation notes, and error handling.
