Compilation¶

Python batch loops spend time crossing the interpreter boundary on every operation. @gr.compile and Stream.apply(compile=...) analyze a transform written in a restricted subset of Python, build a CompiledPlan of Rust opcodes, and execute them in one fused pass per batch, often with Rayon scheduling when cpu > 1.

Compilation matters most in Zarr streaming pipelines where the same transform runs thousands of times across epochs. Eager one-off calls on in-memory arrays rarely need it.

What compilation does¶

When compilation succeeds, Grumpy replaces your Python function body with a fixed opcode sequence (scalar elementwise math, reductions, kNN neighbor search, and certain dataframe dot-assignments) executed entirely in Rust while the GIL is released.

Decorate a function or pass it to apply:

import grumpy as gr

@gr.compile
def scale(batch):
    return batch * 2.0 + 1.0

x = gr.array([[1, 2], [3]], dtype=gr.float64)
print(scale(x).to_list())   # [[3.0, 5.0], [7.0]]
print(scale.is_compiled)    # True

The same function inside a stream:

gr.save(x, "data.gr")

st = gr.stream("data.gr", batch_size=1)
for out in st.apply(scale, compile="auto"):
    train_step(out)

If analysis fails, Grumpy falls back to plain Python and emits a one-time warning; the transform still runs correctly.
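
The fallback behavior can be pictured with a small pure-Python sketch. It does not use Grumpy: `FallbackWrapper` and `toy_try_compile` are hypothetical names, and the "analysis" is a stand-in that only checks the parameter count. The attribute names `is_compiled` and `compile_error` mirror the ones used on this page, but how Grumpy populates them is an assumption.

```python
import warnings


def toy_try_compile(fn):
    # Toy rule (assumption): only single-parameter functions are "compilable".
    if fn.__code__.co_argcount != 1:
        raise ValueError("expected exactly one parameter")
    return fn  # a real compiler would return a native plan instead


class FallbackWrapper:
    """Hypothetical wrapper: try to 'compile', otherwise run plain Python."""

    def __init__(self, fn, try_compile):
        self._fn = fn
        self._warned = False
        self.compile_error = None
        try:
            self._plan = try_compile(fn)  # stand-in for static analysis
        except Exception as exc:
            self._plan = None
            self.compile_error = str(exc)

    @property
    def is_compiled(self):
        return self._plan is not None

    def __call__(self, batch):
        if self._plan is not None:
            return self._plan(batch)
        if not self._warned:
            # Warn on the first fallback call only, then stay quiet.
            warnings.warn(f"falling back to Python: {self.compile_error}")
            self._warned = True
        return self._fn(batch)
```

Either way the caller gets the same result; only the execution path and the status attributes differ.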

When compilation kicks in¶

Stream.apply accepts compile=:

Value	Behavior
`"auto"` (default)	Compile when the full pipeline fuses into one supported plan
`True` / `"force"`	Require compilation; falls back to Python with a warning if unsupported
`False` / `"never"`	Always run Python callables

Scheduling is separate via scheduler=:

Value	Behavior
`"auto"`	Use Rust Rayon batch scheduling when the plan is fully compiled and `cpu > 1`
`"python"`	`ThreadPoolExecutor` over batches
`"rust"`	Require Rust scheduling (falls back with a warning if the plan is not fully compiled)
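
To make the `"python"` row concrete, here is a minimal standard-library sketch of running a transform over batches with a `ThreadPoolExecutor`. It is not Grumpy's scheduler: `run_batches` is a hypothetical helper, and yielding results in input order is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batches(batches, transform, cpu=1):
    """Apply `transform` to each batch, yielding results in input order."""
    if cpu <= 1:
        # Sequential path: no thread pool needed.
        for batch in batches:
            yield transform(batch)
        return
    with ThreadPoolExecutor(max_workers=cpu) as pool:
        # Executor.map returns results in input order,
        # even when batches finish out of order.
        yield from pool.map(transform, batches)


batches = [[1.0, 2.0], [3.0]]
doubled = list(run_batches(batches, lambda b: [v * 2.0 for v in b], cpu=2))
```

On standard CPython builds, threads running pure-Python transforms contend for the GIL, which is one reason a path that releases the GIL can scale better with `cpu > 1`.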

Compilation pays off primarily when multiple ops fuse and cpu > 1 with scheduler="auto" — the homepage compile benchmark chart compares Python vs compiled paths on a protein-like stream.

Union batches support the same scalar elementwise opcodes as list-chains, whether loaded from a stream or held in memory:

u = gr.array([1.0, [2.0, 3.0], 4.0], dtype=gr.float64)
gr.save(u, "u.gr", chunk_size=2)

@gr.compile
def double(batch):
    return batch * 2.0

st = gr.stream("u.gr", batch_size=1)
out = list(st.apply(double, compile=True, scheduler="rust"))

Writing compilable functions¶

Follow these rules so static analysis can build a plan:

Straight-line code only — no if, for, while, try, imports, or nested function definitions.
Single argument, named batch by convention (the stream batch object).
Supported statements — see list below.
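
The first two rules can be checked mechanically. The sketch below uses Python's standard `ast` module to flag violations in a function's source text. It only illustrates the kind of static analysis involved, not Grumpy's analyzer: `check_compilable` and `FORBIDDEN` are hypothetical names, and the check does not verify that each statement belongs to the supported set.

```python
import ast

# Node types ruled out by the "straight-line code only" rule.
FORBIDDEN = (
    ast.If, ast.For, ast.While, ast.Try,
    ast.Import, ast.ImportFrom,
    ast.FunctionDef, ast.AsyncFunctionDef,
)


def check_compilable(source):
    """Return a list of rule violations for one function's source text."""
    tree = ast.parse(source)
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.FunctionDef):
        return ["expected a single function definition"]
    func = tree.body[0]
    problems = []
    args = func.args
    n_params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
    if n_params != 1 or args.vararg or args.kwarg:
        problems.append("function must take exactly one parameter")
    # Walk every node inside the body, including nested ones.
    for stmt in func.body:
        for node in ast.walk(stmt):
            if isinstance(node, FORBIDDEN):
                problems.append(
                    f"line {node.lineno}: {type(node).__name__} is not allowed"
                )
    return problems


ok_source = "def scale(batch):\n    batch = batch * 2.0\n    return batch + 1.0\n"
bad_source = "def f(batch, k):\n    if k:\n        return batch\n    return batch * 2\n"
```

Calling `check_compilable(ok_source)` returns an empty list, while `bad_source` is reported for both its second parameter and its `if` statement.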

Supported constructs (MVP)¶

batch <op> scalar for +, -, *, /, % (list-chain and union layouts)
batch.sum(), batch.mean(), batch.min(), batch.max(), batch.ptp() with optional dim=
gr.neighbors(batch, batch, k=..., dim=..., loop=...)
Dataframe dot assignments, e.g.
batch.residue.center = batch.residue.coords.mean(dim=-1)

Chaining fuses into one plan:

@gr.compile
def normalize_and_pool(batch):
    batch = batch * 0.01
    batch = batch + 1.0
    return batch.mean(dim=1)
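
To illustrate what fusing means, the toy below represents a plan as a list of `(opcode, scalar)` pairs and runs it two ways over ragged nested lists: one pass per opcode, and a single pass that applies every opcode to a value before moving on. This is a pure-Python model with hypothetical names (`OPS`, `run_unfused`, `run_fused`). The plan format is an assumption, not the real `CompiledPlan` layout, and the final reduction is omitted.

```python
import operator

# Scalar elementwise opcodes, keyed by hypothetical opcode names.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "truediv": operator.truediv,
    "mod": operator.mod,
}


def run_unfused(plan, rows):
    """One full pass, and one intermediate copy, per opcode."""
    for name, scalar in plan:
        rows = [[OPS[name](v, scalar) for v in row] for row in rows]
    return rows


def run_fused(plan, rows):
    """Single pass: every opcode is applied to a value before moving on."""
    out = []
    for row in rows:
        new_row = []
        for v in row:
            for name, scalar in plan:
                v = OPS[name](v, scalar)
            new_row.append(v)
        out.append(new_row)
    return out


plan = [("mul", 0.01), ("add", 1.0)]  # mirrors batch * 0.01, then batch + 1.0
rows = [[100.0, 200.0], [300.0]]
same = run_fused(plan, rows) == run_unfused(plan, rows)
```

Both paths apply the same operations in the same order to each value, so the results match; the fused path simply avoids materializing an intermediate batch between steps.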

Multi-function pipelines in one apply call fuse when each step is compilable:

def stage_a(batch):
    return batch * 2.0

def stage_b(batch):
    return batch + 1.0

for out in st.apply([stage_a, stage_b], compile="auto", cpu=4, scheduler="auto"):
    train_step(out)
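
The all-or-nothing condition on multi-function pipelines can be sketched as follows, assuming each stage either has a plan (modeled here as a list of `(opcode, scalar)` pairs) or has none. `fuse_pipeline` is a hypothetical helper for illustration, not part of Grumpy.

```python
def fuse_pipeline(stage_plans):
    """Concatenate per-stage plans, or return None if any stage has no plan."""
    fused = []
    for plan in stage_plans:
        if plan is None:  # one uncompilable stage blocks fusion
            return None
        fused.extend(plan)
    return fused


stage_a_plan = [("mul", 2.0)]  # models stage_a: batch * 2.0
stage_b_plan = [("add", 1.0)]  # models stage_b: batch + 1.0

fused = fuse_pipeline([stage_a_plan, stage_b_plan])
blocked = fuse_pipeline([stage_a_plan, None])
```

In this model `fused` holds both opcodes in order, while `blocked` is `None`, which corresponds to running the Python callables one after another.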

Unsupported (falls back to Python)¶

Control flow and exception handling
Arbitrary method calls outside the supported set
Multiple parameters or closures capturing external state
Ops not yet implemented for the batch layout (see Developer for layout notes)

Inspect compilation status on the wrapper:

@gr.compile
def maybe(batch):
    return batch * 2.0

print(maybe.is_compiled, maybe.compile_error)

Next: API Reference — generated documentation for every public function and core type.