The goal for numpy-ts is to be the best possible NumPy implementation for JavaScript and TypeScript. To get there, there's a long list of features, optimizations, and API improvements to build. Here's a high-level roadmap of what's coming next:

Async / Worker offloading

Heavy operations like matmul, svd, fft, and convolve can block the main thread for tens of milliseconds on large inputs. We’re designing an opt-in np.async.* namespace that transparently offloads these to a Web Worker pool:
// Proposed API
const result = await np.async.matmul(A, B);
const { u, s, vt } = await np.async.linalg.svd(largeMatrix);
The worker pool will support two transport paths:
  • SharedArrayBuffer (zero-copy) when COOP/COEP headers are present
  • postMessage (universal fallback) for environments without cross-origin isolation
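The transport decision above can be sketched in a few lines. This is a hypothetical helper, not part of the numpy-ts API: `crossOriginIsolated` is a real browser global that is `true` only when the COOP/COEP headers are set, which is also the precondition for `SharedArrayBuffer` being usable across workers.

```typescript
// Sketch of transport selection for a hypothetical worker pool.
// `pickTransport` is an illustrative name, not a numpy-ts export.
type Transport = "shared-array-buffer" | "post-message";

function pickTransport(): Transport {
  const g = globalThis as any;
  // True only under cross-origin isolation (COOP/COEP headers present).
  const isolated = g.crossOriginIsolated === true;
  const hasSAB = typeof g.SharedArrayBuffer !== "undefined";
  return isolated && hasSAB ? "shared-array-buffer" : "post-message";
}
```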

Multi-threaded WASM

Extend the WASM acceleration layer to use multiple threads via WebAssembly.Memory with shared: true and Web Workers. This would allow large matrix operations to be parallelized across CPU cores without leaving the WASM execution context.
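The building block here is standard: a `WebAssembly.Memory` created with `shared: true` is backed by a `SharedArrayBuffer`, so a module instantiated against it in several workers sees the same bytes. A minimal sketch (the page counts are illustrative; a WASM page is 64 KiB):

```typescript
// Shared linear memory: initial 256 pages = 16 MiB, growable to
// 4096 pages = 256 MiB. `maximum` is required when `shared: true`.
const sharedMemory = new WebAssembly.Memory({
  initial: 256,
  maximum: 4096,
  shared: true,
});

// The backing buffer is a SharedArrayBuffer; typed-array views created
// in different workers alias the same bytes, so Atomics can coordinate.
const f64View = new Float64Array(sharedMemory.buffer);
```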

Ufunc framework

A generalized ufunc (universal function) system that would allow users to define custom element-wise and reduction operations that automatically get broadcasting, dtype promotion, and axis handling:
// Proposed API
const clampedAdd = np.ufunc((a, b) => Math.min(a + b, 255), 2);
clampedAdd(imageA, imageB); // broadcasts, handles dtypes, etc.
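The "broadcasts" part follows NumPy's standard rule: align shapes from the right, and each dimension pair must be equal or contain a 1. A plain-TS sketch of that shape resolution (illustrative helper, not the proposed ufunc API):

```typescript
// NumPy's broadcasting rule, which a ufunc framework would apply
// before looping: missing leading dimensions are treated as 1, and
// size-1 dimensions stretch to match the other operand.
function broadcastShapes(a: number[], b: number[]): number[] {
  const out: number[] = [];
  for (let i = 1; i <= Math.max(a.length, b.length); i++) {
    const da = a[a.length - i] ?? 1;
    const db = b[b.length - i] ?? 1;
    if (da !== db && da !== 1 && db !== 1) {
      throw new Error(`shapes (${a}) and (${b}) are not broadcastable`);
    }
    out.unshift(Math.max(da, db));
  }
  return out;
}
```

For example, adding a per-channel offset of shape `[3]` to an image of shape `[256, 256, 3]` resolves to an output shape of `[256, 256, 3]`.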

Masked arrays

Support for arrays with a boolean mask that marks invalid or missing entries. Operations would automatically skip masked elements, similar to NumPy’s numpy.ma module:
// Proposed API
const a = np.ma.array([1, 2, 3, 4], { mask: [false, false, true, false] });
np.mean(a); // 2.333... (skips index 2)
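The semantics of that masked `mean` are easy to pin down in plain TypeScript (illustrative only, not the proposed API): masked entries are excluded from both the sum and the count.

```typescript
// Masked reduction: mask[i] === true marks element i as invalid, so
// it contributes to neither the numerator nor the denominator.
function maskedMean(data: number[], mask: boolean[]): number {
  let sum = 0;
  let count = 0;
  for (let i = 0; i < data.length; i++) {
    if (!mask[i]) {
      sum += data[i];
      count += 1;
    }
  }
  return sum / count;
}
```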

Structured arrays / record arrays

Arrays with named, heterogeneous fields — useful for tabular data without pulling in a full DataFrame library:
// Proposed API
const dt = np.dtype([['name', 'U10'], ['age', 'int32'], ['score', 'float64']]);
const records = np.zeros(100, dt);
This is a significant undertaking and may be scoped to a subset of NumPy’s structured array features.
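To make the memory model concrete, here is a sketch of how a structured dtype could map fields to byte offsets. The sizes follow NumPy's conventions ('U10' is 10 UCS-4 code points, i.e. 40 bytes; 'int32' is 4 bytes; 'float64' is 8 bytes), and a packed layout with no alignment padding is assumed for simplicity; none of these helper names are part of the proposed API.

```typescript
// Per-kind byte sizes, following NumPy's conventions.
const fieldSizes: Record<string, number> = {
  U10: 40, // 10 characters x 4 bytes (UCS-4)
  int32: 4,
  float64: 8,
};

// Compute each field's byte offset within a record, plus the total
// record size (itemsize), assuming a packed (unaligned) layout.
function layout(fields: [string, string][]) {
  const offsets: Record<string, number> = {};
  let itemsize = 0;
  for (const [name, kind] of fields) {
    offsets[name] = itemsize;
    itemsize += fieldSizes[kind];
  }
  return { offsets, itemsize };
}
```

For the dtype above, this gives offsets `name: 0`, `age: 40`, `score: 44` and an itemsize of 52 bytes per record.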

Graph-based chaining / fused kernels

v1.3.0 introduced WASM-backed array storage so that data lives in WASM memory between kernel calls — eliminating per-kernel copy-in/copy-out for the common path. The next step is kernel fusion: collapsing chained operations like a.add(b).multiply(c) into a single kernel pass to eliminate intermediate writes to memory entirely.
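What fusion buys is easiest to see side by side. The unfused chain materializes an intermediate array for `a + b` and makes two passes over memory; the fused kernel computes `(a[i] + b[i]) * c[i]` in a single pass with no intermediate writes:

```typescript
// Unfused: two loops, one intermediate allocation.
function addThenMultiply(a: Float64Array, b: Float64Array, c: Float64Array) {
  const tmp = new Float64Array(a.length); // intermediate for a + b
  for (let i = 0; i < a.length; i++) tmp[i] = a[i] + b[i];
  const out = new Float64Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = tmp[i] * c[i];
  return out;
}

// Fused: one loop, no intermediate.
function fusedAddMultiply(a: Float64Array, b: Float64Array, c: Float64Array) {
  const out = new Float64Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = (a[i] + b[i]) * c[i];
  return out;
}
```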

Strided ops in WASM

A handful of operations (notably non-contiguous strided variants) still fall back to TypeScript when the input layout precludes the fast WASM path. The plan is to extend the WASM kernels to handle strided inputs directly so we can drop the JS fallback entirely.
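The core of strided addressing is small: each element's flat offset is the dot product of its multi-index with the per-dimension strides (expressed in elements here, not bytes). A minimal sketch, with an illustrative helper name:

```typescript
// Flat buffer offset for a multi-dimensional index under given strides.
function flatOffset(index: number[], strides: number[], base = 0): number {
  let off = base;
  for (let d = 0; d < index.length; d++) off += index[d] * strides[d];
  return off;
}

// Example: a C-contiguous 3x4 array has strides [4, 1]; its transpose
// is a view over the same buffer with strides [1, 4] -- no copy needed,
// but the access pattern is no longer contiguous.
```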

Complete fancy indexing

vindex (added in v1.3.0) covers the bulk of NumPy’s integer array indexing. Remaining work: full parity with NumPy’s combined basic + advanced indexing semantics, including in-place assignment via vindex and broadcasting of mixed integer/slice/boolean indexers.
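For readers unfamiliar with the semantics being targeted: integer array indexing broadcasts the index arrays together, and each output element gathers one input element. A plain-TS sketch for the simple 2-D case with equal-length row and column index arrays (illustrative helper, not the vindex implementation):

```typescript
// Gather data[rows[k]][cols[k]] for each k -- the core of integer
// array indexing once the index arrays have been broadcast.
function gather2d(data: number[][], rows: number[], cols: number[]): number[] {
  return rows.map((r, k) => data[r][cols[k]]);
}
```

So `gather2d([[0, 1], [2, 3]], [0, 1], [1, 0])` picks the elements at `(0,1)` and `(1,0)`, yielding `[1, 2]`.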

WASM memory64 build

Currently the WASM linear memory is 32-bit, capping the pool at 4 GiB (with a 256 MiB default). A memory64 build option would lift this ceiling on supporting runtimes, enabling much larger arrays for scientific workloads. The 32-bit build will remain the default for portability.

Codebase modularization

Split the monolithic core into smaller, independently versionable modules — including the .zig source. This will make the library easier to contribute to, easier to subset for ultra-light deployments, and lets us iterate on individual modules (e.g. linalg, fft) without rebuilding the world.
This roadmap reflects current thinking, not commitments. Items may be reprioritized, combined, or dropped based on what the community actually needs. The best way to influence the roadmap is to open an issue with your use case.