Ray crashes on Raspberry Pi 5 (64-bit) due to unsupported jemalloc page size (64K) — unhandled runtime failure on ARM64


1. Severity of the issue:
High: Completely blocks me.


2. Environment:

  • Ray version: 2.44.1 (tested with both wheel and from-source build)
  • Python version: 3.9.2 and 3.11.2 (venv)
  • OS: Raspberry Pi OS 64-bit (based on Debian 12 Bookworm)
  • Cloud/Infrastructure: 8-node bare-metal Raspberry Pi 5 cluster (PoE-powered, headless setup)
  • Other libs/tools: jemalloc (bundled), Ray Serve, Ray Dashboard, built with Bazel on ARM64

3. What happened vs. what you expected:

  • Expected:
    Successful head node startup, accessible dashboard, and working Ray cluster across multiple ARM64 devices.
  • Actual:
    GCS server crashes on startup with repeated
    <jemalloc>: Unsupported system page size
    errors.
    No output in gcs_server.out, and gcs_server.err terminates with:
    terminate called without an active exception.
    The crash happens early in startup and blocks any use of Ray on ARM64 systems with a 64K page size.
    No guidance or fallbacks are documented for this scenario.
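
For anyone triaging a similar failure, the following is a minimal sketch of a log check that counts the jemalloc message quoted above. The log path is an assumption (Ray's usual default session directory on Linux, which may differ if a custom temp dir is configured), and the helper names `count_page_size_errors` and `diagnose` are hypothetical, not part of Ray.

```python
import re
from pathlib import Path

# ASSUMPTION: default Ray session log directory on Linux; adjust if Ray
# was started with a custom temp dir.
DEFAULT_GCS_ERR = Path("/tmp/ray/session_latest/logs/gcs_server.err")

# The exact message reported in this issue.
JEMALLOC_PAGE_ERROR = re.compile(r"<jemalloc>: Unsupported system page size")


def count_page_size_errors(log_text: str) -> int:
    """Return how many times the jemalloc page-size error appears in the text."""
    return len(JEMALLOC_PAGE_ERROR.findall(log_text))


def diagnose(log_path: Path = DEFAULT_GCS_ERR) -> str:
    """Summarize whether the given log file contains the page-size error."""
    if not log_path.exists():
        return f"{log_path} not found; is a Ray session present on this node?"
    hits = count_page_size_errors(log_path.read_text(errors="replace"))
    if hits:
        return f"{log_path}: {hits} jemalloc page-size error(s) found"
    return f"{log_path}: no jemalloc page-size errors found"


if __name__ == "__main__":
    print(diagnose())
```

Running this on the head node right after a failed `ray start --head` should make it clear whether the crash matches this report or has a different cause.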

Additional context:

  • Raspberry Pi 5 uses a 64K memory page size by default on its 64-bit OS variant.
  • jemalloc appears not to support 64K pages, so the GCS server crashes during startup without an actionable error message.
  • The limitation is not mentioned in the official documentation, and Ray fails without error handling or suggestions.
  • Downgrading the OS or recompiling the kernel with 4K pages is not feasible in this setup.
  • It would be very helpful to at least fail gracefully or warn during installation/startup that this configuration is unsupported.
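
To confirm the page size a given node actually reports, a short check like the one below may help. It is only a sketch: `describe_page_size` is a hypothetical helper, and `os.sysconf("SC_PAGE_SIZE")` is the standard POSIX query exposed by Python on Linux.

```python
import os
import platform


def describe_page_size(page_size_bytes: int) -> str:
    """Format a page size in bytes together with its KiB equivalent."""
    return f"{page_size_bytes} bytes ({page_size_bytes // 1024} KiB)"


if __name__ == "__main__":
    # Ask the running kernel for its memory page size.
    page_size = os.sysconf("SC_PAGE_SIZE")
    print(f"{platform.machine()}: page size {describe_page_size(page_size)}")
```

Running it on every node of the cluster shows whether all of them use the same page size, which is worth knowing before comparing results across machines.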

While attempting to deploy Ray on a Raspberry Pi 5 cluster (ARM64, 8 nodes, Debian Bookworm), I ran into a critical runtime issue that ultimately rendered the entire setup unusable. The underlying problem stems from Ray’s reliance on jemalloc, which does not support 64K memory page sizes, a configuration used by default on the 64-bit Raspberry Pi OS kernel (6.1.0-rpi5). This results in a hard crash of the GCS server shortly after starting the Ray head node, with error logs containing repeated messages like <jemalloc>: Unsupported system page size, followed by terminate called without an active exception. There is no output in the .out logs, making the issue particularly cryptic at first.

The problem is not caught during installation or even startup — Ray installs successfully (both from source and via prebuilt wheels), and even accepts ray start --head, but the crash happens shortly thereafter. There is no built-in check or warning for this incompatibility, which can lead to hours of wasted debugging time.

Building Ray from source on ARM64 added further complexity: it required Bazel (with no native support or guidance), the pyproject.toml was missing required metadata like version, and the editable install via pip install -e . failed without proper Bazel configuration. Even after resolving the build process and installing the package, the runtime crash persisted due to the jemalloc page size issue.

I initially chose Ray expecting a Python-native, lightweight distributed computing framework. Coming from a K3s-based setup, I was hoping for a simpler architecture that didn’t require Kubernetes. However, the deep dependency on native components like jemalloc, the lack of documentation about system requirements (like supported page sizes), and the absence of meaningful error handling for such issues made this journey unreasonably frustrating.

In the end, I had to abandon the project after investing significant time into building, configuring, and testing. For future improvement, I suggest that Ray explicitly detect unsupported page sizes during installation or startup, provide clear documentation on jemalloc limitations, and offer guidance on workarounds or build options for ARM64/embedded systems. A lightweight, ARM-compatible build profile would be extremely helpful for those of us working in edge or IoT environments.
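
As an illustration of the kind of startup check suggested here, a preflight test could look roughly like the sketch below. It is not Ray code: `page_size_problem` and `preflight` are hypothetical names, and the 4 KiB limit is an assumption. jemalloc fixes its page size at build time (its configure script has a `--with-lg-page` option), so the real limit depends on how the bundled library was built.

```python
import os
from typing import Optional

# ASSUMPTION: the bundled jemalloc was built for 4 KiB pages. The real
# value depends on the build configuration of the shipped binary.
ASSUMED_MAX_PAGE_SIZE = 4096


def page_size_problem(
    page_size: int, max_supported: int = ASSUMED_MAX_PAGE_SIZE
) -> Optional[str]:
    """Return a human-readable error if the page size exceeds the limit, else None."""
    if page_size <= max_supported:
        return None
    return (
        f"System page size is {page_size} bytes, but the bundled allocator "
        f"is assumed to support at most {max_supported} bytes. "
        "Startup is likely to fail with "
        "'<jemalloc>: Unsupported system page size'."
    )


def preflight() -> None:
    """Abort early with a clear message instead of crashing later."""
    problem = page_size_problem(os.sysconf("SC_PAGE_SIZE"))
    if problem is not None:
        raise SystemExit(problem)


if __name__ == "__main__":
    preflight()
    print("Page size check passed.")
```

Keeping the comparison in a pure function makes the limit easy to change once the actual build-time value is known, and a check of this kind would turn the cryptic crash into a single readable message.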