ARM64 Support/CI integration

I believe it makes sense to have ARM64 CI support/nightly wheels. I’ve gone through and cross-compiled all the dependencies to produce a functional version of ray, so it is doable, but this was a one-off effort and is not sustainable long term.

I think it makes sense to run ray on ARM64 devices for two reasons:

  1. AWS Graviton instances are very cost-efficient for learning workloads, but they are ARM64. I think as time goes on we will see more ARM64 (e.g. the new Apple Silicon MacBooks).
  2. RLlib for robotics: most SoCs/embedded devices are ARM64 rather than x86-64. This is a major blocker for applying RLlib to real-world problems.

I’m willing to help walk you through my build process if this is something you are interested in supporting.


Hmm, I think this is viable to support on our CI.

Do you have the arm64 build process documented somewhere?

Sure, I’ve just shared my code at GitHub - smorad/pi_xcompile: Set up x86 cross compiling for ARM64 targets

I tested this on Ubuntu 20.04 x86-64. From there, it’s mostly building Python packages from source with python3 setup.py bdist_wheel.
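As an aside, you can see in advance which platform/ABI tag a wheel built on your machine will carry, since it is derived from the running interpreter. This is a stdlib-only sketch, assuming CPython 3.8+ (where the ABI tag matches the interpreter tag):

```python
import sys
import sysconfig

# Wheel filenames embed interpreter, ABI, and platform tags derived from
# the running Python. On CPython 3.8 on arm64 Ubuntu this yields
# something like "cp38-cp38-linux_aarch64"; the platform part differs
# per machine, so treat the output as informational.
ver = f"{sys.version_info.major}{sys.version_info.minor}"
plat = sysconfig.get_platform().replace("-", "_").replace(".", "_")
tag = f"cp{ver}-cp{ver}-{plat}"
print(tag)
```

If this prints an x86 tag, the resulting wheel will not install on the ARM64 target, which is exactly the situation the cross-compile setup works around.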

Our Raspberry Pis would OOM during compilation, which is why I set up the cross-compile framework.
If you have an ARM machine with enough memory, you should be able to pip install most ray dependencies (except py-spy, which requires rust/cargo to build). I installed torch from source, but it seems that as of torch 1.8 they support ARM!
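Before starting a long pip install, it can be worth confirming that the interpreter itself reports a 64-bit ARM architecture. A quick stdlib-only sketch (nothing here is ray-specific):

```python
import platform

# pip selects wheels based on the interpreter's architecture, so a
# 32-bit Python running on a 64-bit ARM kernel would quietly fetch
# (or build) armv7l artifacts instead of aarch64 ones.
machine = platform.machine()
print(machine)
if machine not in ("aarch64", "arm64"):
    print(f"warning: interpreter reports {machine}, not a 64-bit ARM target")
```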

I also have all the dependencies (along with an older version of ray itself) compiled here: GitHub - smorad/arm64-popular-wheels: Popular python 3.8 packages (Ubuntu 20.04) built for ARM64/AArch64 (Raspberry Pi) in case you run into issues or feel lazy.

I just ran this on my Jetson from a fresh OS install (Ubuntu 20.04):

# required by pyarrow
sudo apt-key add KEYS
DISTRO=$(lsb_release --codename --short)
add-apt-repository "deb [arch=arm64] $DISTRO main"
sudo apt install python3-pip build-essential curl unzip psmisc liblapack-dev libblas-dev llvm libarrow-dev libarrow-python-dev libhdf5-dev
pip3 install cython pytest torch torchvision
git clone https://github.com/ray-project/ray.git
# Install bazel: 4.0 has a bug with protobuf, so use 3.7
# (download the bazel-3.7.0-linux-arm64 release binary first)
chmod +x ./bazel-3.7.0-linux-arm64
# Make sure bazel works
./bazel-3.7.0-linux-arm64 --version
# Move it as ray python build expects it here
mkdir -p ~/.bazel/bin
mv bazel-3.7.0-linux-arm64 ~/.bazel/bin/bazel
# dm-tree needs this in path
sudo ln -s ~/.bazel/bin/bazel /usr/local/bin/bazel
# ray ui
sudo apt install npm
pushd ray/dashboard/client
npm install
npm run build
popd
# rllib
cd ray/python
# Wheel builds successfully, can stop here if wheel is all you want
python3 setup.py bdist_wheel
# Now let's install the wheel (and deps) to our current machine
# Tensorflow and opencv will fail due to some dumb issues. No problem
# since torch works fine. Seems like building tf from source on arm64 is supported and no big deal
# We'll install a newer opencv version which works fine
pip3 install dist/ray-2.0.0.dev0-cp38-cp38-linux_aarch64.whl
cat requirements.txt requirements_rllib.txt | grep -v opencv | grep -v tensorflow | grep -v bazel | grep -v scikit-learn | grep -v reclaim | pip3 install -r /dev/stdin
pip3 install opencv-python-headless scikit-learn lz4

Then test using

>>> from ray.rllib.agents.ppo import PPOTrainer
>>> from ray import tune
>>> tune.run(PPOTrainer, config={"env": "CartPole-v0", "framework": "torch"})

OK. So I think this is great, and I’d be happy to advocate for this to be built in our CI.

I’ll send you a DM to discuss more details.

There was already an aarch64 build available as a Christmas gift. It seemed to work OK, but I’m just a Ray beginner, so don’t take my word too seriously.

It would be really nice to also have regular aarch64 builds that are compatible with the current Python from miniforge (Releases · conda-forge/miniforge · GitHub), and probably also with Ubuntu 20.04 LTS.
For playing with Ray and learning on an SBC cluster, it’s fast enough…

Thank you for the great work with Ray!

@arayaday the bash script above should build ray from source on Ubuntu 20.04 aarch64. Run it in a conda env and it will build for your specific conda Python. I’m still waiting for all the deps to install before I can run rllib --run DQN ... to verify.

ok, thank you, will try on the weekend - but I guess it will take more than one try, and compiles on an SBC take extra time :wink:

I had 8GB of RAM and was down to ~400MB during the build, so I wouldn’t try it on anything with less than 8GB of memory. I think this is a good reason to set up CI, so the build can run on an AWS Graviton instance with 64GB of RAM.
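A quick pre-flight check can save an hour-long build that only OOMs near the end. This is a Linux-specific sketch (the os.sysconf names used here are not available on every platform, and the 8 GiB threshold is just the figure from my run):

```python
import os

# Total physical memory = page size * number of physical pages.
# The from-source ray build peaked near 8 GiB in my run, so warn
# below that rather than OOMing partway through compilation.
page_size = os.sysconf("SC_PAGE_SIZE")
phys_pages = os.sysconf("SC_PHYS_PAGES")
total_gb = page_size * phys_pages / 1024**3
print(f"total RAM: {total_gb:.1f} GiB")
if total_gb < 8:
    print("warning: ray builds have OOMed on machines with less than 8 GiB")
```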

oh… only 4GB here, I’m out… :-/