[Java] GLIBC issue during Ray.init() call

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

I am experimenting with the Java bindings for Ray. In my application, I am building with Gradle with the following Ray dependencies:

    "ray-api": "io.ray:ray-api:2.0.0",
    "ray-runtime": "io.ray:ray-runtime:2.0.0",
    "ray": "pypi:ray:2.2.0"

I have a simple Java test that consists only of the Ray.init() call. When I run this test, I get the following GLIBC error:

java.lang.UnsatisfiedLinkError: Unable to load library '/tmp/ray/1682382173023/libcore_worker_library_java.so':
    /lib64/libc.so.6: version `GLIBC_2.25' not found (required by /tmp/ray/1682382173023/libcore_worker_library_java.so)
    /lib64/libc.so.6: version `GLIBC_2.25' not found (required by /tmp/ray/1682382173023/libcore_worker_library_java.so)
    Native library (tmp/ray/1682382173023/libcore_worker_library_java.so) not found in resource path (/home/ysu/.gradle/caches/6.9.2/workerMain/gradle-worker.jar:......)
    at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:301)
    at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:461)
    at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:403)
    at io.ray.runtime.util.JniUtils.loadLibrary(JniUtils.java:72)
    at io.ray.runtime.RayNativeRuntime.start(RayNativeRuntime.java:80)
    at io.ray.runtime.DefaultRayRuntimeFactory.createRayRuntime(DefaultRayRuntimeFactory.java:35)
    at io.ray.api.Ray.init(Ray.java:32)
    at io.ray.api.Ray.init(Ray.java:19)
    at com.linkedin.rayoffspringexperiment.factories.RayActorTest.testRayActor(RayActorTest.java:18)

Apparently, Ray tries to load GLIBC 2.25, which is not available on my host. As upgrading GLIBC could be disastrous to the OS, I am trying to find a workaround. Since jemalloc could be used as a replacement for GLIBC, I am trying to make Ray pick up jemalloc when loading the library.
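
For anyone reading a similar trace: the linker message itself names the symbol version that the native library needs. Below is a minimal Java sketch, not part of Ray, that pulls the highest GLIBC_x.y tag out of such a message. The class name GlibcRequirement and the method highestRequired are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Ray or JNA API): finds the GLIBC symbol version
// that a dynamic-linker error message complains about.
final class GlibcRequirement {
    // Matches tags such as GLIBC_2.25; an optional third component (GLIBC_2.2.5) is ignored.
    private static final Pattern GLIBC_TAG =
            Pattern.compile("GLIBC_(\\d+)\\.(\\d+)(?:\\.\\d+)?");

    private GlibcRequirement() {}

    /** Returns the highest version found as {major, minor}, or null if the message names none. */
    static int[] highestRequired(String linkerMessage) {
        int[] best = null;
        Matcher m = GLIBC_TAG.matcher(linkerMessage);
        while (m.find()) {
            int[] v = { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
            if (best == null || v[0] > best[0] || (v[0] == best[0] && v[1] > best[1])) {
                best = v;
            }
        }
        return best;
    }
}
```

Passing the message text of the UnsatisfiedLinkError above to GlibcRequirement.highestRequired(...) should give {2, 25}, which can then be compared with the glibc version installed on the host.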

I found that simply setting LD_PRELOAD doesn’t do the trick. So I am wondering if anybody could advise how to make Ray pick up jemalloc, instead of GLIBC? Or in general, what is the correct way to pass an env var to Ray in Java?

Update: the GLIBC issue turned out to be caused by the prebuilt libcore_worker_library_java.so file bundled in the ray-runtime jar. This .so file requires specific library versions, including GLIBC 2.25, that are not available on our hosts. To solve this issue, we rebuilt Ray on our own host so that the new libcore_worker_library_java.so is compatible with our environment.

However, immediately after the GLIBC error was resolved, we started to see a new error thrown two lines later (i.e., the GLIBC error is thrown at line 72 of JniUtils.java, and this NoSuchFieldError is thrown at line 74):

java.lang.NoSuchFieldError: isAsync
        at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method)
        at java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2442)
        at java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2498)
        at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2694)
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2627)
        at java.base/java.lang.Runtime.load0(Runtime.java:768)
        at java.base/java.lang.System.load(System.java:1837)
        at io.ray.runtime.util.JniUtils.loadLibrary(JniUtils.java:74)
        at io.ray.runtime.RayNativeRuntime.start(RayNativeRuntime.java:80)
        at io.ray.runtime.DefaultRayRuntimeFactory.createRayRuntime(DefaultRayRuntimeFactory.java:37)
        at io.ray.api.Ray.init(Ray.java:32)
        at io.ray.api.Ray.init(Ray.java:19)

As we keep running into such incompatibility errors, I am wondering whether we are missing something fundamental in setting up Ray Java. Any advice would be appreciated. Thanks.

@GuyangSong can you take a look at this?

Update again: the NoSuchFieldError went away after we rebuilt ray-api-2.4.0.jar and ray-runtime-2.4.0.jar on our host and made our application depend on these internally built libs. This resolves the issue for now, but I am wondering whether the Ray team has a plan to add a CentOS 7 build?

Hi @ysu

Our build is based on manylinux2014, which is based on CentOS 7. So it’s actually a CentOS 7 build. Is your CentOS 7 version too old?

Does the Ray Python example code work on your machine? I just want to know whether the ray package is compatible with your environment.

Hi @yic, thanks for your response. We are on CentOS 7.9.2009, which was released in Nov 2020, so it is not too old.

Hi @GuyangSong, yes, the Ray Python example code (i.e. the example from this Ray Core Quick Start page) works with ray 2.2.0 on my machine.

It seems there is an issue with how libcore_worker_library_java.so is compiled. I will look into it.

@GuyangSong Yes, the libcore_worker_library_java.so included in the released ray-runtime jar is not compatible with CentOS 7.9. I would appreciate it if you could take a look. Thanks.

Hi @GuyangSong, wondering if there is any update on this?

In case you’d like to know how we build ray from source, here are the steps that we take:

  1. Create and activate the Python environment
$ /export/apps/python/3.9.5/bin/python3 -m venv ray-venv
$ source ray-venv/bin/activate
  2. Install devtoolset-10 for CentOS and set devtoolset-10 gcc as the default Bazel gcc
$ sudo yum install devtoolset-10
$ export CC=/opt/rh/devtoolset-10/root/usr/bin/gcc
  3. Install Bazel(isk) and use version 5.4.0
$ wget https://github.com/bazelbuild/bazelisk/releases/download/v1.16.0/bazelisk-linux-amd64
$ chmod +x bazelisk-linux-amd64
$ cp bazelisk-linux-amd64 /usr/local/linkedin/bin
$ export USE_BAZEL_VERSION=5.4.0
$ bazel version
  4. Clone the Ray source code from GitHub and build it
$ git clone https://github.com/ray-project/ray.git --branch ray-2.4.0 --single-branch
$ RAY_INSTALL_JAVA=1 pip install -e . --verbose
$ ./build-jar-multiplatform.sh build_jars_linux

@ysu I had forwarded this issue to my teammate. We will update the progress in a few days. Thanks for your feedback!

@ysu let me answer these questions for you
Q1: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by /tmp/ray/1682382173023/libcore_worker_library_java.so)
A1: Check the GLIBC version on your machine with “ldd --version”, then compile “libcore_worker_library_java.so” yourself without specifying a GLIBC version. I guess the error is caused by the compilation environment, where GLIBC is 2.25.
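
As a rough illustration of that check, here is a minimal Java sketch that compares the host glibc version with the version the library requires. It is not a Ray API: the class GlibcCompat and its methods are hypothetical, and it assumes the version is the last token on the first line of the `ldd --version` output (for example "ldd (GNU libc) 2.17", an illustrative line rather than output captured from this thread).

```java
// Hypothetical helper (not part of Ray): decides whether the host glibc is
// new enough for a native library linked against a given GLIBC version.
final class GlibcCompat {
    private GlibcCompat() {}

    /**
     * Parses {major, minor} from the first line of `ldd --version`.
     * Assumes the line ends with a dotted version such as "2.17".
     */
    static int[] parseLddVersion(String firstLine) {
        String[] tokens = firstLine.trim().split("\\s+");
        String[] parts = tokens[tokens.length - 1].split("\\.");
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    /** True when the host version is greater than or equal to the required version. */
    static boolean hostSatisfies(int[] host, int[] required) {
        if (host[0] != required[0]) {
            return host[0] > required[0];
        }
        return host[1] >= required[1];
    }
}
```

With a host that reports 2.17 and a library that requires 2.25, hostSatisfies returns false, which corresponds to the "version not found" failure in Q1.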

Q2: java.lang.NoSuchFieldError: isAsync
A2: You build your project with the following Ray dependencies:
“ray-api”: “io.ray:ray-api:2.0.0”,
“ray-runtime”: “io.ray:ray-runtime:2.0.0”,
“ray”: “pypi:ray:2.2.0”
The isAsync field was added to Ray by [Ray][xlang]Setting async flag for Python actor actor in Java by XiaodongLv · Pull Request #28149 · ray-project/ray · GitHub and released in Ray 2.1.0, so the 2.0.0 Java jars do not contain it and are incompatible with Ray 2.1.0 and later.
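
A mismatch like this can be caught before calling Ray.init(). The following is a minimal Java sketch under stated assumptions: RayVersionCheck is a hypothetical class, not part of Ray, and the version strings are assumed to be plain dotted numbers such as "2.0.0".

```java
// Hypothetical helper (not a Ray API): checks that the Ray components in a
// build all report the same dotted numeric version.
final class RayVersionCheck {
    private RayVersionCheck() {}

    /** Compares versions such as "2.0.0" and "2.1.0"; missing components count as 0. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** True when every given version equals the first one. */
    static boolean allMatch(String... versions) {
        for (String v : versions) {
            if (compare(v, versions[0]) != 0) {
                return false;
            }
        }
        return true;
    }
}
```

For the dependency set above, allMatch("2.0.0", "2.0.0", "2.2.0") returns false, and compare("2.0.0", "2.1.0") is negative, meaning the jars predate the release that introduced isAsync.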

Since “libcore_worker_library_java.so” is built without specifying a GLIBC version, I guess that the GLIBC version of the machine used to compile Ray is 2.25. You can check the GLIBC version on your machine with the command “ldd --version”. As you have seen, rebuilding Ray on your own host solves the problem.

@XiaodongLv Thanks a lot for this info. So just to confirm: if I update “ray-api” and “ray-runtime” to any released version >= 2.1.0, the libcore_worker_library_java.so would be compatible with our machine, right?

@XiaodongLv I tried to use the publicly released ray-api and ray-runtime 2.4.0, but unfortunately, the GLIBC issue occurs again. So the libcore_worker_library_java.so in the new version of ray-runtime is still not compatible with CentOS 7.9. Could you help take a look at how that .so file is compiled? Thanks.

I think the root cause is that we build the jars in the CI environment, not in the manylinux2014 image: ray/pipeline.build.yml at master · ray-project/ray · GitHub

We release the jars following these steps: ray/java-release-guide.md at master · ray-project/ray · GitHub.

@XiaodongLv Let’s move the compiling to the manylinux2014 image!

I agree with @GuyangSong. The glibc version of the machine where Ray is released is 2.25, which is why building Ray on your own machine succeeds.

@GuyangSong @XiaodongLv Thank you both for your response. So when can we expect a new release that is compiled in the manylinux2014 image?