A key disadvantage of Python is the Global Interpreter Lock and its implications for threading. For example, the GIL ensures that only one thread executes Python bytecode at a time, which, in many senses, makes threading pointless for CPU-bound work.
Java implements true threading in the JVM, meaning that a comm thread can sit in a wait state while other threads continue unabated.
I’m curious about the Java implementation of Ray. Does it support full Java threading?
I think I’ve detected some footprints of Python in the Java implementation. Is that an error on my part? If not, where does it show up?
A white paper on the Java implementation would be very useful, describing the spin-up of JVMs on remote nodes.
Yup, multiple Ray Java “worker processes” can live within a single JVM (this essentially comes with all the benefits and shortcomings you can imagine). In general, Ray shouldn’t interfere with other code you write (including spawning new threads).
There are definitely traces of Python and Cython in the Java worker, but I think they’re mostly confined to the startup/bootstrapping process, or abstracted behind the core worker.
I’m not sure we have a paper that does a deep dive on Ray and Java (cc @raulchen )
AFAIK, the footprint of Python comes from Java calling the ray CLI to start the background cluster and bootstrap other processes like raylet, gcs, etc. The CLI is implemented in Python. It won’t actually use Python for any compute though.
Also, about threading: Ray’s workers are implemented as a C++ backend plus a language frontend (Python or Java). The C++ backend is called a “core worker”, and each core worker process can have multiple core worker threads (for Python it is 1; for Java it can be more than that).
I suppose, then, that in a distributed environment, each host must still have the same version of Python installed, even when the application code is Java.
Is there a restriction that each host also run the same version of Java? I’m running Java SE 15, but I can’t guarantee that all hosts run Java SE 15.
Let me explain about multi-threading in Java workers first.
Users can definitely spawn JVM threads in Java code. There’s no such restriction in Ray Java.
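To make the threading point concrete, here is a minimal sketch in plain Java (no Ray calls). It parks a "comm" thread in a wait state while a pool of worker threads does CPU-bound work in parallel. The class and method names are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class; nothing here is part of the Ray API.
class ThreadingSketch {
    // Runs `workers` CPU-bound tasks while a "comm" thread stays blocked,
    // and returns the combined sum computed by the workers.
    static long runWorkersWhileCommThreadWaits(int workers, int iterations) {
        CountDownLatch message = new CountDownLatch(1); // stands in for an incoming message
        Thread comm = new Thread(() -> {
            try {
                message.await(); // blocked here; does not hold up the workers
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "comm");
        comm.start();

        AtomicLong total = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {
                long local = 0;
                for (int i = 1; i <= iterations; i++) {
                    local += i; // each worker sums 1..iterations
                }
                total.addAndGet(local);
            });
        }
        pool.shutdown();
        try {
            boolean finished = pool.awaitTermination(30, TimeUnit.SECONDS);
            message.countDown(); // release the comm thread
            comm.join();
            if (!finished) {
                throw new IllegalStateException("workers timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkersWhileCommThreadWaits(4, 1_000_000));
    }
}
```

The workers finish while the comm thread is still waiting, and the JVM is free to schedule them on separate cores.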
We also provide another kind of multi-threading feature in Ray Java (let’s call it multi-worker Java): running multiple workers in a single Java process, which means that you can run multiple tasks or actors in the same process. This is mainly intended to reduce JVM memory overhead and to allow sharing large static variables between actors. However, this feature is now disabled by default due to some lifetime management issues. For example, if you call actor.kill(), the whole process is killed, so all tasks and actors running in that process are aborted.
Also, if you want to spawn your own threads and invoke the Ray Java API from them while the multi-worker Java feature is turned on, you’ll need to use Ray.getAsyncContext() and Ray.setAsyncContext() to pass the Ray thread context between threads. Otherwise Ray will throw an exception when you invoke the Ray Java API from your own threads, because Ray doesn’t know which worker the current thread belongs to.
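The context hand-off described above follows a common capture-and-reinstall pattern. The sketch below imitates it with a plain ThreadLocal so it runs without Ray. The methods getAsyncContext() and setAsyncContext() on this class are hypothetical stand-ins that only mirror the names of the Ray calls, and callApi() stands in for any Ray API call. In real code you would capture with Ray.getAsyncContext() on the parent thread and call Ray.setAsyncContext(...) first thing in the child thread.

```java
// Hypothetical stand-in for the pattern; this is not Ray's implementation.
class AsyncContextSketch {
    // Imitates a per-thread worker context.
    private static final ThreadLocal<Object> CONTEXT = new ThreadLocal<>();

    static Object getAsyncContext() {
        return CONTEXT.get();
    }

    static void setAsyncContext(Object ctx) {
        CONTEXT.set(ctx);
    }

    // Stand-in for an API call that must know which worker owns the thread.
    static String callApi() {
        Object ctx = CONTEXT.get();
        if (ctx == null) {
            throw new IllegalStateException("no worker context on this thread");
        }
        return "called from " + ctx;
    }

    // Calls callApi() on a fresh thread, optionally passing the caller's context along.
    static String callFromNewThread(boolean passContext) {
        Object captured = getAsyncContext(); // capture on the parent thread
        String[] result = new String[1];
        Thread child = new Thread(() -> {
            if (passContext) {
                setAsyncContext(captured); // reinstall on the child thread
            }
            try {
                result[0] = callApi();
            } catch (IllegalStateException e) {
                result[0] = "error: " + e.getMessage();
            }
        });
        child.start();
        try {
            child.join(); // join makes the child's write to result[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        setAsyncContext("worker-1");
        System.out.println(callFromNewThread(true));
        System.out.println(callFromNewThread(false));
    }
}
```

With the context passed, the call on the child thread succeeds. Without it, the stand-in API fails, much like the exception described above.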
I don’t suggest using the multi-worker Java feature unless you know there’s a huge benefit to your application and you understand all the cons. If you still want to try this feature, you can set ray.job.num-java-workers-per-process (not exposed in the docs) to, let’s say, 5 to run 5 workers in a Java process. See “Configuring Ray” in the Ray v1.1.0 docs for how to configure Java driver options.
Now the Java version question. Ray Java requires Java 8 or higher, but we only test it on Java 8 and Java 11. Ideally, you should be able to use different JVM versions on different nodes. AFAIK, there’s no JVM version restriction on node-to-node communication or serialization. But we haven’t tested this either. Please let me know the result if you try it.
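As a rough sketch of what setting that option could look like, the snippet below assumes it is read as a JVM system property like other Java driver options; that is an assumption, not something this thread confirms. The class name and the configure() helper are hypothetical. Ray.init() is deliberately left out so the snippet runs without Ray on the classpath.

```java
// Hypothetical helper; assumes the option is picked up from JVM system properties.
class MultiWorkerConfigSketch {
    static final String KEY = "ray.job.num-java-workers-per-process";

    // Sets the number of workers per Java process; must run before the driver initializes Ray.
    static void configure(int workersPerProcess) {
        if (workersPerProcess < 1) {
            throw new IllegalArgumentException("workersPerProcess must be >= 1");
        }
        System.setProperty(KEY, Integer.toString(workersPerProcess));
    }

    public static void main(String[] args) {
        configure(5);
        // A real driver would call Ray.init() after this point (omitted in this sketch).
        System.out.println(KEY + "=" + System.getProperty(KEY));
    }
}
```

Under the same assumption, passing -Dray.job.num-java-workers-per-process=5 on the java command line would have the same effect.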