All remote tasks are scheduled to one node

Here’s my ray cluster set up and use case:

  1. ray cluster is running on K8s
  2. This cluster has a remote python function called ‘calc’
  3. In a running pod, I run a Java app invoking ‘calc’ 100 times and waiting for their completion.
  4. The code snippet in Java is like below:
 fun rpcCollect(jobIds: List<String>) {
        Ray.init()
        val tasks = batches.map {
            ....
            Ray.task(PyFunction.of("pricer", "calc", String::class.java), data).remote()
        }
        ....
        val results = Ray.get(tasks).map {...) }
        ...
    }
  1. The jvm args for Ray.init is something like this:
    export JAVA_OPTS=’-Dray.address=ray_head:6379 -Dray.job.code-search-path=…’, in which ray_head is the ip of the head node of my cluster.

When I run my java app, what I observed from Ray Dashboard is that all tasks are scheduled to one node, which my java app runs on. However, instead of calling ‘calc’ from Java, if I invoke it from python, tasks are sent to different nodes as expected.

Any suggestions?