RLLib not using worker nodes in Ray Cluster

I created a 4 node Ray Cluster by using ray start on each node. For the head node I specified num_cpus=0 and num_gpus=0. For the other worker nodes I set the num_cpus and num_gpus to the physical CPU and GPU cores on those respective systems.

I verified in the logs that the Ray Cluster started correctly.

$ ray status

Node status

Healthy:
1 node_…
1 node_…
1 node_…
1 node_…
Pending:
(no pending nodes)
Recent failures:
(no failures)

Resources

Usage:
0.0/84.0 CPU
0B/674.93GiB memory
0B/293.25GiB object_store_memory

As a basic test I created a remote task that simply gets the ip address and returns it. This works exactly as I would expect. The remote actors are randomly sent to different worker nodes and gets their IP address. I see an equal distribution of IP addresses across the worker nodes. The head node is ignored because I specified in Ray Start to not use it.

The problem is that when I start using RLLib it will only schedule tasks to run on the head node and not any of the worker nodes. I tested with running Cartpole with PPO and Distributed PPO. It does not matter whether I change the number of rollout_workers, or envs_per_worker. It appears all CPU utilization is being performed on the head node and the worker nodes are ignored.

From reading the docs the rollout worker is a remote actor and should be distributed across all worker nodes. They should not be sent to the head node because I explicitly told ray start that the num_cpus=0. Can someone explain why standard Ray code works as I expect but RLLib is not working the way the documentation says it should work?