So, I am using a cluster.yaml file for cluster creation, and all the servers I am using as nodes are on-premise.
Now, some of the servers have a CUDA GPU, some have an i5 processor, and some have an i7 processor. So I am defining an environment variable for job capacity based on the kind of CPU or GPU a node has.
Now, to define the environment variables at the node level, I have written a shell script on each worker node that starts the worker with the defined variables, and I am calling this script in the worker node setup section of the cluster.yaml file.
Now, one of my requirements is that I want to read, on the head node, those environment variables that are defined on the worker nodes.
So each worker is started with different env vars and you want to know the env vars for each worker from the head node? Like you want to get a map from worker node id to env vars? Is my understanding correct?
@jjyao
Actually, we started out by defining custom resources only, and it was working fine until now. But for some new requirements, custom resources are preventing us from building a generalised solution. With custom resources it's possible, but it increases the complexity of our solution, which we don't want. That's why we came up with the idea of using environment variables.
So, there are two main things which define the number of jobs a node can run: 1) memory (RAM), and 2) the type of CPU/GPU.
Now, based on our async actor implementation, a single detection actor can handle multiple jobs. So, for any node, we will manually test how many instances of the detection actor it can hold in memory and how many jobs per instance it can handle given its computation power. Based on the results of that testing, we will end up with four environment variables, which define #instances & #jobs_per_instance for CPU & GPU.
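For concreteness, a minimal sketch of the capacity calculation, assuming the four variables are named something like the placeholders below (the real names are whatever our worker start script exports):

```python
import os

def node_capacity() -> dict:
    """Read the per-node tuning variables exported by the worker start script.

    The variable names here are placeholders, not the actual names.
    """
    cpu_instances = int(os.environ.get("DET_CPU_INSTANCES", "1"))
    cpu_jobs = int(os.environ.get("DET_CPU_JOBS_PER_INSTANCE", "1"))
    gpu_instances = int(os.environ.get("DET_GPU_INSTANCES", "0"))
    gpu_jobs = int(os.environ.get("DET_GPU_JOBS_PER_INSTANCE", "0"))
    return {
        "cpu_instances": cpu_instances,
        "cpu_jobs_total": cpu_instances * cpu_jobs,
        "gpu_instances": gpu_instances,
        "gpu_jobs_total": gpu_instances * gpu_jobs,
    }
```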
Sorry for the late reply. Let's say you now know how many instances of the detection actor you can run per node; how are you going to launch different numbers of actors on different nodes?
I can find the number of active instances of an actor with the list_actors() state API, where I can group actors by node IP. Then, based on the node's capacity, I can check whether I can run a new instance or not. While creating an actor, I include the info in the actor name itself, e.g. detactor_ip_address_GPU, which helps me count actor instances for CPU and GPU separately. To assign an actor to a specific node, I am using the custom resource "node:ip_address".
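Roughly, the counting and pinning look like this (a sketch of my approach, not final code; the actor name format and the trailing index that keeps named actors unique are my own convention, while `node:<ip>` is the resource Ray creates automatically for each node):

```python
import ray
from collections import Counter
from ray.util.state import list_actors

ray.init(address="auto")

# Count live detection actors per (node ip, device) by parsing the actor name,
# e.g. "detactor_10.0.0.5_GPU_0" -> ("10.0.0.5", "GPU").
counts = Counter()
for actor in list_actors(filters=[("state", "=", "ALIVE")]):
    name = actor.name or ""
    if name.startswith("detactor_"):
        _, ip, device, _idx = name.split("_")
        counts[(ip, device)] += 1

@ray.remote
class DetectionActor:
    async def handle(self, job):
        ...

def launch_on(ip: str, device: str):
    idx = counts[(ip, device)]
    # "node:<ip>" is the custom resource Ray adds automatically for every node,
    # so requesting a tiny amount of it pins the actor to that node.
    return DetectionActor.options(
        name=f"detactor_{ip}_{device}_{idx}",
        resources={f"node:{ip}": 0.001},
    ).remote()
```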
@Jules_Damji I have never used NodeAffinitySchedulingStrategy, so I will have to go through its functionality and see how I can integrate it into our pipeline flow. But even then, I will still need to access the environment variables of the worker nodes.
As @jjyao mentioned, one method is to create a task, assign it to the respective node, and have it return the values of the environment variables. Still, I am looking for an easier way to do this, if possible.
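For anyone else reading: my current understanding from the docs (not yet integrated into our pipeline) is that NodeAffinitySchedulingStrategy would let me pin the detection actor to a node by node ID instead of the `node:<ip>` custom resource, roughly like this (the IP is a placeholder):

```python
import ray
from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy

ray.init(address="auto")

@ray.remote
class DetectionActor:
    async def handle(self, job):
        ...

# Look up the Ray node ID for the target IP; ray.nodes() lists every node
# together with its NodeID and NodeManagerAddress (the node's IP).
target_ip = "10.0.0.5"  # placeholder
node_id = next(
    n["NodeID"] for n in ray.nodes()
    if n["Alive"] and n["NodeManagerAddress"] == target_ip
)

# soft=False means: fail instead of silently falling back to another node.
actor = DetectionActor.options(
    scheduling_strategy=NodeAffinitySchedulingStrategy(node_id=node_id, soft=False),
).remote()
```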
Yes, That’s what I want to know. I have started going through the functionalities of NodeAffinitySchedulingStrategy, it will take time for me to update complete flow of my pipeline.
But, still with this also, it’s not solving my complete requirement which I have mentioned in below answer:
Here, I have mentioned the use case of my environment variables:
Ray doesn’t support getting the environment variables of worker nodes, so you need to do your own thing. One possibility is launching a task on each worker node to collect them.
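Something along these lines (a sketch, using placeholder variable names from earlier in the thread) would build the node-to-env-vars map from the head node by running one small task per worker node:

```python
import os
import ray
from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy

ray.init(address="auto")

# Placeholder names for whatever the worker start scripts actually export.
VARS = [
    "DET_CPU_INSTANCES",
    "DET_CPU_JOBS_PER_INSTANCE",
    "DET_GPU_INSTANCES",
    "DET_GPU_JOBS_PER_INSTANCE",
]

@ray.remote(num_cpus=0)
def read_env_vars():
    # Runs on the target node, so os.environ reflects that node's variables.
    return {v: os.environ.get(v) for v in VARS}

# Launch one task per alive node, pinned with NodeAffinitySchedulingStrategy,
# and gather the results into a map of node IP -> env vars.
refs = {
    node["NodeManagerAddress"]: read_env_vars.options(
        scheduling_strategy=NodeAffinitySchedulingStrategy(
            node_id=node["NodeID"], soft=False
        )
    ).remote()
    for node in ray.nodes()
    if node["Alive"]
}
env_by_node = {ip: ray.get(ref) for ip, ref in refs.items()}
print(env_by_node)
```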