How severely does this issue affect your experience of using Ray?
High: It blocks me from completing my task.
How can I measure the network bandwidth used by Ray? I run Ray on a Slurm cluster. I found the benchmark tests in the GitHub repository, and I saw the IOPS results reported in the paper, but those only cover Plasma and not the other parts of the Ray system.
I want to know how much bandwidth my program uses when it runs on Ray. How can I determine the bandwidth used when Ray schedules actors and workers?
May I ask why this is not tested? Is it meaningless, or is it difficult to design such a test?
As you mentioned, the benchmark tests are included in the project, and I tried them. They work fine, but I can’t see how they relate to bandwidth. The tests show how many tasks are scheduled and how long they take, but not how many bytes are transferred.
If I monitor the bandwidth from outside the program, the result can’t represent Ray’s own bandwidth usage.
I see. Unfortunately, we haven’t set up a proper network bandwidth benchmark yet. We are continuing to improve the efficiency and performance of Ray over the coming quarters, so we will have more information by then.
If this is a huge blocker for you, feel free to send me a DM with more details about what you are trying to do with Ray and your project, and I can see if there’s something we can help with.
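In the meantime, one rough way to attach byte counts to a workload is to sample the node's network interface counters before and after it runs. The sketch below is not a Ray API. It reads the Linux `/proc/net/dev` file, and the helper names (`parse_proc_net_dev`, `read_counters`, `bandwidth`, `measure`) and the default interface `eth0` are assumptions made for illustration. It counts all traffic on the interface, not only Ray's, so it is most useful on a dedicated Slurm allocation with little other network activity, and it would need to run on every node of interest.

```python
import time


def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def read_counters(path="/proc/net/dev"):
    """Read the current per-interface byte counters (Linux only)."""
    with open(path) as f:
        return parse_proc_net_dev(f.read())


def bandwidth(before, after, seconds, iface):
    """Return (rx_bytes_per_s, tx_bytes_per_s) for one interface."""
    rx = after[iface][0] - before[iface][0]
    tx = after[iface][1] - before[iface][1]
    return rx / seconds, tx / seconds


def measure(workload, iface="eth0"):
    """Run workload() and report node-wide bandwidth on iface while it ran.

    The interface name is an assumption; check which one your cluster uses.
    """
    before = read_counters()
    start = time.monotonic()
    workload()
    elapsed = time.monotonic() - start
    after = read_counters()
    return bandwidth(before, after, elapsed, iface)
```

As a usage example, wrapping the blocking part of a Ray program, such as `measure(lambda: ray.get(refs))`, would give the receive and transmit rates on that node for the duration of the call. Running the same measurement with an idle workload first gives a baseline to subtract.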