I started a Ray cluster with 3 nodes and ran a scikit-learn model, but in the dashboard I only see high CPU usage on one node; the other two nodes don't seem to be doing any work. Is this normal? Thank you very much.
My code is something like this:
import time

import joblib
import ray
from ray.util.joblib import register_ray  # registers the 'ray' joblib backend

print('Init ray')
address = 'auto'  # join the existing cluster
ray.init(address=address)
register_ray()
print('Init success')
print('Start tsne')
t0 = time.time()
with joblib.parallel_backend('ray'):
    Y = tsne.fit_transform(vector)
hi @yh_Zhao, there could be multiple reasons your job is not utilizing all the resources on the nodes, such as a misconfiguration or an algorithm that is not distributed; however, I'd recommend you use Monitoring Ray States — Ray 2.2.0 first to see whether there are multiple tasks/actors running.
@Chen_Shen maybe it's a misconfiguration; I didn't set any load-balancing strategy.
BTW, could you tell me how to stop a Ray server? I didn't find a method in the documentation; maybe I missed something. Thanks a lot.