Hello folks,
I'm using Ray 1.3.0 to partition data and exchange the partitions between nodes, with the following code:
@ray.remote(num_cpus=1)
class Worker(object):
    ...
    def partition_data(self):
        res = []
        for worker in workers:
            node_grid, worker_grid = xxx, xxx
            # da is an array of the whole dataset
            # data is the partition for another worker
            data = da[:, 0, worker_grid[0]:worker_grid[1], node_grid[0]:node_grid[1]].to_dataframe(name='df')
            data_id = ray.put(data)
            res.append(worker.store_partition.remote(data_id))
Then I run worker.partition_data.remote() for every Worker.
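The values of node_grid and worker_grid are elided above (xxx). As a rough illustration of what the slicing assumes, here is a minimal sketch of one way such (start, stop) bounds could be computed, assuming an axis is divided into near-equal contiguous ranges. The helper name split_range and the sizes used are hypothetical and are not taken from the actual code.

```python
from typing import List, Tuple


def split_range(size: int, parts: int) -> List[Tuple[int, int]]:
    """Split range(size) into `parts` contiguous (start, stop) pairs.

    Hypothetical helper for illustration: the first `size % parts`
    pieces receive one extra element so the whole axis is covered.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(size, parts)
    bounds = []
    start = 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


# Assumed sizes, for illustration only: an axis of length 10 and 3 workers.
worker_grids = split_range(10, 3)
print(worker_grids)
```

Each pair can then be used as worker_grid[0]:worker_grid[1] in a slice like the one above. The ranges do not overlap and together cover the full axis.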
However, after several partitions finish, I get "ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task."
Here is the detailed error message:
(pid=1000184) <class 'pandas.core.frame.DataFrame'>
2021-08-06 00:24:19,148 WARNING worker.py:1115 -- A worker died or was killed while executing task ffffffffffffffff57725ccfa235f9852aacd78e06000000.
Traceback (most recent call last):
  File "download.py", line 454, in <module>
    futures = ray.get(res)
  File "/home/ubuntu/anaconda3/envs/nasa/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/nasa/lib/python3.7/site-packages/ray/worker.py", line 1483, in get
    raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
Any hints as to why this is happening? Thanks!