I'm using Ray 1.3.0 to partition data and exchange the partitions between nodes, with the following code:
@ray.remote(num_cpus=1)
class Worker(object):
    ...
    def partition_data(self):
        res = []
        for worker in workers:
            node_grid, worker_grid = xxx, xxx
            # da is an array of the whole dataset
            # data is the partition for another worker
            data = da[:, 0, worker_grid:worker_grid, node_grid:node_grid].to_dataframe(name='df')
            data_id = ray.put(data)
            res.append(worker.store_partition.remote(data_id))
Then I run
worker.partition_data.remote() for every Worker.
However, I got
"ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task." after finishing several partitions.
Here is the detailed error message:
(pid=1000184) <class 'pandas.core.frame.DataFrame'>
2021-08-06 00:24:19,148 WARNING worker.py:1115 -- A worker died or was killed while executing task ffffffffffffffff57725ccfa235f9852aacd78e06000000.
Traceback (most recent call last):
  File "download.py", line 454, in
    futures = ray.get(res)
  File "/home/ubuntu/anaconda3/envs/nasa/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/nasa/lib/python3.7/site-packages/ray/worker.py", line 1483, in get
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
Any hints as to why this is happening? Thanks!