Hello, I am using ray.data with a custom reader to read HDF5 files.
The h5 files are stored on Weka, and a read sometimes hangs indefinitely, although it works fine if I retry it.
So I would like to set a timeout for read tasks and retry any task that runs longer than a certain number of seconds.
I have searched the forum and dug into the source code, but I still don't know how to achieve this.
I found that you can set _remote_args for the read task, but ray.options doesn't accept a timeout, so I am not sure what to do next.
@tarjintor Thanks for posting. Sharing with the Ray Data group for any insights.
cc: @chengsu Any idea whether we can do that with a timeout argument, as in ray.data.read_xxx(....timeout=??)?
As you have already implemented a custom Datasource class, you can pass any arbitrary argument through the read_args (https://github.com/ray-project/ray/blob/master/python/ray/data/read_api.py#L297). The read_args are passed through to Datasource.create_reader() (https://github.com/ray-project/ray/blob/master/python/ray/data/read_api.py#L2286C23-L2286C23), so you can get the timeout argument and implement the logic for it in Datasource.create_reader().
Thanks for the reply.
I guess you are suggesting that I implement the timeout logic in my own code, but I have some problems achieving this:
1. As I understand it, the Ray Reader has a get_read_tasks method that returns List[ReadTask], and each ReadTask is also run as a Ray task. Since we can set a timeout for Ray tasks, being able to pass an argument such as timeout to the ReadTask init would be a more general solution that works for every datasource, not only for my own.
2. As far as I know, the Python h5py library has no timeout argument for reading an h5 file, so all I can do is kill the reading process and start another one, though Ray's task cancel method can do this.
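For point 2, one way to put a timeout around a blocking read is to run it in a child process and kill that process if it takes too long. This is only a minimal sketch under assumptions: `call_with_timeout`, `timeout_s` and `max_attempts` are hypothetical names, not part of Ray or h5py, and it uses the `fork` start method for brevity, which is POSIX-only and may not be the safest choice inside a Ray worker.

```python
import multiprocessing as mp


def _child(conn, fn, args):
    # Child process: run the read and send back the result or the error text.
    try:
        conn.send(("ok", fn(*args)))
    except Exception as exc:
        conn.send(("err", repr(exc)))
    finally:
        conn.close()


def call_with_timeout(fn, args=(), timeout_s=30.0, max_attempts=3):
    """Run fn(*args) in a child process; kill it and retry if it times out."""
    ctx = mp.get_context("fork")  # assumption: POSIX host
    for _ in range(max_attempts):
        recv_end, send_end = ctx.Pipe(duplex=False)
        proc = ctx.Process(target=_child, args=(send_end, fn, args))
        proc.start()
        send_end.close()  # the parent only needs the receiving end
        try:
            if recv_end.poll(timeout_s):
                status, payload = recv_end.recv()
                if status == "ok":
                    return payload
                raise RuntimeError(f"read failed in child: {payload}")
            # Nothing arrived within timeout_s: treat the read as stuck.
        finally:
            if proc.is_alive():
                proc.kill()
            proc.join()
            recv_end.close()
    raise TimeoutError(
        f"read did not finish within {timeout_s}s after {max_attempts} attempts"
    )
```

Here `fn` would be the function that opens the h5 file and returns the data. The result travels back through a pipe, so it has to be picklable, and a real failure inside the read is raised immediately rather than retried.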
However, I also found that this is hard to do in Ray: a ReadTask yields blocks rather than returning a single block, so if a task yields some blocks and then fails midway, the retry logic becomes very tricky.
Also, as I understand it, yield in a Ray task is not truly streaming: the generator has to finish and all of its results have to be collected before the caller can iterate over them. If so, the problem I described is no longer a problem.
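If a read generator can fail midway, one conservative option is to buffer its blocks and hand them over only once the generator has finished, so that a retry never exposes partial output. The following is a minimal sketch in plain Python and is not Ray's actual ReadTask retry logic; `read_all_or_retry` and `make_blocks` are hypothetical names.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def read_all_or_retry(
    make_blocks: Callable[[], Iterator[T]], max_attempts: int = 3
) -> List[T]:
    """Drain a block generator fully; on failure, drop partial output and restart."""
    last_error = None
    for _ in range(max_attempts):
        buffered: List[T] = []  # fresh buffer per attempt, so nothing partial leaks
        try:
            for block in make_blocks():
                buffered.append(block)
        except Exception as exc:
            last_error = exc
            continue  # discard what was buffered and start over
        return buffered
    raise RuntimeError(f"read failed after {max_attempts} attempts") from last_error
```

The trade-off is memory: every block of one read is held until the read completes, which gives up the benefit of yielding blocks one at a time.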