Cannot pickle BatchInferModel when ds.map_batches(BatchInferModel)

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I read the tutorial at https://docs.ray.io/en/latest/data/pipelining-compute.html#example-pipelined-batch-inference, in which the model is initialized in BatchInferModel's __init__() and inference is performed in __call__().

I wrote a similar test. In __init__():
self.session = onnxruntime.InferenceSession("/path/to/model.onnx")
In __call__():
self.session.run(...)
When I run:
ds.map_batches(BatchInferModel)

ERROR:
cannot pickle 'onnxruntime.capi.onnxruntime_pybind11_state.InferenceSession' object
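For context, here is a minimal, self-contained sketch of the pattern described above, modeled on the linked docs example. The model path, the dummy dataset, and the assumption that the batch arrives as a NumPy array are illustrative rather than taken from the post, and `compute="actors"` follows the older Ray Data API used in those docs:

```python
import numpy as np
import onnxruntime
import ray

class BatchInferModel:
    def __init__(self):
        # Session is created in __init__, following the docs example.
        # Model path is a placeholder.
        self.session = onnxruntime.InferenceSession("/path/to/model.onnx")

    def __call__(self, batch: np.ndarray) -> np.ndarray:
        # Feed the batch to the model's first input and return the first output.
        input_name = self.session.get_inputs()[0].name
        return self.session.run(None, {input_name: batch})[0]

# Dummy dataset for illustration; depending on the Ray version the batch
# may arrive as an ndarray or as a dict of columns.
ds = ray.data.from_numpy(np.random.rand(8, 3).astype(np.float32))
ds = ds.map_batches(BatchInferModel, compute="actors", batch_size=4)
```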

This error seems to occur because the model is initialized on one worker, serialized with pickle, and then transferred to the other workers.
How should I write this so that initialization is performed on each worker separately?

Hi @Jiayi_Li, your usage looks right. The issue is that InferenceSession itself is not picklable. You may take a look at this for a workaround: Fixes #643, implements __getstate__ in python API by xadupre · Pull Request #800 · microsoft/onnxruntime · GitHub
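One common way to apply the linked PR's idea manually is a small wrapper that drops the session during pickling and rebuilds it from the model path after unpickling. This is only a sketch under assumptions; `PicklableSession` and its attributes are hypothetical names, not part of onnxruntime's API:

```python
import onnxruntime

class PicklableSession:
    """Hypothetical wrapper: pickles the model path instead of the session."""

    def __init__(self, model_path: str):
        self.model_path = model_path
        self.session = onnxruntime.InferenceSession(model_path)

    def __getstate__(self):
        # Exclude the unpicklable InferenceSession; keep only the path.
        return {"model_path": self.model_path}

    def __setstate__(self, state):
        # Rebuild the session on the receiving worker after unpickling.
        self.model_path = state["model_path"]
        self.session = onnxruntime.InferenceSession(self.model_path)

    def run(self, output_names, input_feed):
        # Delegate to the real session.
        return self.session.run(output_names, input_feed)
```

The same idea can be applied inside BatchInferModel itself: store only the model path in __init__ and create the session lazily on the first __call__, so nothing unpicklable ever has to cross a process boundary.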