I am trying to parallelize Rasa's DaskGraphRunner class in order to run machine learning model training on a Ray multi-node cluster.
As per the Dask on Ray documentation, I replaced the scheduler `dask.get` (line 101) with Ray's scheduler `ray_dask_get`, but ran into problems serializing the `GraphNode` class, specifically pickling SQLAlchemy objects:

TypeError: can't pickle sqlalchemy.cprocessors.UnicodeResultProcessor objects
TypeError: Could not serialize the argument <rasa.engine.graph.GraphNode object at 0x7f6bb3840390> for a task or actor ray.util.dask.scheduler.dask_task_wrapper.
How should I go about this?
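For context on what that swap actually does: a Dask graph is just a dict mapping keys to tasks, and a scheduler is any `get(dsk, key)` function, so switching from `dask.get` to `ray_dask_get` only changes which function walks the graph. A toy sketch of that convention (not Dask's real implementation):

```python
def sync_get(dsk, key):
    """Minimal single-process 'scheduler': resolve a key by running tasks in-place."""
    task = dsk[key]
    if isinstance(task, tuple):  # Dask's (func, *args) task convention
        func, *args = task
        # arguments that name other keys are resolved recursively
        return func(*(sync_get(dsk, a) if a in dsk else a for a in args))
    return task  # a literal value

graph = {"x": 1, "y": 2, "z": (lambda a, b: a + b, "x", "y")}
print(sync_get(graph, "z"))  # → 3
```

The synchronous scheduler runs everything inside one process, so nothing is ever pickled; a distributed scheduler like `ray_dask_get` has to serialize every task's arguments before shipping them to workers, which is exactly where `GraphNode` fails.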
My guess is that Rasa is keeping an open database connection in `GraphNode`, and that connection contains this `UnicodeResultProcessor` C-extension object, which isn't picklable. Rasa likely hasn't hit this issue because they use Dask's single-threaded synchronous scheduler, under which no pickling/serialization needs to take place.
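This failure mode is easy to reproduce without Rasa or Ray: any object holding a live database connection fails to pickle with the same kind of TypeError. A minimal sketch, using a sqlite3 connection as a stand-in for the SQLAlchemy one:

```python
import pickle
import sqlite3

class Node:
    """Stand-in for a graph node that keeps a live DB connection as an attribute."""
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")  # live connections are not picklable

try:
    pickle.dumps(Node())
except TypeError as exc:
    print(exc)  # e.g. "cannot pickle 'sqlite3.Connection' object"
```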
There are a few options here:
- Fix the issue in upstream Rasa. This would involve them implementing/using a serializable database connection.
- Register a custom serializer with Ray that serializes the connection (or even the entire `GraphNode`). See option (2) here.
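If you go the custom-serializer route, the usual pattern is to drop the live connection at serialization time and reopen it on the worker after deserialization. A sketch of that idea using pickle's `__getstate__`/`__setstate__` hooks; the same serializer/deserializer pair could instead be registered with Ray via `ray.util.register_serializer`. `ReconnectingNode` and its sqlite3 connection are illustrative stand-ins, not Rasa's actual classes:

```python
import pickle
import sqlite3

class ReconnectingNode:
    """Stand-in node whose DB connection survives pickling by being rebuilt."""
    def __init__(self, db_path=":memory:"):
        self.db_path = db_path
        self.conn = sqlite3.connect(db_path)

    def __getstate__(self):
        # drop the live (unpicklable) connection before pickling
        state = self.__dict__.copy()
        state["conn"] = None
        return state

    def __setstate__(self, state):
        # reopen the connection after unpickling, e.g. on a Ray worker
        self.__dict__.update(state)
        self.conn = sqlite3.connect(self.db_path)

node = ReconnectingNode()
clone = pickle.loads(pickle.dumps(node))
clone.conn.execute("select 1")  # the clone has a working connection again
```

Note that reopening per worker assumes the database is reachable from every node in the cluster.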
Hi @Clark_Zinzow ,
thanks for the suggestions. It makes sense that no serialization needs to take place since the scheduler is single-threaded. Before writing and registering a custom serializer (option 2), I need to find where this connection is created. I've opened a thread about that on Rasa's forum.
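One way to narrow that down without reading all of Rasa's source is to walk an object's attributes and try pickling each one in isolation. A small hypothetical helper (`find_unpicklable` is not a Rasa or Ray API):

```python
import pickle

def find_unpicklable(obj):
    """Return (attribute, error) pairs for attributes of obj that fail to pickle."""
    failures = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:
            failures.append((name, repr(exc)))
    return failures
```

Running this on the offending `GraphNode` (and recursively on any failing attribute) should point straight at the object holding the `UnicodeResultProcessor`.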