Hi,
I am trying to use Ray to scale out the ‘task’ in my notebook. The library ‘mod_a’ is in development, so we install it and another dependency, ‘mod_b’, in pip’s ‘-e’ (editable) mode. The remote cluster is deployed on Azure, and test examples without external dependencies run fine on it.
The connection to the remote cluster is defined using the following statement:
ray.init(
    f"ray://127.0.0.1:{LOCAL_PORT}",
    runtime_env={
        "working_dir": "../src",  # to access files
    },
)
When I try to use a basic version of our library’s code in the notebook and put an object from ‘mod_a’ into the cluster, it throws the following error:
C:\ProgramData\Anaconda3\envs\icrm\lib\site-packages\ray\util\client\worker.py in put(self, val, client_ref_id)
396 "call 'put' on it (or return it).")
397 data = dumps_from_client(val, self._client_id)
--> 398 return self._put_pickled(data, client_ref_id)
399
400 def _put_pickled(self, data, client_ref_id: bytes):
C:\ProgramData\Anaconda3\envs\icrm\lib\site-packages\ray\util\client\worker.py in _put_pickled(self, data, client_ref_id)
405 if not resp.valid:
406 try:
--> 407 raise cloudpickle.loads(resp.error)
408 except (pickle.UnpicklingError, TypeError):
409 logger.exception("Failed to deserialize {}".format(resp.error))
ModuleNotFoundError: No module named 'mod_b'
‘mod_b’ is installed in editable mode (via pip) in the local conda environment from which the Jupyter kernel runs.
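Because the exception is unpickled from the server's response (resp.error in the traceback), the import appears to be failing on the cluster side rather than in the local environment. A minimal sketch for checking which modules an environment can actually import is below; `missing_modules` is a hypothetical helper (not part of Ray), and the idea would be to run it both locally and inside a remote task to compare the two environments.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that the import system cannot locate."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec raises when a parent package of a dotted name is absent
            # (ImportError) or when a loaded module has no spec (ValueError).
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # 'mod_a' and 'mod_b' are the placeholder names used in this post.
    print(missing_modules(["mod_a", "mod_b"]))
```

Running the same check on a worker would show whether only ‘mod_b’ is missing there or its own dependencies as well.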
When I use py_modules to declare ‘mod_a’ and ‘mod_b’ as dependencies, the put command complains about a ModuleNotFoundError for the dependencies of ‘mod_b’.
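For reference, a sketch of what such a runtime_env could look like. The paths and the package name in the pip list are placeholders, and `build_runtime_env` is a hypothetical helper, not a Ray function. As far as I understand, py_modules ships only the source of the listed packages, so their third-party dependencies would have to be declared separately, for example through the runtime_env "pip" field; I have not verified this on the setup above.

```python
def build_runtime_env(working_dir, py_modules, pip=None):
    """Assemble a runtime_env dict for ray.init (hypothetical helper)."""
    env = {
        "working_dir": working_dir,      # uploaded so tasks can read its files
        "py_modules": list(py_modules),  # local package directories to ship
    }
    if pip:
        # Third-party requirements to install on the cluster side.
        env["pip"] = list(pip)
    return env


# Placeholder paths and package name; adjust to the real project layout.
runtime_env = build_runtime_env(
    "../src",
    ["path/to/mod_a", "path/to/mod_b"],
    pip=["some-dependency-of-mod-b"],
)
print(runtime_env)

# The dict would then be passed to the same call as before:
# ray.init(f"ray://127.0.0.1:{LOCAL_PORT}", runtime_env=runtime_env)
```

Each py_modules entry is meant to point at the package directory itself (the one containing `__init__.py`), not at the repository root of the editable install.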
How do I resolve this error?