How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
When training a model with torch.compile:
trainer = pl.Trainer(...)
prepare_trainer(trainer)
trainer.fit(torch.compile(model), train_dataloaders=train_ds_loader)
I get this error:
ray.exceptions.RayTaskError: ray::_Inner.train() (pid=1760620, ip=10.21.66.6, actor_id=415fcd5f620cca6e3999e6ea59000000, repr=TorchTrainer)
File "/usr/local/lib/python3.10/dist-packages/ray/tune/trainable/trainable.py", line 342, in train
raise skipped from exception_cause(skipped)
File "/usr/local/lib/python3.10/dist-packages/ray/train/_internal/utils.py", line 43, in check_for_failure
ray.get(object_ref)
ray.exceptions.RaySystemError: System error: Failed to unpickle serialized exception
traceback: Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/ray/exceptions.py", line 46, in from_ray_exception
return pickle.loads(ray_exception.serialized_exception)
TypeError: BackendCompilerFailed.__init__() missing 1 required positional argument: 'inner_exception'
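The final TypeError looks like the general Python behavior where an exception whose __init__ takes more required arguments than it forwards to Exception.__init__ can be pickled but not unpickled. A minimal sketch of that mechanism, assuming nothing about torch internals: CompileFailed and try_round_trip are hypothetical names made up for illustration, not the real torch._dynamo class or any Ray API.

```python
import pickle


class CompileFailed(Exception):
    """Hypothetical stand-in for illustration; NOT the real torch._dynamo class."""

    def __init__(self, backend_fn, inner_exception):
        self.backend_fn = backend_fn
        self.inner_exception = inner_exception
        # Only one value reaches Exception.__init__, so self.args has length 1.
        super().__init__(f"backend={backend_fn!r} raised: {inner_exception!r}")


def try_round_trip(exc):
    """Return the unpickled exception, or the error raised while unpickling."""
    payload = pickle.dumps(exc)  # pickling stores (class, exc.args)
    try:
        # Unpickling calls CompileFailed(*exc.args), i.e. with ONE argument.
        return pickle.loads(payload)
    except TypeError as err:
        return err


result = try_round_trip(CompileFailed("inductor", ValueError("boom")))
print(type(result).__name__, result)
```

If that is what happens here, the unpickling failure is only masking the underlying compile error raised on the worker, which is the part that actually needs fixing.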
I am using Ray 2.8.0. Do you think this is a versioning problem between torch and lightning, or is there another way to solve it?
This is how Lightning uses torch.compile: Training Compiled PyTorch 2.0 with PyTorch Lightning
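One possible way to see the underlying compile error while debugging is to convert any exception raised inside the training function into a plain RuntimeError that carries the original traceback as text, since a RuntimeError with a single string argument pickles cleanly. This is only a sketch: with_picklable_errors, NeedsTwoArgs, and train_func are hypothetical names, and it assumes the function passed to the trainer can be wrapped with an ordinary decorator.

```python
import functools
import pickle
import traceback


def with_picklable_errors(fn):
    """Re-raise any exception from fn as a RuntimeError holding the traceback text."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            # Keep the original traceback as a string so nothing is lost
            # when the error is serialized and sent to another process.
            raise RuntimeError(traceback.format_exc()) from None

    return wrapper


class NeedsTwoArgs(Exception):
    """Hypothetical exception that fails to unpickle (two required arguments)."""

    def __init__(self, a, b):
        super().__init__(f"{a}: {b}")


@with_picklable_errors
def train_func():
    # Stand-in for the real training loop; raises the problematic exception.
    raise NeedsTwoArgs("backend", "original compile error")


try:
    train_func()
except RuntimeError as err:
    # Simulate shipping the error between processes.
    restored = pickle.loads(pickle.dumps(err))
    print(restored)
```

The wrapper does not fix the compile failure itself; it only makes the real error message visible instead of the "Failed to unpickle serialized exception" wrapper.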