Training with torch.compile

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

When training a model with torch.compile:

trainer = pl.Trainer(...)
prepare_trainer(trainer) 
trainer.fit(torch.compile(model), train_dataloaders=train_ds_loader)

I get this error:

ray.exceptions.RayTaskError: ray::_Inner.train() (pid=1760620, ip=10.21.66.6, actor_id=415fcd5f620cca6e3999e6ea59000000, repr=TorchTrainer)
  File "/usr/local/lib/python3.10/dist-packages/ray/tune/trainable/trainable.py", line 342, in train
    raise skipped from exception_cause(skipped)
  File "/usr/local/lib/python3.10/dist-packages/ray/train/_internal/utils.py", line 43, in check_for_failure
    ray.get(object_ref)
ray.exceptions.RaySystemError: System error: Failed to unpickle serialized exception
traceback: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ray/exceptions.py", line 46, in from_ray_exception
    return pickle.loads(ray_exception.serialized_exception)
TypeError: BackendCompilerFailed.__init__() missing 1 required positional argument: 'inner_exception'
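
The final TypeError looks like a generic Python pickling limitation rather than something specific to Ray. Here is a minimal sketch of that mechanism, assuming the exception takes two required constructor arguments but forwards only a message to Exception.__init__. The class CompilerFailedLike and the helper roundtrip are hypothetical stand-ins, not the real torch class:

```python
import pickle


class CompilerFailedLike(Exception):
    """Hypothetical stand-in: two required arguments, but only a
    formatted message is passed on to Exception.__init__."""

    def __init__(self, backend_fn, inner_exception):
        self.backend_fn = backend_fn
        self.inner_exception = inner_exception
        # self.args ends up holding one element: the message string.
        super().__init__(f"backend={backend_fn!r} raised: {inner_exception!r}")


def roundtrip(exc):
    """Pickle and unpickle an exception; return the loaded object,
    or the TypeError raised while loading."""
    data = pickle.dumps(exc)
    try:
        # Unpickling calls the class with self.args, i.e. one argument.
        return pickle.loads(data)
    except TypeError as err:
        return err


result = roundtrip(CompilerFailedLike("inductor", ValueError("boom")))
print(type(result).__name__, result)
```

If this is what happens here, the unpickle failure only hides the real problem: the worker raised a compiler backend error, and its original message was lost in transit.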

I am using Ray 2.8.0. Could this be a versioning problem with torch and lightning? If not, how can I solve it?
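
One possible way to see the underlying compile error is to convert any exception that does not survive pickling into a plain RuntimeError before it leaves the worker. This is a sketch only: run_with_picklable_errors, NeedsTwoArgs and failing_step are hypothetical names, and the wrapper would have to go around the trainer.fit(...) call inside the training function.

```python
import pickle
import traceback


def run_with_picklable_errors(fn, *args, **kwargs):
    """Call fn; if it raises an exception that cannot be pickled and
    unpickled, re-raise it as a RuntimeError carrying the traceback text."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        try:
            pickle.loads(pickle.dumps(exc))
        except Exception:
            text = "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            )
            raise RuntimeError(text) from None
        raise  # the exception pickles cleanly, so pass it through unchanged


class NeedsTwoArgs(Exception):
    """Hypothetical exception that fails to unpickle (illustration only)."""

    def __init__(self, a, b):
        super().__init__(f"{a}: {b}")


def failing_step():
    # Stands in for the call that fails during compilation.
    raise NeedsTwoArgs("backend", "inner failure")


try:
    run_with_picklable_errors(failing_step)
except RuntimeError as err:
    surfaced = err
    print(surfaced)
```

This does not fix the compile failure itself. It only keeps the original message and traceback readable on the driver side.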

This is how Lightning uses it: Training Compiled PyTorch 2.0 with PyTorch Lightning