How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hi all,
Is it possible to run remote tasks inside the training function of a TorchTrainer? Here is a minimal example that never reaches the print("end") statement. So, I wonder how the train function can obtain enough resources to call the remote task f().
import ray
from ray.train.torch import TorchTrainer
from ray.air.config import ScalingConfig

@ray.remote
def f():
    import time
    time.sleep(1)
    print("check")

def train_func():
    print("start")
    # This call never returns: the remote tasks launched from inside the
    # trainer are never scheduled, so the training function hangs here.
    ray.get([f.remote() for i in range(10)])
    print("end")

def main():
    ray.init()
    scaling_config = ScalingConfig(num_workers=2)
    trainer = TorchTrainer(
        train_loop_per_worker=train_func,
        scaling_config=scaling_config,
    )
    trainer.fit()

if __name__ == "__main__":
    main()
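The hang looks like a resource deadlock: the trainer's workers hold the cluster's CPU resources, so the tasks they spawn can never start, while the workers block waiting for them. The same pattern can be reproduced outside Ray with a plain thread pool — a minimal analogy (not Ray's actual scheduler, just an illustration of the mechanism):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# A pool with a single worker, standing in for a cluster whose CPUs are
# all reserved by the training workers.
pool = ThreadPoolExecutor(max_workers=1)

def inner():
    return "check"

def outer():
    # Submitted from inside the pool's only worker: it is queued behind
    # outer() itself and can never start while outer() is still running.
    fut = pool.submit(inner)
    try:
        # Waiting on it from here would block forever; a timeout makes
        # the deadlock observable instead of hanging the process.
        return fut.result(timeout=0.5)
    except TimeoutError:
        return "deadlock"

result = pool.submit(outer).result()
print(result)  # -> deadlock
```

The same logic applies to the example above: train_func waits on f.remote() tasks that cannot be scheduled as long as the training workers occupy the available resources.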
Huaiwei (from the Ray team) has confirmed this issue in the Ray Slack channel.