No trial resources are available for launching the actor

High

  • None: Just asking a question out of curiosity
  • Low: It annoys or frustrates me for a moment.
  • Medium: It causes significant difficulty in completing my task, but I can work around it.
  • High: It blocks me from completing my task.

My code:

trainable_with_resources = tune.with_resources(
    experiment_fn,
    resources={"gpu": 1, "cpu": 4},
)

tuner = tune.Tuner(
    trainable_with_resources,
    tune_config=tune.TuneConfig(
        num_samples=20,
        scheduler=ASHAScheduler(metric="eval_reward", mode="max"),
    ),
    param_space=search_space
)
results = tuner.fit()

Error:

ray.tune.error.TuneError: No trial resources are available for launching the actor `ray.rllib.evaluation.rollout_worker.RolloutWorker.__init__`. To resolve this, specify the Tune option:

>  resources_per_trial=tune.PlacementGroupFactory(
>    [{'GPU': 1.0, 'CPU': 4.0}] + [{'CPU': 1.0}] * N
>  )

The error traceback says I could find a solution here: A Guide To Parallelism and Resources — Ray 2.1.0

But I cannot find one there.

Does your trainable also launch other remote actors? It seems so: judging from the error message, there are also rollout workers.
The way you have it now asks Ray to allocate 1 GPU + 4 CPUs to the trainable head, so there is nothing left over for the rollout workers.
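For context, the option suggested in the error reserves one bundle for the trainable head plus one 1-CPU bundle per rollout worker. Passed to tune.with_resources, it would look roughly like this (a sketch; the 2 workers used for N here are an assumption):

trainable_with_resources = tune.with_resources(
    experiment_fn,
    # One bundle for the trainable head, plus one 1-CPU bundle per rollout worker.
    # N = 2 rollout workers is assumed here; use your actual num_workers.
    tune.PlacementGroupFactory(
        [{"GPU": 1.0, "CPU": 4.0}] + [{"CPU": 1.0}] * 2
    ),
)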
How to fix it:
Maybe you could take a look at rllib/examples/custom_experiment.py, especially this line:

tune.with_resources(experiment, ppo.PPO.default_resource_request(config)),
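
Wired into the code from the question, that would look roughly like this (a sketch: config is assumed to be the PPO config your trainable builds its algorithm from):

from ray import tune
from ray.rllib.algorithms import ppo

# Let RLlib compute the full placement group it needs (trainer head
# plus rollout workers) instead of hard-coding {"gpu": 1, "cpu": 4}.
trainable_with_resources = tune.with_resources(
    experiment_fn,
    ppo.PPO.default_resource_request(config),  # config: your PPO config (assumption)
)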

In other words, I should allocate 1 GPU to the trainable actor only, and the remaining 4 CPUs will be used by the rollout workers?
The trainable fits the model,
and the rollout workers generate samples for it. Right?

I tried allocating only 1 GPU and got the same error.

I'm not sure where I can find this file, because I run everything in a Colab notebook.

Here is the line: ray/custom_experiment.py at master · ray-project/ray · GitHub


It works perfectly. But it tells me that:
2022-11-14 02:46:16,788 WARNING insufficient_resources_manager.py:128 -- Ignore this message if the cluster is autoscaling. You asked for 1.0 cpu and 1.9999999999999998 gpu per trial

Sounds like I still have three questions:

  1. How can I set the number of requested resources, or where can I read something about resources_per_trial or num_workers for RLlib?
  2. How can I specify the number of trials for the hyperparameter search? Right now only one experiment seems to be running.
  3. How can I add ASHA (the hyperparameter search scheduler)? (For 2 and 3, see the sketch after this list.)
    Getting Started — Ray 2.1.0
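
For questions 2 and 3, the knobs are the same ones as in the Tuner from the first post: num_samples sets the number of trials, and the ASHA scheduler goes into TuneConfig. A minimal sketch, reusing trainable_with_resources and search_space from above:

from ray.tune.schedulers import ASHAScheduler

tuner = tune.Tuner(
    trainable_with_resources,
    tune_config=tune.TuneConfig(
        num_samples=20,  # number of trials sampled for the search
        # ASHA early-stops poorly performing trials during the search.
        scheduler=ASHAScheduler(metric="eval_reward", mode="max"),
    ),
    param_space=search_space,
)
results = tuner.fit()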

P.S. Where can I find something about tune.with_resources? (I didn't find anything in the docs.)