Trials with remote function calls not scheduled

xiaohan2012 · December 1, 2021, 8:37am

Hi,

Say I have a function run_trial to be called by tune.run with different hyperparameter configurations. Further, inside run_trial, a remote function is called and the result is retrieved via ray.get.

However, when I execute tune.run(run_trial), none of the trials finishes, and Ray reports that the tasks cannot be scheduled.

An MWE is:

import ray
import time
from ray import tune


@ray.remote
def remote_hello(name):
    """a function to be called inside MyClass"""
    time.sleep(1)
    return "Hello, {}!".format(name)

class MyClass:
    def hello(self, name):
        ret_id = remote_hello.remote(name)
        return ray.get(ret_id)

def run_trial(config):
    obj = MyClass()
    ret = obj.hello(config['name'])
    tune.report(ret=ret)

trial_config = {
    'name': tune.choice(['Ray', 'Tune'])
}

analysis = tune.run(
    run_trial,
    config=trial_config,
    num_samples=2,
)

The output is something like:

The actor or task with ID {some id} cannot be scheduled right now. You can ignore this message if this Ray cluster is expected to auto-scale or if you specified a runtime_env for this actor or task, which may take time to install.  Otherwise, this is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increasing the resources available to this Ray cluster.
Required resources for this actor or task: {CPU_group_0: 1.000000}

xiaohan2012 · December 1, 2021, 9:30am

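The tasks cannot be scheduled because a trial, by default, only reserves resources for the trial function itself. To call remote functions inside a trial, you need to request extra resources for them, which I didn’t do. The required resource CPU_group_0 in the message indicates that the remote task is asking for a CPU from the trial's own placement group, and the trial function already holds the only one.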
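To make the accounting concrete, here is a toy model in plain Python. It is not Ray's scheduler and uses no Ray API; max_concurrent_tasks is a hypothetical helper, and it assumes that the trial function occupies the whole first bundle of the trial's reservation while nested remote tasks can only use the remaining bundles.

```python
def max_concurrent_tasks(bundles, task_cpus=1):
    """Toy estimate of how many nested remote tasks fit in a trial's reservation.

    Assumption (not Ray code): the trial function occupies the entire first
    bundle, so nested tasks can only draw CPUs from the bundles after it.
    """
    extra_bundles = bundles[1:]
    return sum(int(b.get("CPU", 0) // task_cpus) for b in extra_bundles)


# Default reservation: a single 1-CPU bundle, fully used by the trial function.
default_slots = max_concurrent_tasks([{"CPU": 1}])

# One extra 1-CPU bundle leaves room for one nested task.
extra_slots = max_concurrent_tasks([{"CPU": 1}, {"CPU": 1}])

print(default_slots, extra_slots)  # prints: 0 1
```

With zero free slots, a nested task waits forever while ray.get blocks the trial, which matches the "cannot be scheduled" message above.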

One solution is to use tune.PlacementGroupFactory, which I found here.

analysis = tune.run(
    run_trial,
    config=trial_config,
    num_samples=2,
    resources_per_trial=tune.PlacementGroupFactory(
        [{"CPU": 1}, {"CPU": 1}], strategy="PACK"
    ),
)

The first {"CPU": 1} bundle reserves one CPU for the trial function itself (run_trial), while the second {"CPU": 1} bundle reserves one CPU for the remote function calls made inside it.
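If a trial launches several remote tasks at once, the bundle list can be built programmatically. The sketch below uses a hypothetical helper, trial_bundles, which is not part of Ray; it only builds the list of dicts, and the commented-out line shows where that list would go in the call above.

```python
def trial_bundles(trainable_cpus=1, num_tasks=1, cpus_per_task=1):
    """Build a bundle list: one bundle for the trial function, one per nested task.

    Hypothetical helper for illustration; the name and parameters are my own.
    """
    if num_tasks < 0:
        raise ValueError("num_tasks must be non-negative")
    trainable_bundle = {"CPU": trainable_cpus}
    task_bundles = [{"CPU": cpus_per_task} for _ in range(num_tasks)]
    return [trainable_bundle] + task_bundles


bundles = trial_bundles(num_tasks=1)
print(bundles)  # prints: [{'CPU': 1}, {'CPU': 1}]

# With Ray installed, the list would be passed on as in the call above:
# resources_per_trial = tune.PlacementGroupFactory(bundles, strategy="PACK")
```

Each bundle is reserved for as long as the trial runs, so requesting more bundles than the trial needs reduces how many trials can run in parallel.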
