How to use ray, hyperopt with OpenFOAM?

1. Severity of the issue: (select one)
Medium: Significantly affects my productivity but can find a workaround.

2. Environment:

  • Ray version: 2.47
  • Python version: 3.10, 3.12
  • OS: Windows 11 (WSL)
  • Cloud/Infrastructure: n/a
  • Other libs/tools (if relevant):

3. What happened vs. what you expected:

  • Expected:

Hi, I’m running Ray with HyperOpt based on the example code from Ray’s website. My objective is to run OpenFOAM, an open-source CFD solver, to obtain the highest lift of an airfoil by varying some variables, like its angle of attack (AoA). At each iteration, HyperOpt is supposed to suggest an AoA, after which OpenFOAM runs to give its lift. Based on the lift, HyperOpt suggests a new AoA, and this repeats for the set number of iterations. I would like to run with max_concurrent = 4 to 10, so HyperOpt can run trials concurrently. At the same time, OpenFOAM itself can run in parallel, and for my case I would like to run each case with 8 to 16 CPUs. Since my workstation has 64 CPUs, ideally I would run with max_concurrent = 8 and have each case or iteration use 8 CPUs to max things out.

  • Actual:

However, we can only set the max_concurrent value. So if I set it to 8, does that mean Ray is only using 8 CPUs? I tried running 16 total iterations with OpenFOAM using 1 CPU per case, and also with OpenFOAM using 4 CPUs per case, with max_concurrent set to 8 in both runs.

The results show that with max_concurrent = 4 and 1 CPU per case, it takes 30 minutes to run the 16 iterations/cases. With max_concurrent = 4 and OpenFOAM running in parallel with 4 CPUs per case, it takes 15 minutes. Ideally it should take about 8 to 9 minutes, but of course there is also other overhead. So it seems to be working, since there is a time reduction, but I’m not sure whether it is using the correct number of CPUs (16). The average CPU load varies from 16 to 25 and from 20 to 50 for the 1-CPU and 4-CPU OpenFOAM cases respectively.

Part of my code is given below

Hi @quarkz,
I was unable to see the code you posted, do you mind trying again? :slight_smile:

Hi Christina, here’s part of my code:

import os

from hyperopt import STATUS_OK
from ray import tune
from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.hyperopt import HyperOptSearch


def obj_fn_max_clcd(var):

    # grid generation based on the new parameters (actual command omitted)
    os.system("<run grid generation based on new parameters>")

    # parallel
    os.system("decomposePar -case " + trial_dir + "/of_case -force > " \
              + trial_dir + "/of_case/log_decomposePar.txt")

    os.system("mpirun -np " + str(of_cpu_no) + " renumberMesh -parallel -case " + trial_dir + "/of_case > " \
              + trial_dir + "/of_case/log_renumberMesh.txt")

    os.system("mpirun -np " + str(of_cpu_no) + " foamRun -solver incompressibleFluid -parallel -case " + trial_dir + "/of_case > " \
              + trial_dir + "/of_case/log_foamRun.txt")

    ...

    result = -float(lift) / float(drag)

    tune.report({"mean_loss": result, "status": STATUS_OK})

if __name__ == "__main__":

    initial_params = [
        {"inj_size": 0.8125, "suc_size": 1.625, "inj_pos": 5., "suc_pos": 75.},
    ]
    algo = HyperOptSearch(points_to_evaluate=initial_params)
    algo = ConcurrencyLimiter(algo, max_concurrent=4)

    hyperopt_iter = 16

    # search-space keys must match the keys used in points_to_evaluate
    search_config = {
        "inj_size": tune.uniform(0.325, 1.3),
        "suc_size": tune.uniform(0.65, 2.6),
        "inj_pos": tune.uniform(2., 8.),
        "suc_pos": tune.uniform(65., 85.),
    }

    tuner = tune.Tuner(
        obj_fn_max_clcd,
        tune_config=tune.TuneConfig(
            metric="mean_loss",
            mode="min",
            search_alg=algo,
            num_samples=hyperopt_iter,
        ),
        run_config=tune.RunConfig(
            storage_path=cur_dir,
            name="ray_test",
        ),
        param_space=search_config,
    )
    results = tuner.fit()

    print("Best hyperparameters found were: ", results.get_best_result().config)


Moreover, I ran another set of cases with max_concurrent = 4 and 8, with each OpenFOAM case using 8 CPUs. Ideally, total CPU usage should then be 32 and 64, and 64 is the number of cores on my workstation. However, the times taken are 11 and 10 minutes, which is quite bad scaling. I wonder what the reason is.

Ok, thanks for giving me the code!!

So to answer your questions:

However, we can only set the max_concurrent value. So if I set it to 8, does that mean Ray is only using 8 CPUs?

Here is the doc we have on max-concurrent: ray.tune.search.ConcurrencyLimiter — Ray 2.47.1
The max_concurrent parameter in ConcurrencyLimiter limits the number of trials that can run concurrently. This does not directly translate to CPU usage, but rather, the number of trials that can be running at the same time.
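
For example, here is a minimal sketch reusing the HyperOptSearch setup from your snippet: the limiter below allows at most 8 trials in flight, but each trial still only reserves Tune's default of 1 CPU unless you request more, which is why the OS-level CPU load does not line up with max_concurrent.

from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.hyperopt import HyperOptSearch

# At most 8 trials are scheduled at the same time...
algo = ConcurrencyLimiter(HyperOptSearch(), max_concurrent=8)
# ...but each trial still reserves only 1 CPU by default, no matter how
# many cores the mpirun subprocesses inside the trial actually use.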

Ray manages resources based on what you tell it. If you want each trial to use a specific number of CPUs, you should specify this in the resource requirements for each task or trial. You can use tune.with_resources to specify the number of CPUs each trial should use (docs for that here: ray.tune.with_resources — Ray 2.47.1). You can also use ray.init(num_cpus=64) to explicitly tell Ray how many CPUs to consider available, since you mentioned you have 64.
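
To make that concrete, here is a minimal sketch that reuses obj_fn_max_clcd, algo, and search_config from your snippet and assumes 8 CPUs per OpenFOAM case:

import ray
from ray import tune

ray.init(num_cpus=64)  # make the 64 cores on the workstation explicit to Ray

# Reserve 8 CPUs per trial so Ray's bookkeeping matches the "mpirun -np 8" runs
trainable = tune.with_resources(obj_fn_max_clcd, {"cpu": 8})

tuner = tune.Tuner(
    trainable,
    tune_config=tune.TuneConfig(
        metric="mean_loss",
        mode="min",
        search_alg=algo,      # ConcurrencyLimiter(HyperOptSearch(...), max_concurrent=8)
        num_samples=16,
    ),
    param_space=search_config,
)
results = tuner.fit()

With {"cpu": 8} per trial on a 64-CPU machine, Ray itself will not schedule more than 8 trials at once, so the resource request and the ConcurrencyLimiter stay consistent and the mpirun processes line up with what Ray has actually reserved.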

There are also ways to debug further and figure out what is taking up your CPU so you can optimize better.
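
For instance, the Ray Dashboard, the ray status CLI, or these standard Ray calls in your driver script should show what Ray thinks it has versus what running trials have reserved (a minimal sketch):

import ray

ray.init(num_cpus=64)
print(ray.cluster_resources())     # total resources Ray knows about
print(ray.available_resources())   # what is left unreserved at this moment

Keep in mind that Ray's CPU accounting is logical bookkeeping; the actual load from the OpenFOAM mpirun processes shows up in OS tools like htop rather than in Ray's own numbers.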

We also have this neat guide for optimizing, let me know if it’s helpful for your use case :slight_smile:

Hi Christina,

Thanks for the advice. I’ll give it a try!