1. Severity of the issue:
High: Completely blocks me.
2. Environment:
- Ray version: 2.43.0
- Python version: 3.11.11
- OS: Linux-5.14.0-427.42.1.el9_4.x86_64-x86_64-with-glibc2.34
- Cloud/Infrastructure: N/A
- Other libs/tools (if relevant): Pandas==2.2.3
3. What happened vs. what you expected:
- Expected: Intermediate results are saved in the specified storage path.
- Actual: Results are saved in an artifact directory inside the temp directory.
My tuning task requires saving temporary predictions (a pd.DataFrame) that don’t make sense to store in the reported scores. I did this by specifying a storage path in ray.tune.RunConfig, passed as the run_config argument to tune.Tuner. This worked fine on version 2.9.1. After upgrading Ray from 2.9.1 to 2.43.0 (I know I was a bit outdated, which is why I upgraded), the predictions are no longer saved to the storage path I specified. Instead, they are saved in the artifact directory under the corresponding session directory in ray_tmps/ (the temp directory I specified in ray.init).
Code to reproduce:
from ray import train, tune
import ray
import json
import pandas as pd


def objective(x, a, b):  # Define an objective function.
    return a * (x**0.5) + b


def trainable(config):  # Pass a "config" dictionary into your trainable.
    for x in range(20):  # "Train" for 20 iterations and compute intermediate scores.
        score = objective(x, config["a"], config["b"])
        df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
        df.to_parquet('pred.parquet')
        tune.report({"score": score})  # Send the score to Tune.


if __name__ == '__main__':
    ray.init(
        _temp_dir="/home/ray_tmps",
        _system_config={
            "automatic_object_spilling_enabled": True,
            "object_spilling_config": json.dumps(
                {"type": "filesystem", "params": {"directory_path": "home/ray_tmps"}},
            ),
        },
    )
    tuner = tune.Tuner(
        trainable,
        param_space={"a": tune.grid_search([1, 2]), "b": 4},
        run_config=tune.RunConfig(storage_path="~/tune_results", name="just_a_test"),
    )
    tuner.fit()
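
A possible workaround while this is unresolved, as a minimal sketch: the relative path 'pred.parquet' in the trainable above resolves against whatever working directory the trial process happens to have, so writing to an explicit absolute path avoids depending on it. The helper name `prediction_path` and the directory and file naming scheme below are hypothetical and not part of Ray's API. The sketch also assumes the base directory is reachable from every worker (single node or a shared filesystem).

```python
import os
from pathlib import Path


def prediction_path(base_dir, config, iteration):
    """Build an absolute output path for one trial iteration.

    Hypothetical helper, not a Ray API. The trial is identified by its
    hyperparameters, so trials with different configs never collide.
    """
    # One sub-directory per config, e.g. "a=1_b=4" (keys sorted for stability).
    tag = "_".join(f"{key}={config[key]}" for key in sorted(config))
    # Expand "~" and make the path absolute so the trial's cwd is irrelevant.
    out_dir = Path(os.path.expanduser(base_dir)).resolve() / tag
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / f"pred_{iteration:04d}.parquet"
```

Inside the trainable, the write would then become something like `df.to_parquet(prediction_path("~/tune_results/just_a_test/predictions", config, x))`. This only sidesteps the question of where relative paths end up. It does not explain why the behavior differs between 2.9.1 and 2.43.0.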