[RLlib,Tune] Relevance of __ref_ph in sample_collector experiment state

How severe does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.
  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Dear community,

For a couple of reasons, I am still using Ray 2.10 on Windows. Migration to a newer version is in progress, but it needs thorough testing and changes to my code to adapt to the new API stack.

In a larger PPO hyperparameter tuning run with 10 trials, Tune stalled after the first 3 trials had terminated and while 3 others were running. The stall was recognizable because the CLI progress reporter showed no increase in iter for 2 hours.

Following the hint in the Tune CLI output after I quit the run, I wanted to use the Tuner.restore() functionality to resume the unfinished trials.

from ray import tune

tuner = tune.Tuner.restore(
    path="experiment-directory",
    trainable="PPO",
    restart_errored=False,
    resume_errored=False,
    resume_unfinished=True,
)
results = tuner.fit()

After one of the previously unfinished trials finishes, the system throws

ValueError: f176708f must be type or string

Digging deeper, this strange alphanumeric code leads to the following entry in the experiment_state.json file:

"sample_collector": [\n "__ref_ph",\n "f176708f"\n ],\n

This entry is set for every trial.
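
To check which config keys carry such an entry, a small stdlib-only scan can help. This is only a sketch: the helper name `find_placeholders` is made up here, and it makes no assumption about the layout of the state file beyond it being JSON that may contain further JSON encoded as strings (which the escaped `\n` in the entry above suggests). The `trial_data` key in the demo is hypothetical sample data, not taken from a real file.

```python
import json
from typing import Any, Iterator, Tuple


def find_placeholders(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, id) for every ["__ref_ph", "<id>"] pair found in `node`."""
    if isinstance(node, str):
        # A string may itself hold JSON (nested serialization); try to decode it.
        try:
            inner = json.loads(node)
        except ValueError:
            return
        if isinstance(inner, (dict, list)):
            yield from find_placeholders(inner, path + "<json>")
        return
    if isinstance(node, list):
        if len(node) == 2 and node[0] == "__ref_ph" and isinstance(node[1], str):
            yield path, node[1]
            return
        for i, item in enumerate(node):
            yield from find_placeholders(item, f"{path}[{i}]")
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, f"{path}.{key}")


# Hypothetical sample data; a real file would be loaded with json.load(...).
sample_trial = {"config": {"sample_collector": ["__ref_ph", "f176708f"], "lr": 0.001}}
sample_state = {"trial_data": [json.dumps(sample_trial)]}
hits = list(find_placeholders(sample_state))
for hit_path, ref_id in hits:
    print(hit_path, ref_id)
```

Running the same function over the loaded experiment state file would list every key whose value was stored as a `__ref_ph` pair, which shows whether `sample_collector` is the only affected key.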

  1. What is the purpose of this key?
  2. Why does it disturb the restore and resume?
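
The wording of the error appears to match the final else branch of the following function:

import importlib
from typing import Optional, Type, Union


def deserialize_type(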
    module: Union[str, Type], error: bool = False
) -> Optional[Union[str, Type]]:
    """Resolves a class path to a class.
    If the given module is already a class, it is returned as is.
    If the given module is a string, it is imported and the class is returned.

    Args:
        module: The classpath (str) or type to resolve.
        error: Whether to throw a ValueError if `module` could not be resolved into
            a class. If False and `module` is not resolvable, returns None.

    Returns:
        The resolved class or `module` (if `error` is False and no resolution possible).

    Raises:
        ValueError: If `error` is True and `module` cannot be resolved.
    """
    # Already a class, return as-is.
    if isinstance(module, type):
        return module
    # A string.
    elif isinstance(module, str):
        # Try interpreting (as classpath) and importing the given module.
        try:
            module_path, class_name = module.rsplit(".", 1)
            module = importlib.import_module(module_path)
            return getattr(module, class_name)
        # Module not found OR not a module (but a registered string?).
        except (ModuleNotFoundError, ImportError, AttributeError, ValueError) as e:
            # Ignore if error=False.
            if error:
                raise ValueError(
                    f"Could not deserialize the given classpath `module={module}` into "
                    "a valid python class! Make sure you have all necessary pip "
                    "packages installed and all custom modules are in your "
                    "`PYTHONPATH` env variable."
                ) from e
    else:
        raise ValueError(f"`module` ({module} must be type or string (classpath)!")

    return module
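
Reading the branches of the function above, a bare string such as "f176708f" would not reach the final `raise`: it has no dot, so the `rsplit` unpacking fails with a ValueError that is caught, and with `error=False` the string is returned unchanged. Only a value that is neither a type nor a string lands in the else branch. The stand-in below is a sketch that mirrors just that branching to make this visible; `which_branch` is a hypothetical helper and not part of Ray, and whether the two-element `__ref_ph` entry is what actually gets passed in during the restore is an assumption.

```python
import importlib
from typing import Any


def which_branch(value: Any) -> str:
    """Report which branch of the function above `value` would take (error=False)."""
    if isinstance(value, type):
        return "class, returned as-is"
    if isinstance(value, str):
        try:
            # Same steps as above: split the classpath, import, look up the class.
            module_path, class_name = value.rsplit(".", 1)
            getattr(importlib.import_module(module_path), class_name)
            return "string, resolved to a class"
        except (ImportError, AttributeError, ValueError):
            return "string, not resolvable, returned unchanged"
    return "neither type nor string, raises ValueError"


for candidate in (dict, "collections.OrderedDict", "f176708f", ["__ref_ph", "f176708f"]):
    print(repr(candidate), "->", which_branch(candidate))
```

If that assumption holds, the error would point to the unresolved two-element entry reaching this function, rather than to the string id on its own.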