Population Based Training Does Not Clone When Logging Is Disabled

1. Severity of the issue: (select one)
Low: Annoying but doesn’t hinder my work.

2. Environment:

  • Ray version: 2.52.0
  • Python version: 3.12.3
  • OS: macOS 15.6.1
  • Cloud/Infrastructure: /
  • Other libs/tools (if relevant): torch 2.9.1

3. What happened vs. what you expected:

  • Expected: Population Based Training should clone high-performing trials into low-performing trials. This behavior should be independent of logging.
  • Actual: When logging is set to logging_level='error', no cloning is performed. Trials just keep running with their initial hyperparameters.

Population Based Training (PBT) in Ray should clone high-performing trials into low-performing ones regardless of logging settings. The behavior you observed, where setting logging_level='error' prevents cloning, is neither documented nor expected. The PBT scheduler's core exploit-and-explore logic is independent of the logging configuration: logging produces informational output only and does not gate trial cloning or mutation (PopulationBasedTraining docs, pbt.py source).
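To make the expected behavior concrete, here is a minimal stdlib-only sketch of PBT's exploit-and-explore step. This is not Ray's actual implementation; the function name, trial representation, and quantile/perturbation values are illustrative. The point is that the cloning decision depends only on reported scores, while logging is purely informational:

```python
import logging
import random

logger = logging.getLogger("pbt_sketch")

def exploit_and_explore(trials, quantile=0.25, perturb=1.2):
    """Clone top-quantile trials into bottom-quantile ones.

    `trials` is a list of dicts with 'score' and 'lr' keys (a stand-in
    for real trial state).  Logging is informational only; the cloning
    decision never consults the logger's level.
    """
    ranked = sorted(trials, key=lambda t: t["score"], reverse=True)
    n = max(1, int(len(ranked) * quantile))
    top, bottom = ranked[:n], ranked[-n:]
    for loser in bottom:
        winner = random.choice(top)
        # Exploit: copy the winner's state; explore: perturb the hyperparam.
        loser["score"] = winner["score"]
        loser["lr"] = winner["lr"] * random.choice([1 / perturb, perturb])
        logger.info("cloned winner (lr=%s) into loser", winner["lr"])
    return trials
```

Even with the logger silenced (e.g. `logger.setLevel(logging.ERROR)`), the bottom trials are still overwritten with the winners' state, which is what one would expect from Ray's scheduler as well.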

If PBT is not cloning when logging is set to 'error', this is likely a bug or an unintended side effect, not intended behavior. Nothing in the official documentation or code suggests that the logging level should affect PBT's core functionality. Would you like a step-by-step breakdown of the relevant code paths or suggestions for debugging this issue?
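One low-effort debugging step is to raise the verbosity of the scheduler's own logger while leaving the rest of the process quiet, so you can see whether exploit decisions are being made at all. The logger name below is an assumption based on the module path of pbt.py; verify it against your installed Ray version:

```python
import logging

# Hypothetical debugging aid: enable DEBUG output only for the PBT
# scheduler module (logger name assumed from the module path
# ray.tune.schedulers.pbt; check it in your Ray installation).
logging.getLogger("ray.tune.schedulers.pbt").setLevel(logging.DEBUG)

# A handler is still needed if the root logger is silenced, e.g.:
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)
logging.getLogger("ray.tune.schedulers.pbt").addHandler(handler)
```

If cloning messages appear with this enabled but trials still keep their initial hyperparameters when it is off, that would narrow the bug down to a logging-dependent code path.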
