How to Configure Ray to Use Standard Python Logger for Multiprocessing?

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I am trying to use Ray to improve the speed of a process and I want the log messages to be passed to the standard Python logger. This way, the application can handle formatting, filtering, and saving the log messages. However, when I use Ray, the log messages are not formatted according to my logger configuration and are not passed back to the root logger. I tried setting log_to_driver=True and configure_logging=True in ray.init() , but it didn’t solve the problem. How can I configure Ray to use the standard Python logger for multiprocessing?"

Here is an example that should demonstrate the issue:

from ray.util.multiprocessing import Pool
import pathlib
import logging
import json

def setup_logging(config_file: pathlib.Path):
    with open(config_file) as f_in:
        config = json.load(f_in)
    logging.config.dictConfig(config)

logger = logging.getLogger(__name__)
config_file = pathlib.Path(__file__).parent / "log_setup/config_logging.json"
setup_logging(config_file=config_file)

def f(index):
    logger.warning(f"index: {index}")
    return (index, "model")

if __name__ == "__main__":
    logger.warning("Starting")
    pool = Pool(1)
    results = pool.map(f, range(10))
    print(list(results))

where I have the config of the logger as:

{
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
      "detailed": {
        "format": "[%(levelname)s|%(name)s|%(module)s|L%(lineno)d] %(asctime)s: %(message)s",
        "datefmt": "%Y-%m-%dT%H:%M:%S%z"
      }
    },
    "handlers": {
      "stdout": {
        "class": "logging.StreamHandler",
        "level": "INFO",
        "formatter": "detailed"
      }
    },
    "loggers": {
      "root": {
        "level": "DEBUG",
        "handlers": [
          "stdout"
        ]
      }
    }
  }

If I just use the python map, I would get the following printed:

[WARNING|__main__|ray_trial|L28] 2024-03-27T15:14:21+0100: Starting
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 0
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 1
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 2
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 3
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 4
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 5
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 6
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 7
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 8
[WARNING|__main__|ray_trial|L22] 2024-03-27T15:14:21+0100: index: 9

But when I use Ray I get:

2024-03-27 14:54:55,064 INFO worker.py:1743 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
(PoolActor pid=43261) index: 0
(PoolActor pid=43261) index: 1
(PoolActor pid=43261) index: 2
(PoolActor pid=43261) index: 3
(PoolActor pid=43261) index: 4
(PoolActor pid=43261) index: 5
(PoolActor pid=43261) index: 6
(PoolActor pid=43261) index: 7
(PoolActor pid=43261) index: 8
(PoolActor pid=43261) index: 9