Ray actors not sending logs to main process

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi folks,
I am building a custom gym env for my use case and using Ray for distributed training. However, I am not able to get the logs from Python's logging module when Ray is initialized in distributed mode, i.e., with plain ray.init(). Here is an example of the issue I am facing.

The code I am using is in a GitHub gist here: simplecorridor.py · GitHub
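
For reference, the logging calls in the environment look roughly like this (a minimal sketch reconstructed from the log output; the exact code is in the gist above, and the class/attribute names here are approximations):

```python
import logging

import gym
import numpy as np
from gym.spaces import Box, Discrete

# Resolves to "__main__" because the env is defined in the driver script.
logger = logging.getLogger(__name__)


class SimpleCorridor(gym.Env):
    """Toy corridor env that logs from __init__, reset, and step."""

    def __init__(self, config):
        self.end_pos = config.get("corridor_length", 20)
        self.cur_pos = 0.0
        self.action_space = Discrete(2)
        self.observation_space = Box(0.0, self.end_pos, shape=(1,), dtype=np.float32)
        logger.info("The environment initialization is done!")

    def reset(self):
        self.cur_pos = 0.0
        logger.info("The reset function is executed")
        return np.array([self.cur_pos], dtype=np.float32)

    def step(self, action):
        # Move left (0) or right (1) along the corridor.
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1.0
        elif action == 1:
            self.cur_pos += 1.0
        done = self.cur_pos >= self.end_pos
        logger.info("The step function is executed")
        return np.array([self.cur_pos], dtype=np.float32), 1.0 if done else -0.1, done, {}
```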

  1. Case 1: When Ray is initialized with ray.init(local_mode=True), the Python logger works inside the environment, producing the following output:

INFO:__main__: The environment initialization is doneee!!!!!!!!! (2022-07-21 10:30:30; simplecorridor.py:46)
2022-07-21 10:30:36,042	INFO services.py:1476 -- View the Ray dashboard at http://127.0.0.1:8266
2022-07-21 10:30:38,368	INFO trainer.py:2333 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
2022-07-21 10:30:38,491	WARNING worker.py:526 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
:actor_name:RolloutWorker
:actor_name:RolloutWorker
2022-07-21 10:30:38,518	WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or `soft_horizon`. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
INFO:__main__: The reset function is executed (2022-07-21 10:30:38; simplecorridor.py:52)
INFO:__main__: The step function is executed (2022-07-21 10:30:38; simplecorridor.py:71)
2022-07-21 10:30:38,767	INFO tf_policy.py:171 -- TFPolicy (worker=1) running on CPU.
2022-07-21 10:30:38,909	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `vf_preds` to view-reqs.
2022-07-21 10:30:38,910	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
2022-07-21 10:30:38,912	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_logp` to view-reqs.
2022-07-21 10:30:38,913	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_prob` to view-reqs.
2022-07-21 10:30:38,913	INFO dynamic_tf_policy.py:736 -- Testing `postprocess_trajectory` w/ dummy batch.
:actor_name:RolloutWorker
2022-07-21 10:30:39,869	WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or `soft_horizon`. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
**INFO:__main__: The reset function is executed (2022-07-21 10:30:39; simplecorridor.py:52)**
**INFO:__main__: The step function is executed (2022-07-21 10:30:39; simplecorridor.py:71)**

The logs from the environment are printed when a trainer instance is initialized in local mode.
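
For context, the driver-side setup is roughly the following (a sketch, not the exact gist code; I am assuming PPO and these config values purely for illustration — the only change between the two cases is the local_mode flag passed to ray.init):

```python
import logging

import ray
from ray.rllib.agents.ppo import PPOTrainer

# Configure the root logger on the driver; the env logs via
# logging.getLogger(__name__), so its records go through these handlers
# whenever the env runs in the driver process.
logging.basicConfig(
    level=logging.INFO,
    format="%(levelname)s:%(name)s: %(message)s (%(asctime)s; %(filename)s:%(lineno)d)",
    datefmt="%Y-%m-%d %H:%M:%S",
)

# Case 1: everything, including the RolloutWorkers, runs inside the driver
# process, so the env's log records reach the driver's handlers directly.
ray.init(local_mode=True)
# Case 2: the RolloutWorkers are separate actor processes.
# ray.init()

trainer = PPOTrainer(
    env=SimpleCorridor,
    config={"env_config": {"corridor_length": 20}, "framework": "tf", "num_workers": 4},
)
trainer.train()
```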

  2. Case 2: When Ray is initialized without local_mode (i.e., plain ray.init()), the logs are as follows:

2022-07-21 10:33:49,515	INFO services.py:1476 -- View the Ray dashboard at http://127.0.0.1:8266
2022-07-21 10:33:52,544	INFO trainer.py:2333 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
(pid=16484) 
(pid=16489) 
(pid=16486) 
(pid=16771)   np_resource = np.dtype([("resource", np.ubyte, 1)])
(RolloutWorker pid=16764) 2022-07-21 10:33:55,009	WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or `soft_horizon`. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
(RolloutWorker pid=16765) 2022-07-21 10:33:54,994	WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or `soft_horizon`. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
(RolloutWorker pid=16764) 2022-07-21 10:33:55,137	INFO tf_policy.py:171 -- TFPolicy (worker=1) running on CPU.
(RolloutWorker pid=16764) 2022-07-21 10:33:55,170	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=16764) 2022-07-21 10:33:55,170	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=16764) 2022-07-21 10:33:55,171	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=16764) 2022-07-21 10:33:55,172	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=16764) 2022-07-21 10:33:55,172	INFO dynamic_tf_policy.py:736 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=16765) 2022-07-21 10:33:55,138	INFO tf_policy.py:171 -- TFPolicy (worker=2) running on CPU.
(RolloutWorker pid=16765) 2022-07-21 10:33:55,187	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=16765) 2022-07-21 10:33:55,188	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=16765) 2022-07-21 10:33:55,189	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=16765) 2022-07-21 10:33:55,191	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=16765) 2022-07-21 10:33:55,191	INFO dynamic_tf_policy.py:736 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=16766) 2022-07-21 10:33:55,102	WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or `soft_horizon`. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
(RolloutWorker pid=16765) Instructions for updating:
(RolloutWorker pid=16765) Use tf.where in 2.0, which has the same broadcast rule as np.where
(RolloutWorker pid=16766) 2022-07-21 10:33:55,267	INFO tf_policy.py:171 -- TFPolicy (worker=3) running on CPU.
(RolloutWorker pid=16766) 2022-07-21 10:33:55,306	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=16766) 2022-07-21 10:33:55,307	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=16766) 2022-07-21 10:33:55,308	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=16766) 2022-07-21 10:33:55,308	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=16766) 2022-07-21 10:33:55,308	INFO dynamic_tf_policy.py:736 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=16766) Instructions for updating:
(RolloutWorker pid=16766) Use tf.where in 2.0, which has the same broadcast rule as np.where
(RolloutWorker pid=16771) 2022-07-21 10:33:55,384	WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or `soft_horizon`. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
(RolloutWorker pid=16771) 2022-07-21 10:33:55,536	INFO tf_policy.py:171 -- TFPolicy (worker=4) running on CPU.
(RolloutWorker pid=16771) 2022-07-21 10:33:55,568	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=16771) 2022-07-21 10:33:55,569	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=16771) 2022-07-21 10:33:55,569	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=16771) 2022-07-21 10:33:55,570	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=16771) 2022-07-21 10:33:55,570	INFO dynamic_tf_policy.py:736 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=16771) Instructions for updating:
(RolloutWorker pid=16771) Use tf.where in 2.0, which has the same broadcast rule as np.where
2022-07-21 10:33:55,992	INFO worker_set.py:163 -- Inferred observation/action spaces from remote worker (local worker has no env): {'default_policy': (Box([0.], [20.], (1,), float32), Discrete(2)), '__env__': (Box([0.], [20.], (1,), float32), Discrete(2))}
2022-07-21 10:33:56,266	INFO tf_policy.py:171 -- TFPolicy (worker=local) running on CPU.
2022-07-21 10:33:56,315	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `vf_preds` to view-reqs.
2022-07-21 10:33:56,316	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
2022-07-21 10:33:56,317	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_logp` to view-reqs.
2022-07-21 10:33:56,318	INFO dynamic_tf_policy.py:727 -- Adding extra-action-fetch `action_prob` to view-reqs.
2022-07-21 10:33:56,319	INFO dynamic_tf_policy.py:736 -- Testing `postprocess_trajectory` w/ dummy batch.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where (2022-07-21 10:33:56; deprecation.py:323)
2022-07-21 10:33:57,322	INFO rollout_worker.py:1793 -- Built policy map: {}
2022-07-21 10:33:57,322	INFO rollout_worker.py:1794 -- Built preprocessor map: {'default_policy': <ray.rllib.models.preprocessors.NoPreprocessor object at 0x7f1f7c542410>}
2022-07-21 10:33:57,323	INFO rollout_worker.py:670 -- Built filter map: {'default_policy': <ray.rllib.utils.filter.NoFilter object at 0x7f1f7c542390>}
2022-07-21 10:33:57,340	WARNING util.py:65 -- Install gputil for GPU system monitoring.

In this case, none of the logs from the custom gym environment appear.

Can someone please help me figure out how to get the logs from the custom environment in the Ray workers, the same as in Case 1 (the bold lines at the end of the Case 1 output), when Ray is not running in local mode?

Thanks.

Update: I see that if I change the logging calls in the environment from logger.info to either logger.warning or logger.critical, the logs from the environment do appear in the console and the log file.
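
Concretely, the only change is the level of the logging calls inside the environment, e.g. (hypothetical snippet just to illustrate the change):

```python
# Does NOT show up on the driver console when running with plain ray.init():
logger.info("The reset function is executed")

# Does show up in the console and the log file:
logger.warning("The reset function is executed")
```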

Is there a reason why this is the case, or is this a bug?