Tensorboard events missing from rock-paper-scissors example?

Hello,

I am trying to figure out why my own code is not producing TensorBoard event files. Since I am stuck debugging it, I tried one of the examples.

I’m running the rock-paper-scissors multi-agent example from the documentation. I expected TensorBoard event files to be written, but they don’t exist even though the usual progress.csv and result.json do:

➜  PG ls
PG_RockPaperScissors_767ac_00000_0_2021-10-12_19-59-25 experiment_state-2021-10-12_19-59-25.json
basic-variant-state-2021-10-12_19-59-25.json
➜  PG ls PG_RockPaperScissors_767ac_00000_0_2021-10-12_19-59-25
params.json  params.pkl   progress.csv result.json

In rock_paper_scissors_multiagent.py, lines 172, 175 and 178 do generate TensorBoard event files, but line 169 does not. Above is the file listing for its output folder. Is that the correct behavior? Its progress.csv has 151 rows and timesteps_total reaches 60,000.

Runtime environment: macOS. Python 3.8.10. Tensorflow 2.6.0. Ray 1.5.2. Freshly installed in pyenv virtualenv.
RLlib was installed by:

pip install 'ray[default]'==1.5.2
pip install 'ray[tune]'==1.5.2
pip install 'ray[rllib]'==1.5.2

Edit: There are no *TUNE* environment variables set.
Edit 2: added the runtime log below.

I didn’t save the running log. Here is a new run (still no tensorboard event file):

2021-10-12 22:02:53,828	INFO services.py:1245 -- View the Ray dashboard at http://127.0.0.1:8265
== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 PENDING)


(pid=39584) 2021-10-12 22:02:59,963	INFO trainer.py:706 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
(pid=39584) 2021-10-12 22:02:59,963	INFO trainer.py:718 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=39584) 2021-10-12 22:03:00,892	WARNING util.py:55 -- Install gputil for GPU system monitoring.
== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 1.0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 RUNNING)


== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 1.0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects (0.0/1.0 CPU_group_cabc3d4b400ba4b0e56329cb32b99ad3, 0.0/1.0 CPU_group_0_cabc3d4b400ba4b0e56329cb32b99ad3)
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 RUNNING)


== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 1.0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects (0.0/1.0 CPU_group_cabc3d4b400ba4b0e56329cb32b99ad3, 0.0/1.0 CPU_group_0_cabc3d4b400ba4b0e56329cb32b99ad3)
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 RUNNING)


== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 1.0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects (0.0/1.0 CPU_group_cabc3d4b400ba4b0e56329cb32b99ad3, 0.0/1.0 CPU_group_0_cabc3d4b400ba4b0e56329cb32b99ad3)
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 RUNNING)


== Status ==
Memory usage on this node: 17.5/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects (0.0/1.0 CPU_group_cabc3d4b400ba4b0e56329cb32b99ad3, 0.0/1.0 CPU_group_0_cabc3d4b400ba4b0e56329cb32b99ad3)
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 TERMINATED)


2021-10-12 22:03:21,358	INFO tune.py:550 -- Total run time: 26.04 seconds (25.84 seconds for the tuning loop).
run_same_policy: ok.

Hi @RickLan,

Do you have tensorboardX installed?

Yes. The following code works:

from tensorboardX import SummaryWriter

# Default log dir is ./runs/<timestamp>
writer = SummaryWriter()

# Write a simple scalar curve: y = i^2 at step i
for i in range(10):
    writer.add_scalar('y', i**2, i)

writer.close()
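To double-check on the Tune side, here is a small stdlib-only helper (my own sketch, not from RLlib) that searches a result directory for TensorBoard event files, which are named `events.out.tfevents.*`:

```python
import glob
import os

def find_event_files(result_dir):
    """Recursively list TensorBoard event files under a Tune result dir."""
    pattern = os.path.join(result_dir, "**", "events.out.tfevents.*")
    return sorted(glob.glob(pattern, recursive=True))

# e.g. find_event_files(os.path.expanduser("~/ray_results/PG"))
# On the broken run above this returns [], since no event files were written.
```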

Upgrading to Ray 1.6.0 solves the problem. Upgrading to Ray 1.7.0 also works.
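As a side note (my own addition, hedged): if upgrading were not an option, one could also try requesting the TensorBoard logger explicitly instead of relying on Tune's defaults. The `TBXLoggerCallback` import path below is my assumption for Ray >= 1.6 and is guarded so the snippet degrades gracefully when Ray is not installed:

```python
# Hedged sketch: explicitly enable TensorBoard event-file logging via
# Tune's callback API (import path assumed for Ray >= 1.6).
try:
    from ray.tune.logger import TBXLoggerCallback
    HAVE_RAY = True
except ImportError:
    HAVE_RAY = False  # Ray not installed; snippet is illustrative only

def tune_logging_kwargs():
    """Extra kwargs for tune.run() that force the TensorBoard logger."""
    if not HAVE_RAY:
        return {}
    return {"callbacks": [TBXLoggerCallback()]}

# Usage sketch: tune.run("PG", config=config, **tune_logging_kwargs())
```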