I am trying to figure out why I am not getting TensorBoard event files from my own code. I got stuck debugging it, so I tried one of the examples.
I'm running the rock-paper-scissors multi-agent example from the documentation. I expect TensorBoard event files to be produced, but they don't exist, even though the usual progress.csv and result.json do:
➜ PG ls
PG_RockPaperScissors_767ac_00000_0_2021-10-12_19-59-25 experiment_state-2021-10-12_19-59-25.json
basic-variant-state-2021-10-12_19-59-25.json
➜ PG ls PG_RockPaperScissors_767ac_00000_0_2021-10-12_19-59-25
params.json params.pkl progress.csv result.json
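To make sure I'm not just missing them in a subfolder, I also searched the trial directory for the usual event-file name pattern (events.out.tfevents.*, if I remember the naming correctly); nothing turns up:
find PG_RockPaperScissors_767ac_00000_0_2021-10-12_19-59-25 -name "events.out.tfevents.*"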
In rock_paper_scissors_multiagent.py, the calls on lines 172, 175, and 178 do generate TensorBoard event files, but the call on line 169 does not. The listing above shows the files in its output folder. Is that the correct behavior? Its progress.csv has 151 rows, and timesteps_total reaches 60,000.
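For context, line 169 is the run_same_policy() call (it matches the "run_same_policy: ok." line at the end of the log below). Paraphrased from the Ray 1.5.2 example source, so details may be slightly off, it boils down to a plain tune.run():
from ray import tune

def run_same_policy(args, stop):
    # RockPaperScissors is the env class defined earlier in the same example file.
    config = {
        "env": RockPaperScissors,
        "framework": args.framework,
    }
    # A plain tune.run() like this normally attaches the default loggers, so
    # JSON, CSV, and TensorBoard (TBX) output should all land in the trial folder.
    return tune.run("PG", config=config, stop=stop, verbose=1)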
Runtime environment: macOS, Python 3.8.10, TensorFlow 2.6.0, Ray 1.5.2, freshly installed in a pyenv virtualenv.
RLlib was installed by:
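(something like this; I didn't record the exact command)
pip install "ray[rllib]==1.5.2" tensorflow==2.6.0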
I didn't save the log of the original run. Here is a fresh run (still no TensorBoard event file):
2021-10-12 22:02:53,828 INFO services.py:1245 -- View the Ray dashboard at http://127.0.0.1:8265
== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 PENDING)
(pid=39584) 2021-10-12 22:02:59,963 INFO trainer.py:706 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
(pid=39584) 2021-10-12 22:02:59,963 INFO trainer.py:718 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=39584) 2021-10-12 22:03:00,892 WARNING util.py:55 -- Install gputil for GPU system monitoring.
== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 1.0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 RUNNING)
== Status ==
Memory usage on this node: 17.6/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 1.0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects (0.0/1.0 CPU_group_cabc3d4b400ba4b0e56329cb32b99ad3, 0.0/1.0 CPU_group_0_cabc3d4b400ba4b0e56329cb32b99ad3)
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 RUNNING)
[... same status block repeated twice more ...]
== Status ==
Memory usage on this node: 17.5/32.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/9.05 GiB heap, 0.0/4.52 GiB objects (0.0/1.0 CPU_group_cabc3d4b400ba4b0e56329cb32b99ad3, 0.0/1.0 CPU_group_0_cabc3d4b400ba4b0e56329cb32b99ad3)
Result logdir: /Users/rick.lan/ray_results/PG
Number of trials: 1/1 (1 TERMINATED)
2021-10-12 22:03:21,358 INFO tune.py:550 -- Total run time: 26.04 seconds (25.84 seconds for the tuning loop).
run_same_policy: ok.
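Is there something I need to pass to tune.run() to get the TensorBoard logger back? My guess, based on the Ray 1.5.2 API (where DEFAULT_LOGGERS is the (JsonLogger, CSVLogger, TBXLogger) tuple), would be to pass the loggers explicitly; the config/stop values below are just placeholders to make the snippet self-contained:
from ray import tune
from ray.tune.logger import DEFAULT_LOGGERS  # (JsonLogger, CSVLogger, TBXLogger) in Ray 1.5.2

# Placeholder config/stop; in the example these are built inside run_same_policy.
config = {"env": "CartPole-v0", "framework": "tf"}
stop = {"timesteps_total": 10000}

# Explicitly pass the default loggers, which include the TBX (TensorBoard) one,
# to rule out the logger being silently dropped somewhere.
tune.run("PG", config=config, stop=stop, loggers=DEFAULT_LOGGERS)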