Shouldn't a function named validate_config operate passively on the data and simply report whether the configuration is valid? The problem is that validate_config for R2D2 currently modifies the config it is validating, so two successive calls on the same config crash on the second call.
Let's say you have a valid config and you do:
validate_config(config)
This alters the config in place, so a second call
validate_config(config)
will crash.
Minimal example:
import ray.rllib.agents.dqn as dqn
config = dqn.r2d2.DEFAULT_CONFIG.copy()
dqn.r2d2.validate_config(config)
dqn.r2d2.validate_config(config)  # this second call crashes
Unfortunately, this means it is currently not possible to use R2D2 with the Tune API. For reasons unknown to me, Tune (or some other component) calls validate_config multiple times when you start Tune as follows:
tune.run(
"R2D2",
config=config,
...
)
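To illustrate the behaviour I would expect, here is a minimal, self-contained sketch that does not use RLlib at all. The config keys (burn_in, max_seq_len, derived_length) and both functions are hypothetical and only stand in for the pattern; this is not the actual R2D2 code. The first validator overwrites a sentinel value in place and therefore rejects its own output on the second call. The second one leaves its input untouched and returns a resolved copy, so it can be called any number of times.

```python
import copy


def mutating_validate(config):
    # Hypothetical validator: rejects a user-set value, then overwrites
    # the sentinel in place. The second call sees the overwritten value.
    if config["derived_length"] != -1:
        raise ValueError("derived_length is computed automatically")
    config["derived_length"] = config["burn_in"] + config["max_seq_len"]


def pure_validate(config):
    # Hypothetical non-mutating variant: performs the same check, but
    # writes the derived value into a deep copy and returns that copy.
    if config["derived_length"] != -1:
        raise ValueError("derived_length is computed automatically")
    resolved = copy.deepcopy(config)
    resolved["derived_length"] = config["burn_in"] + config["max_seq_len"]
    return resolved


config = {"burn_in": 20, "max_seq_len": 20, "derived_length": -1}

# Calling the pure validator twice is safe; the input is unchanged.
resolved = pure_validate(config)
resolved_again = pure_validate(config)

# Calling the mutating validator twice fails on the second call.
mutating_validate(config)
try:
    mutating_validate(config)
    second_call_failed = False
except ValueError:
    second_call_failed = True
```

With a validator of the second kind, it would not matter how often Tune (or anything else) validates the same config.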