[BUG] Serious logic problem in validate_config for R2D2

Shouldn't a function named validate_config operate passively on its input and simply report whether the configuration is valid? Currently, validate_config for R2D2 modifies the config it is given, so two successive calls on the same config crash on the second call.

Let's say you have a valid config and you call:
validate_config(config)
This alters the config, so a second call
validate_config(config)
crashes.

Minimal example:

import ray.rllib.agents.dqn as dqn
config = dqn.r2d2.DEFAULT_CONFIG.copy()
dqn.r2d2.validate_config(config)
dqn.r2d2.validate_config(config)  # This second call crashes
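
To make the failure mode concrete without depending on Ray, here is a minimal sketch. It assumes the validator rejects any non-default value of a field and then overwrites that same field in place. I have not confirmed that this is the exact mechanism inside R2D2; the toy validator, the wrapper, and the key names (`seq_len`, `burn_in`, `max_seq_len`) are hypothetical and are not RLlib's code.

```python
import copy


def mutating_validate(config):
    """Hypothetical validator: checks a field, then overwrites it in place."""
    if config["seq_len"] != -1:
        raise ValueError("seq_len is computed automatically; leave it at -1")
    # Side effect: the caller's dict is modified.
    config["seq_len"] = config["burn_in"] + config["model"]["max_seq_len"]


def validate_on_copy(validator, config):
    """Run `validator` on a deep copy so the caller's config stays untouched."""
    validator(copy.deepcopy(config))


config = {"seq_len": -1, "burn_in": 2, "model": {"max_seq_len": 20}}

# Passive use: repeated validation is safe because `config` never changes.
validate_on_copy(mutating_validate, config)
validate_on_copy(mutating_validate, config)
unchanged = config["seq_len"] == -1

# Direct use: the first call rewrites `config`, so the second call raises.
mutating_validate(config)
try:
    mutating_validate(config)
    second_call_failed = False
except ValueError:
    second_call_failed = True
```

Validating a copy throws away any derived values the validator writes, so this only illustrates what passive validation would look like. It is not a drop-in workaround when the caller relies on those derived values.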

Unfortunately, this means R2D2 currently cannot be used with the Tune API. For reasons I have not tracked down, Tune (or some other component) calls validate_config multiple times when you start Tune as follows:

from ray import tune

tune.run(
    "R2D2",
    config=config,
    ...
)