Restoring agent with Simplex action space

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi everyone,

I have an agent that interacts with an environment that has a Simplex action space.
I’ve successfully trained it and saved it to my storage.
When I try to restore the agent, I get the following error:

Traceback (most recent call last):
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/typer/main.py", line 326, in __call__
    raise e
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/typer/main.py", line 309, in __call__
    return get_command(self)(*args, **kwargs)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/typer/core.py", line 723, in main
    return _main(
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/typer/core.py", line 193, in _main
    rv = self.invoke(ctx)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/typer/main.py", line 692, in wrapper
    return callback(**use_params)
  File "/Users/ofircohen/Projects/carbon-release-optimization/src/evaluate/main.py", line 54, in evaluate_rl
    agent.from_checkpoint(checkpoint_dir)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 357, in from_checkpoint
    return Algorithm.from_state(state)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 387, in from_state
    new_algo.__setstate__(state)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 2942, in __setstate__
    self.workers.local_worker().set_state(state["worker"])
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1463, in set_state
    self.policy_map[pid].set_state(policy_state)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/policy/torch_policy.py", line 771, in set_state
    super().set_state(state)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/policy/policy.py", line 1041, in set_state
    policy_spec = PolicySpec.deserialize(state["policy_spec"])
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/policy/policy.py", line 168, in deserialize
    action_space=space_from_dict(spec["action_space"]),
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/utils/serialization.py", line 324, in space_from_dict
    space = gym_space_from_dict(d["space"])
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/utils/serialization.py", line 319, in gym_space_from_dict
    return space_map[space_type](d)
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/utils/serialization.py", line 289, in _simplex
    return Simplex(**__common(d))
  File "/Users/ofircohen/Library/Caches/pypoetry/virtualenvs/carbon-release-optimization-v1Tg65LV-py3.10/lib/python3.10/site-packages/ray/rllib/utils/spaces/simplex.py", line 32, in __init__
    concentration.shape == shape[:-1]
AssertionError: (12,) vs ()

During training I didn’t pass any concentration to the action space in the environment, so it uses the default (uniform) concentration, which is then saved in the agent’s state.
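To make the assertion message easier to read, here is a minimal sketch of the comparison shown in the last traceback frame (`concentration.shape == shape[:-1]`), written without Ray. The helper name `concentration_shape_ok` is hypothetical, and the space shape `(12,)` is an assumption inferred from the `(12,) vs ()` message rather than taken from the environment code.

```python
def concentration_shape_ok(space_shape, concentration_shape):
    """Mirror the check from the traceback: concentration.shape == shape[:-1]."""
    return tuple(concentration_shape) == tuple(space_shape)[:-1]


# Assumption: the action space has a single axis with 12 components.
space_shape = (12,)
# Left-hand side of the error message: the shape of the saved concentration.
saved_concentration_shape = (12,)

# Right-hand side of the error message: the space shape without its last axis.
print(space_shape[:-1])
# Whether the saved concentration would pass the check.
print(concentration_shape_ok(space_shape, saved_concentration_shape))
```

Under these assumptions, a concentration with one entry per action component cannot pass the check for a one-axis space, because the check expects the shape with the last axis removed.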

Does anyone know why this happens and how I can solve it?

Thanks in advance!

OK, so I found a workaround: I set the concentration to None before calling Algorithm.from_state(state).
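A minimal sketch of that workaround, for anyone hitting the same error. It assumes the serialized Simplex space is a dict tagged with `"space": "simplex"` and carrying a `"concentration"` entry. Those key names are assumptions based on the traceback, so check them against your own state. The helper `clear_simplex_concentration` is hypothetical and is not part of Ray. It walks the whole state, so it does not depend on the exact nesting.

```python
from typing import Any


def clear_simplex_concentration(node: Any) -> int:
    """Set "concentration" to None in every serialized Simplex space in `node`.

    Returns how many space dicts were changed. The key names "space" and
    "concentration" are assumptions about the serialized layout.
    """
    changed = 0
    if isinstance(node, dict):
        is_simplex = node.get("space") == "simplex"
        if is_simplex and node.get("concentration") is not None:
            node["concentration"] = None
            changed += 1
        # Recurse into nested dicts (only values are modified, never keys).
        for value in node.values():
            changed += clear_simplex_concentration(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            changed += clear_simplex_concentration(item)
    return changed


if __name__ == "__main__":
    # Mock fragment of a state dict; a real one has many more keys.
    mock_state = {
        "policy_spec": {
            "action_space": {
                "space": {
                    "space": "simplex",
                    "shape": [12],
                    "concentration": [1.0] * 12,
                }
            }
        }
    }
    print(clear_simplex_concentration(mock_state))
```

With a real checkpoint, the idea would be to run the helper on the state dict and then pass that dict to `Algorithm.from_state(state)`. How the state dict is obtained depends on the Ray version and is not shown here.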