How severely does this issue affect your experience of using Ray?
Medium: It makes completing my task significantly harder, but I can work around it.
I am using Ray 2.2.0 with a custom multi-agent environment. After reset() is called, the action dictionary that the policy network passes to step() is empty.
I am leaving a working example below. It is a very simple environment trained with Tune and AIR. I might be doing something wrong, but I don't know where.
I added a print statement to the environment's step() method to report whether action_dict is empty, as follows:
def step(self, action_dict):
    self.t += 1
    if not action_dict:
        print("EMPTY ACTION DICT!!!")
        print('self.t =', self.t)
The preferred format for action- and observation space is a mapping from agent
ids to their individual spaces. If that is not provided, the respective methods'
observation_space_contains(), action_space_contains(),
action_space_sample() and observation_space_sample() have to be overwritten.
In your example, you should either define the observation and action spaces as Dict() spaces that map each agent id to that agent's individual space, or override the methods listed above.
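For the first option, here is a minimal sketch of what the spaces could look like. The agent ids ("agent_0", "agent_1") and the per-agent Box/Discrete spaces are made up for illustration, since I don't know what your environment actually uses. The import falls back to gymnasium only so the snippet also runs on newer setups; with Ray 2.2.0 I would expect the classic gym package to be the one in use.

```python
# Sketch only: agent ids and per-agent spaces below are hypothetical placeholders.
try:
    from gym import spaces  # classic gym (assumed to be what Ray 2.2.0 uses)
except ImportError:
    from gymnasium import spaces  # fallback for newer installations

AGENT_IDS = ["agent_0", "agent_1"]  # hypothetical agent ids

# One sub-space per agent id, wrapped in a Dict space.
observation_space = spaces.Dict(
    {agent_id: spaces.Box(low=-1.0, high=1.0, shape=(2,)) for agent_id in AGENT_IDS}
)
action_space = spaces.Dict(
    {agent_id: spaces.Discrete(2) for agent_id in AGENT_IDS}
)

# A sampled action is itself a mapping from agent id to that agent's action.
sampled_action = action_space.sample()
print(sorted(sampled_action.keys()))
```

In the environment itself, these would be assigned to self.observation_space and self.action_space in __init__, in place of the single shared spaces.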