Empty action_dict coming from policy net

Username1 · May 1, 2023, 4:58pm

How severely does this issue affect your experience of using Ray?

Medium: It makes completing my task significantly harder, but I can work around it.

I am using Ray 2.2.0 with a custom multi-agent environment. After reset() is called, the action dictionary passed to step() from the policy network is empty.

I am leaving a working example below. It is a very simple environment trained with Tune and AIR. I might be doing something wrong, but I don’t know where.

I added a print statement to the environment's step() method that reports when action_dict is empty:

    def step(self, action_dict):
        self.t += 1  # count environment steps

        # Report when step() receives no actions at all
        if not action_dict:
            print("EMPTY ACTION DICT!!!")
            print("self.t =", self.t)

The code can be found here:

github.com/lcipolina/Ray_tutorials/blob/main/RLLIB_MARL_Empty_action_Ray_2_2_0.ipynb

Rohan138 · May 22, 2023, 11:35pm

From ray/multi_agent_env.py at master (ray-project/ray on GitHub):

  The preferred format for action- and observation space is a mapping from agent
  ids to their individual spaces. If that is not provided, the respective methods'
  observation_space_contains(), action_space_contains(),
  action_space_sample() and observation_space_sample() have to be overwritten.

In your example, either rewrite the observation and action spaces as Dict() spaces keyed by agent ID, or override the methods listed above.

Username1 · May 26, 2023, 12:16pm

Thank you @Rohan138. Dict spaces make no sense for my environment, so I’ll have to go with Tuples.

How do I implement the methods you mentioned: observation_space_contains(), action_space_contains(), action_space_sample() and observation_space_sample()?

Is there any example of how they should be set up?

Thanks!
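
For reference, here is a minimal sketch of what the four overrides could look like when the per-agent spaces are kept in tuples indexed by integer agent ID. It is an illustration under assumptions, not code from the notebook. `TinyDiscrete` is a hypothetical stand-in for a gym `Discrete` space so the snippet runs without gym or Ray installed, and `TupleSpaceEnvSketch` is a hypothetical name that does not inherit from `MultiAgentEnv`. In a real environment the four methods would be defined on the `MultiAgentEnv` subclass. The signatures (a dict argument for the `*_contains` methods, an optional `agent_ids` list for the `*_sample` methods) are my recollection of the base class, not something stated in this thread, so verify them against the installed Ray version.

```python
import random


class TinyDiscrete:
    """Hypothetical stand-in for a gym Discrete(n) space (keeps the sketch dependency-free)."""

    def __init__(self, n, seed=None):
        self.n = n
        self._rng = random.Random(seed)

    def contains(self, x):
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(x, int) and not isinstance(x, bool) and 0 <= x < self.n

    def sample(self):
        return self._rng.randrange(self.n)


class TupleSpaceEnvSketch:
    """Keeps per-agent spaces in tuples indexed by integer agent ID.

    Assumption: in real use these four methods would sit on the
    MultiAgentEnv subclass, with the signatures shown here.
    """

    def __init__(self, num_agents=2):
        self._agent_ids = set(range(num_agents))
        self.observation_space = tuple(
            TinyDiscrete(5, seed=i) for i in range(num_agents)
        )
        self.action_space = tuple(
            TinyDiscrete(3, seed=100 + i) for i in range(num_agents)
        )

    def _contains(self, spaces, x):
        # x is a dict {agent_id: value}. Every key must be a known agent
        # and every value must lie inside that agent's space.
        return isinstance(x, dict) and all(
            agent_id in self._agent_ids and spaces[agent_id].contains(value)
            for agent_id, value in x.items()
        )

    def _sample(self, spaces, agent_ids):
        # Sample for the requested agents, or for all agents if none are given.
        ids = self._agent_ids if agent_ids is None else agent_ids
        return {agent_id: spaces[agent_id].sample() for agent_id in ids}

    def observation_space_contains(self, x):
        return self._contains(self.observation_space, x)

    def action_space_contains(self, x):
        return self._contains(self.action_space, x)

    def observation_space_sample(self, agent_ids=None):
        return self._sample(self.observation_space, agent_ids)

    def action_space_sample(self, agent_ids=None):
        return self._sample(self.action_space, agent_ids)
```

The point of the design is that observations and actions still travel as dicts keyed by agent ID, while the spaces themselves stay in a tuple. Each override translates between the two by looking up `spaces[agent_id]`.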
