Using a CNN to train on a gridworld environment

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I’m a newbie to RLlib. I’ve developed a few RLlib environments before and successfully used PPO to train agents on them.

Now I’ve developed a GridWorld-style environment where agents roam around a 2D grid and conquer territory (basically, they mark some cells as belonging to them).

I represent the observation as a 5-dimensional box of booleans: two dimensions for the width and height of the grid, one dimension for “am I observing myself or other agents?”, one dimension for “which of the other agents am I observing?”, and one dimension for “am I observing an agent or the territory that an agent conquered?”
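
To make the shape concrete, here is a minimal NumPy sketch of one observation. The grid size, the number of other agents, the index meanings, and the helper names are all placeholders for illustration, not my actual code. The second helper folds the three non-spatial axes into a single channel axis, which I assume is the kind of (width, height, channels) layout a CNN would expect.

```python
import numpy as np

# Placeholder sizes, chosen only for illustration.
WIDTH, HEIGHT = 10, 10
N_OTHER_AGENTS = 3

# Axes: (x, y, self-or-other, which other agent, agent-or-territory)
OBS_SHAPE = (WIDTH, HEIGHT, 2, N_OTHER_AGENTS, 2)


def empty_observation():
    """Return an all-False observation with the 5-dimensional shape."""
    return np.zeros(OBS_SHAPE, dtype=bool)


def to_image(obs):
    """Fold the three trailing axes into one channel axis: (W, H, C)."""
    return obs.reshape(obs.shape[0], obs.shape[1], -1)


obs = empty_observation()
# Assumed index meaning: "I myself am standing on cell (2, 5)".
obs[2, 5, 0, 0, 0] = True

image = to_image(obs)
print(obs.shape)    # 5-dimensional boolean box
print(image.shape)  # same data viewed as width x height x channels
```

With these placeholder sizes the folded version has 2 * 3 * 2 = 12 channels per cell.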

I want the agents to train on this environment using a CNN. I’ve never used a CNN with RLlib before, so I’m looking for the simplest, most straightforward way to do that. How can I do it?

Also: I was told that the CNN’s input shape needs to match the shape of my observation space. Is that true?

Thanks for your help,
Ram Rachum.