AttributeError when using a custom gymnasium environment

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

I am learning how to use Ray, and the book I am following was written for an older version of Ray. I am currently running into an issue with RLlib that seems to stem from using a custom environment. The code errors out with AttributeError: 'NoneType' object has no attribute 'build_encoder'. It looks like this happens while building a catalog, but at this stage of my learning I don't really know what that is. Any help would be appreciated. Here is the code I am using:

import gymnasium as gym
from gymnasium.spaces import Discrete
import os
from ray.tune.logger import pretty_print
from ray.rllib.algorithms.dqn import DQNConfig
from ray.tune.registry import register_env

class MyEnvironment(gym.Env):

    seeker, goal = (0, 0), (4, 4)
    info = {'seeker': seeker, 'goal': goal}

    def __init__(self, config=None):
        self.action_space = Discrete(4)
        self.observation_space = Discrete(5*5)

    def reset(self, seed=None, options=None):
        """Reset seeker and goal positions, return observations."""
        self.seeker = (0, 0)
        self.goal = (4, 4)

        return self.get_observation(), {}

    def get_observation(self):
        """Encode the seeker position as integer"""
        return 5 * self.seeker[0] + self.seeker[1]

    def get_reward(self):
        """Reward finding the goal"""
        return 1 if self.seeker == self.goal else 0

    def is_done(self):
        """We're done if we found the goal"""
        return self.seeker == self.goal

    def step(self, action):
        """Take a step in a direction and return all available information."""
        if action == 0:  # move down
            self.seeker = (min(self.seeker[0] + 1, 4), self.seeker[1])
        elif action == 1:  # move left
            self.seeker = (self.seeker[0], max(self.seeker[1] - 1, 0))
        elif action == 2:  # move up
            self.seeker = (max(self.seeker[0] - 1, 0), self.seeker[1])
        elif action == 3:  # move right
            self.seeker = (self.seeker[0], min(self.seeker[1] + 1, 4))
        else:
            raise ValueError("Invalid action")

        return self.get_observation(), self.get_reward(), self.is_done(), False, self.info

    def render(self, *args, **kwargs):
        """Render the environment, e.g. by printing its representation."""
        os.system('cls' if os.name == 'nt' else 'clear')
        try:
            from IPython.display import clear_output
            clear_output(wait=True)
        except Exception:
            pass
        grid = [['| ' for _ in range(5)] + ["|\n"] for _ in range(5)]
        grid[self.goal[0]][self.goal[1]] = '|G'
        grid[self.seeker[0]][self.seeker[1]] = '|S'
        print(''.join([''.join(grid_row) for grid_row in grid]))

config = (
    DQNConfig()
    .environment(MyEnvironment, env_config={})
    .env_runners(
        num_env_runners=2,
        create_env_on_local_worker=True,
    )
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
)

algo = config.build_algo()
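
In case it is useful, here is a rough sketch of one workaround I was planning to try (untested, so the subclass name and the one-hot idea are only my guesses): replace the Discrete(25) observation space with a one-hot Box, in case the default encoder simply cannot be built for Discrete observations.

import numpy as np
from gymnasium.spaces import Box

class MyOneHotEnvironment(MyEnvironment):
    """Same maze, but the observation is a one-hot vector instead of a Discrete int."""

    def __init__(self, config=None):
        super().__init__(config)
        # Guessed workaround: a flat Box observation space in case the default
        # encoder cannot be built for Discrete observations.
        self.observation_space = Box(low=0.0, high=1.0, shape=(25,), dtype=np.float32)

    def get_observation(self):
        """Encode the seeker position as a one-hot vector of length 25."""
        one_hot = np.zeros(25, dtype=np.float32)
        one_hot[5 * self.seeker[0] + self.seeker[1]] = 1.0
        return one_hot

If that worked, I would pass MyOneHotEnvironment to .environment(...) instead of MyEnvironment.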

@Dekermanjian,

Would it be easier to downgrade to the same version of Ray the book uses?

Thank you for the quick response, @mannyv.

Yes, that would certainly be an option, and it would be the simplest one. But then I would always have to use a downgraded version of Ray, because I don't know how to work around the error on a current version.

@Dekermanjian,

I guess it sort of depends on your learning objectives. If you want to learn about reinforcement learning in a hands-on way and tinker with how the pieces fit together, then it really doesn't matter which version you use.

On the other hand, the RLlib implementation and core abstractions have changed drastically in the past two years, and that migration is still only partway done. You could expect to spend a lot of time figuring out how the new pieces map onto the old ones and how to convert everything. There is a lot of valuable learning and skill-building in doing that, but it is not really about reinforcement learning.
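
For example, if the goal is just to follow the book on a current release, it may be possible to stay on the old API stack explicitly rather than enabling the new one. Here is a rough sketch that reuses the api_stack() call from your snippet, with both flags flipped to False (I have not verified this against your exact Ray version):

from ray.rllib.algorithms.dqn import DQNConfig

config = (
    DQNConfig()
    .environment(MyEnvironment, env_config={})
    .env_runners(
        num_env_runners=2,
        create_env_on_local_worker=True,
    )
    # Explicitly stay on the old API stack instead of opting into the new one.
    .api_stack(
        enable_rl_module_and_learner=False,
        enable_env_runner_and_connector_v2=False,
    )
)

algo = config.build_algo()

That would keep the code paths closer to what the book assumes without downgrading Ray itself, though it is not a long-term answer since the old stack is being phased out.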

Thank you, @mannyv, for that information. The book I am going through is about how to scale workloads with Ray. It walks through the different components of Ray, discusses how they work, and provides some nice, simple examples. The focus of the book is not reinforcement learning but rather how to use Ray to scale up your workloads, which is why I am trying to go through the book with a newer version of Ray.

It seems like, with the changes that have been made to RLlib, the issue I am facing is not an easy one to work around. Is that correct?