ValueError: `RLModule(config=[RLModuleConfig])` has been deprecated - New API Stack

High: Completely blocks me.

Hello,
While training my custom Gym environment with RLlib (new API stack) I get this error and haven't been able to solve it so far :slightly_frowning_face: I have defined my observations as a Dictionary. It works well with the old API stack but not with the new one, which completely blocks me from moving forward. To get a better and simpler understanding, I started by training the example from the RLlib GitHub repo linked below:

Unfortunately I get the same error. For clarification, I define the config as below:

config = (
    PPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    .environment(
        env="CartPoleWithDictObservationSpace",
    )
    .training(
        lr=agent_params["lr"],  # Learning rate
        entropy_coeff=agent_params["entropy_coeff"],  # Encourage exploration with entropy regularization
    )
    .env_runners(num_env_runners=agent_params["num_env"])  # Number of parallel environments
    .framework("torch")
    .rl_module(
        rl_module_spec=RLModuleSpec(
            module_class=PPOTorchRLModule,
            inference_only=False,
            learner_only=False,
            observation_space=env_instance.observation_space,
            action_space=env_instance.action_space,
            model_config={"hidden_sizes": [128, 128]},  # Optional, passed to the RLModule
        )
    )
)

Environment:

  • Ray version: 2.43.0
  • Python version: 3.12.9

thanks

Hi Ali! Have you had a chance to look at the old → new API migration guide? New API stack migration guide — Ray 2.44.0

I took a look, and it seems like someone opened a similar issue on GitHub. Is it similar to the one you're experiencing? RLlib new API stack false deprecation warning / MultiRLModuleSpec · Issue #51630 · ray-project/ray · GitHub :thinking:


Hi Christina,
Firstly, thanks for your reply.
Yes, I've seen that guide many times. As you can see, I arranged the rl_module section according to the new API's conventions and the deprecation warning. I think the new API module cannot handle dictionary observations. I actually couldn't find any resources on training an agent with dict observations on the new API stack :slightly_frowning_face: . For various reasons I need to define my observations as a dictionary, and I'm eager to train with the new API stack. I would really appreciate any hints or resources that address the issue more directly.
Thanks

Hi. Have you tried to remove that

observation_space=env_instance.observation_space,
action_space=env_instance.action_space,

from the RLModuleSpec() definition? It seems those can be figured out automatically (RLlib infers them from the environment), so you shouldn't need to pass them in the spec.
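
For example, a sketch that keeps the rest of your config unchanged (class names and model_config carried over from your snippet):

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.algorithms.ppo.torch.ppo_torch_rl_module import PPOTorchRLModule
from ray.rllib.core.rl_module import RLModuleSpec

# Same spec as before, just without observation_space / action_space:
config = (
    PPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    .environment(env="CartPoleWithDictObservationSpace")
    .framework("torch")
    .rl_module(
        rl_module_spec=RLModuleSpec(
            module_class=PPOTorchRLModule,
            inference_only=False,
            learner_only=False,
            model_config={"hidden_sizes": [128, 128]},
        )
    )
)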

Hey @Ali_Zargarian , could you provide a minimal reproduction script, so we can debug this problem on our end?
Thanks!

Hi @sven1977,
Thanks for the rapid reply :slightly_smiling_face:
As I said, I took the environment from the Ray examples on GitHub, linked below:

and my training script is as follows:

from ray import train, tune
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.algorithms.ppo.ppo_catalog import PPOCatalog
from ray.rllib.algorithms.ppo.torch.ppo_torch_rl_module import PPOTorchRLModule
from ray.rllib.core.rl_module import RLModule, RLModuleSpec
from ray.tune.registry import register_env

from cartpole_with_dict_observation_space import CartPoleWithDictObservationSpace

def env_creator(config):
    return CartPoleWithDictObservationSpace()

# Register the custom environment
register_env("CartPoleWithDictObservationSpace", env_creator)


env_instance = CartPoleWithDictObservationSpace()
config = (
    PPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    .environment(
        env="CartPoleWithDictObservationSpace",
    )
    .training(
        lr=0.0001,  # Learning rate
        entropy_coeff=0.01,  # Encourage exploration with entropy regularization
    )
    .env_runners(num_env_runners=1)  # Number of parallel environments
    .framework("torch")

)
RLModule(
    rl_module_spec=RLModuleSpec(
        module_class=PPOTorchRLModule,
        inference_only=False,
        learner_only=False,
        observation_space=env_instance.observation_space,
        action_space=env_instance.action_space,
        model_config={"hidden_sizes": [128, 128]},  # Optional, passed to RLModule
        catalog_class=PPOCatalog,
    )
)

tuner = tune.Tuner(
    "PPO",
    param_space=config,
    run_config=tune.RunConfig(
        stop={"training_iteration": 5},  # Specify when to stop training
        checkpoint_config=train.CheckpointConfig(checkpoint_at_end=True),
    ),
)

# Run the tuner
results = tuner.fit()

# Get the best result based on a particular metric
best_result = results.get_best_result(
    metric="env_runners/episode_return_mean", mode="max"
)

Thanks

Unfortunately, it doesn't work. Also, according to the deprecation warning, those arguments should be there.

Hello @sven1977 ,
any update? :wink:

Hello @sven1977
Is there any update? I'm now working on another issue related to masking some actions, something like the example below:

It uses Dict observations again. The example states that it is only suitable for training with the old API stack. Since I don't want to use the old API stack, I've been blocked again!!

Thanks in advance

@Ali_Zargarian , this is indeed the old example. The example for the new API stack can be found here: ray/rllib/examples/rl_modules/action_masking_rl_module.py at master · ray-project/ray · GitHub
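
Roughly, that module gets plugged into the config like this. A sketch only; the import path for the module class is my assumption based on the example's folder, and "ActionMaskEnv" is a placeholder for whatever Dict-observation env you register:

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.core.rl_module import RLModuleSpec
# Assumed location of the example's module class; check the linked file if it moved:
from ray.rllib.examples.rl_modules.classes.action_masking_rlm import (
    ActionMaskingTorchRLModule,
)

config = (
    PPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    .environment(env="ActionMaskEnv")  # placeholder: your registered Dict-obs env
    .rl_module(
        rl_module_spec=RLModuleSpec(module_class=ActionMaskingTorchRLModule),
    )
)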

@sven1977
Ah, nice!
Could you please also point me to (or provide) an example of training (or tuning) with a dictionary observation space (DictObservationSpace) on the new API stack?

Thanks

@Ali_Zargarian , I don't understand what you mean exactly. Could you elaborate? The observation_space in action masking is always a dictionary. Do you mean using a dictionary observation space under the "observations" key?
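
I.e., something with roughly this shape (just a sketch with placeholder sizes):

import numpy as np
from gymnasium.spaces import Box, Dict

num_actions = 4  # placeholder
observation_space = Dict({
    # Flat 0/1 mask over the discrete actions:
    "action_mask": Box(0.0, 1.0, shape=(num_actions,), dtype=np.float32),
    # The actual observations; these could themselves be another (nested) Dict:
    "observations": Box(-1.0, 1.0, shape=(8,), dtype=np.float32),
})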

No, I don't mean action masking. I actually mean training an environment with a dictionary observation space on the new API stack, like the example below (see the earlier posts in this topic before today's):

some additional information from my side:

  • ray Version: 2.41.0
  • new version of RLlib API
  • tune.fit
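
To make it concrete, this is the kind of Dict observation space I mean; a toy sketch only, not the actual example I linked above:

import gymnasium as gym
import numpy as np
from gymnasium.spaces import Box, Dict, Discrete

class ToyDictObsEnv(gym.Env):
    """Hypothetical minimal env whose observations are a Dict."""

    def __init__(self, config=None):
        self.observation_space = Dict({
            "position": Box(-1.0, 1.0, shape=(2,), dtype=np.float32),
            "step_count": Discrete(100),
        })
        self.action_space = Discrete(2)
        self._t = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._t = 0
        return self._obs(), {}

    def step(self, action):
        self._t += 1
        terminated = self._t >= 10
        return self._obs(), 1.0, terminated, False, {}

    def _obs(self):
        return {
            "position": np.zeros(2, dtype=np.float32),
            "step_count": self._t,
        }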

Thanks