Model isn't changing through config

agent = ppo.PPOTrainer(config)
policy = agent.get_policy()
print(policy.model) # Prints the model summary

where config looks like the following:

  "batch_mode": "truncate_episodes",
  "clip_param": 0.3,
  "entropy_coeff": 0.0,
  "entropy_coeff_schedule": null,
  "env": "CityFlows",
  "env_config": {
    "config_path": "examples/1x1/config.json",
    "reward_func": "delay_from_opt",
    "steps_per_episode": 1000
  },
  "evaluation_interval": 3,
  "evaluation_num_episodes": 20,
  "framework": "torch",
  "grad_clip": null,
  "kl_coeff": 0.2,
  "kl_target": 0.01,
  "lambda": 1.0,
  "lr": 5e-05,
  "lr_schedule": null,
  "model": {
    "_disable_action_flattening": false,
    "_disable_preprocessor_api": true,
    "_use_default_native_models": false,
    "conv_activation": "relu",
    "conv_filters": null,
    "custom_model": null,
    "custom_model_config": {}
  },
  "num_sgd_iter": 30,
  "rollout_fragment_length": 200,
  "seed": 123,
  "sgd_minibatch_size": 128,
  "shuffle_sequences": true,
  "train_batch_size": 4000,
  "use_critic": true,
  "use_gae": true,
  "vf_clip_param": 10.0,
  "vf_loss_coeff": 1.0
With Actor: PPO
With Model: CNN
With ENV: CityFlows

(using torch as framework)
However, when I print the model I’m getting FullyConnectedNetwork.
Any idea why config["model"] does not affect the agent's model?

Hey @dev0guy, thanks for posting this question. It looks like RLlib produced a default FullyConnectedNetwork under the hood for you, probably based on your observation space (can you post your observation space here?).

Since you don’t have a custom_model defined, RLlib will only use a CNN model by default if your observation space is a 3D tensor, something like: Box(-1.0, 1.0, shape=(a, b, c)).
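To make that rule concrete, here is a rough sketch of the decision as described above. Note that expected_default_model is a hypothetical helper written for this post, not an RLlib function, and RLlib's real model selection has more cases than this. It only captures the two points from the reply: a custom_model takes priority, and otherwise a CNN is the default only for 3D observation shapes.

```python
from typing import Optional, Sequence


def expected_default_model(
    obs_shape: Sequence[int], custom_model: Optional[str] = None
) -> str:
    """Hypothetical helper (not part of RLlib).

    Approximates the rule from the reply above: a registered custom model
    wins; otherwise a CNN is picked only for 3D observations, and a
    FullyConnectedNetwork is used for everything else.
    """
    if custom_model is not None:
        return "custom model: " + custom_model
    if len(obs_shape) == 3:
        return "CNN"
    return "FullyConnectedNetwork"


if __name__ == "__main__":
    # Example shapes are made up for illustration only.
    for shape in [(84, 84, 3), (12,)]:
        print(shape, "->", expected_default_model(shape))
```

If the environment returns a flat vector (a 1D shape), this would explain the FullyConnectedNetwork in the printed summary. Printing policy.observation_space should show which case applies to CityFlows.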
