Hi!
I’m working with a server-client configuration in RLlib to run a simulator as an ExternalEnv.
The server side is running okay with a DQN algorithm.
On the client side I programmed a loop of episodes which calls the start_episode(), get_action(), log_returns() and end_episode() methods of the PolicyClient class, so that the server learns and finds the optimal policy.
Communication between server and client works well, but I cannot register the end of an episode with the end_episode() call. It executes, but it doesn’t make any difference, and in the logged data I always see zero episodes (when there have actually been many).
Has anybody had the same problem?
If someone needs more information, I can share my scripts. Thanks!
I attached two images, one showing the total episodes and the other the mean_td_error.
How is your env set up? Are the rewards reported to TensorBoard normally? When running the CartPole client/server examples everything works fine for me, but I’m having trouble reporting rewards from my multi-agent environment. I’m not using an external env though, so I’d be interested to see your setup.
Hi @Blubberblub, I have no experience with multi-agent environments, but I am interested in this topic for my experiments.
Perhaps I made a mistake in saying that I programmed an ExternalEnv, because what I have really done is configure the server and the client, where the client hosts the environment. I was probably confused because the difference is not very clear in the documentation (I still think they may be the same, since configuring the client and configuring an ExternalEnv are very similar).
With an ExternalEnv you don’t need to define server and client configurations. Instead, the environment’s run() method is called, and inside it you request actions and log rewards (start_episode, get_action, log_returns, end_episode) so the Trainer can learn.
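A minimal sketch of what I mean (the spaces and the inner simulator below are placeholders, not my actual code):

```python
import gym
from ray.rllib.env.external_env import ExternalEnv

class MySimulatorEnv(ExternalEnv):
    def __init__(self):
        super().__init__(
            action_space=gym.spaces.Discrete(2),                 # placeholder
            observation_space=gym.spaces.Box(-1.0, 1.0, (4,)),   # placeholder
        )
        self.sim = gym.make("CartPole-v0")  # stand-in for the real simulator

    def run(self):
        # RLlib calls run() in its own thread; it drives the episode loop.
        while True:
            obs = self.sim.reset()
            episode_id = self.start_episode()
            done = False
            while not done:
                action = self.get_action(episode_id, obs)
                obs, reward, done, info = self.sim.step(action)
                self.log_returns(episode_id, reward)
            self.end_episode(episode_id, obs)
```

The Trainer then uses this class directly as its env (e.g. registering it with tune.register_env), with no PolicyServerInput or PolicyClient involved.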
You can see my repository with the client configuration (the ExternalEnv) here:
Best!