How severe does this issue affect your experience of using Ray?
- High: It blocks me to complete my task.
Hello, I’m currently using RLlib to manage a hierarchical system of agents. In this step I need to collect observations from an external source and use this data in training.
I’ve implemented a custom ExternalMultiAgentEnv, with all the necessary functions like run(), get_action(), start_episode(), end_episode() and log_returns().
In particular, I interact with the environment through an external script. In one call of the script I need to start the episode, get the action, and apply it; in the following call I can compute the reward, log the returns, and end the episode.
In the following scheme, each iteration of the for loop corresponds to one call of the external script. This shows what I need to implement.
    for i in range(episode_number):
        ... receive new data (needed to compute the reward)
        if i > 0:
            rewards = env.compute_cost(env._data)
            env.log_returns(env.episode_id, rewards, env._get_info())
            # end episode
            env.end_episode(env.episode_id, env._state)
        env.start_episode()
        env.get_action()
        ... apply the action outside the script
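To make the intended call order concrete, here is a minimal, self-contained sketch of a single call of the external script. It does not use RLlib at all: `StubEnv`, `compute_cost` and `external_call` are hypothetical names introduced only for illustration, and the stub merely records which methods are called, in which order. The id of the episode left open by the previous call is assumed to be carried over to the next call by the caller.

```python
import uuid


class StubEnv:
    """Hypothetical stand-in that mimics the method names used above.

    It is NOT the RLlib ExternalMultiAgentEnv; it only records calls so the
    call ordering can be checked without a running trainer.
    """

    def __init__(self):
        self.calls = []  # ordered log of (method_name, episode_id)

    def start_episode(self):
        episode_id = uuid.uuid4().hex
        self.calls.append(("start_episode", episode_id))
        return episode_id

    def get_action(self, episode_id, observation_dict):
        self.calls.append(("get_action", episode_id))
        # Dummy policy (assumption): every agent gets action 0.
        return {agent_id: 0 for agent_id in observation_dict}

    def log_returns(self, episode_id, reward_dict):
        self.calls.append(("log_returns", episode_id))

    def end_episode(self, episode_id, observation_dict):
        self.calls.append(("end_episode", episode_id))


def compute_cost(data):
    # Placeholder reward (assumption): negative of each agent's observed value.
    return {agent_id: -float(value) for agent_id, value in data.items()}


def external_call(env, pending_episode_id, new_data):
    """One call of the external script.

    Closes the episode opened by the previous call (if any), then opens a new
    one and returns its id together with the actions to apply.
    """
    if pending_episode_id is not None:
        env.log_returns(pending_episode_id, compute_cost(new_data))
        env.end_episode(pending_episode_id, new_data)
    episode_id = env.start_episode()
    actions = env.get_action(episode_id, new_data)
    return episode_id, actions


env = StubEnv()
episode_id = None  # no episode is open before the first call
for data in ({"agent_0": 1.0}, {"agent_0": 2.0}, {"agent_0": 3.0}):
    episode_id, actions = external_call(env, episode_id, data)

call_names = [name for name, _ in env.calls]
```

After three calls, the last episode is still open: its returns would only be logged by a fourth call. This is exactly the split between get_action and log_returns that I need to preserve.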
I tried to use trainer.train() as the documentation suggests, but this way I cannot separate the get_action from the log_returns across two successive calls. In fact, it requires a full iteration to be completed inside a single call.
I need to know how to implement this and still train the system of agents.