MARWIL: Postprocessing of multi-agent data not implemented yet

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

The main page for MARWIL says that multi-agent is supported. However, when I feed it a MultiAgentBatch, _postprocess_if_needed raises the following exception:

    raise NotImplementedError(
        "Postprocessing of multi-agent data not implemented yet.")

My beta is obviously > 0 (beta = 0 is just BC), and with a positive beta, postprocessing is mandatory. Is there an easy way to fix this? For now, I have locally replaced the error with "return batch", and it seems to work fine. After all, my advantages are already computed in my JSON files, aren't they?
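For context on why beta matters here, below is a minimal sketch of the exponential advantage weighting, assuming the usual MARWIL formulation in which each action's log-likelihood is weighted by exp(beta * advantage). The helper name `marwil_weights` is hypothetical, and RLlib's own loss also rescales the advantages, which is left out here.

```python
import math

def marwil_weights(advantages, beta):
    """Return exp(beta * A) for each advantage A (hypothetical helper)."""
    return [math.exp(beta * a) for a in advantages]

advantages = [-1.0, 0.0, 2.0]

# beta = 0: every weight is 1, so the loss reduces to plain behavior cloning.
bc_weights = marwil_weights(advantages, beta=0.0)

# beta > 0: actions with a higher advantage get a larger weight,
# which is why the advantages have to be available in the batch.
weighted = marwil_weights(advantages, beta=1.0)
```

With beta = 0 the advantages have no effect on the weights, so skipping postprocessing only becomes a problem once beta is positive and the advantages are missing from the data.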

Thanks

In case anyone ever wants to hack the current RLlib implementation, you can use this in json_reader.py, line 135:

    else:
        # TODO(ekl) this is trickier since the alignments between agent
        #  trajectories in the episode are not available any more.
        # raise NotImplementedError(
        #     "Postprocessing of multi-agent data not implemented yet.")
        # Workaround: postprocess each policy's batch one episode at a time.
        # Assumes SampleBatch and MultiAgentBatch are imported in this file.
        policy_batches = {}
        for policy_id, policy_batch in batch.policy_batches.items():
            policy = self.ioctx.worker.policy_map.get(policy_id)
            out = [
                policy.postprocess_trajectory(sub_batch)
                for sub_batch in policy_batch.split_by_episode()
            ]
            policy_batches[policy_id] = SampleBatch.concat_samples(out)

        return MultiAgentBatch(policy_batches, env_steps=batch.count)

This seems to work OK.
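To illustrate why the batch is split by episode before postprocessing, here is a small RLlib-free sketch. It assumes that postprocessing boils down to computing discounted returns, and it uses plain lists in place of a SampleBatch. `discounted_returns` and `postprocess_by_episode` are hypothetical names, not RLlib functions.

```python
from itertools import groupby

def discounted_returns(rewards, gamma):
    """Discounted cumulative reward for a single episode."""
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return out[::-1]

def postprocess_by_episode(eps_ids, rewards, gamma):
    """Compute returns per episode so they never leak across episode boundaries.

    Assumes the rows of each episode are contiguous, as groupby requires.
    """
    returns = []
    rows = zip(eps_ids, rewards)
    for _, group in groupby(rows, key=lambda row: row[0]):
        episode_rewards = [r for _, r in group]
        returns.extend(discounted_returns(episode_rewards, gamma))
    return returns

eps_ids = [0, 0, 1, 1]
rewards = [1.0, 1.0, 1.0, 1.0]

per_episode = postprocess_by_episode(eps_ids, rewards, gamma=0.5)
whole_batch = discounted_returns(rewards, gamma=0.5)
```

Here `per_episode` restarts the return at each episode boundary, whereas `whole_batch` lets the rewards of the second episode bleed into the first, which is what postprocessing the unsplit batch would do.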