[SOLVED] Installed Ray 2.0.0.dev0 but compute_actions does not show the expected parameters


I installed Ray with the following command in my Docker image:

RUN pip install pandas matplotlib opencv-python gym atari-py mpmath torch numpy dm_tree tabulate lz4 -U https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-2.0.0.dev0-cp38-cp38-manylinux2014_x86_64.whl

It installed properly, which I can confirm from inside the container, but when I inspect the compute_actions function, I see:

I expected to see the following definition, with prev_action_batch and prev_reward_batch:

Any hints on what is going wrong?

Hi @mg64ve,

You are comparing compute_actions in the trainer (ray.rllib.agents.trainer.Trainer) with compute_actions in the policy (ray.rllib.policy.policy.Policy). These are two separate methods with different function signatures, so the parameters you expect (prev_action_batch and prev_reward_batch) appear only on the policy's version. You can get the policy from the trainer with trainer.get_policy().
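
To make the distinction concrete, here is a minimal sketch of how one might compare two same-named methods with the standard inspect module. TrainerStandIn and PolicyStandIn are hypothetical stand-ins written for this example so it runs without Ray installed; their parameter lists are simplified assumptions, not RLlib's actual signatures.

```python
import inspect


class TrainerStandIn:
    """Hypothetical stand-in for a trainer-level API (NOT RLlib's real signature)."""

    def compute_actions(self, observations, state=None):
        return {}


class PolicyStandIn:
    """Hypothetical stand-in for a policy-level API (NOT RLlib's real signature)."""

    def compute_actions(self, obs_batch, state_batches=None,
                        prev_action_batch=None, prev_reward_batch=None):
        return [], [], {}


def parameter_names(method):
    """Return the parameter names of a callable, as reported by inspect."""
    return list(inspect.signature(method).parameters)


trainer_params = parameter_names(TrainerStandIn().compute_actions)
policy_params = parameter_names(PolicyStandIn().compute_actions)

print("trainer:", trainer_params)
print("policy: ", policy_params)

# Parameters that exist only on the policy-level method.
only_on_policy = [p for p in policy_params if p not in trainer_params]
print("only on policy:", only_on_policy)
```

With Ray installed, the same parameter_names helper could presumably be pointed at trainer.compute_actions and trainer.get_policy().compute_actions to see which parameters each one really accepts in your version.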

Thanks @mannyv, the following code works as expected:

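from ray.rllib.agents.ppo import PPOTrainer
# Build a PPO trainer on CartPole with the PyTorch backend and no remote workers.
trainer = PPOTrainer(env="CartPole-v0", config={"framework": "torch", "num_workers": 0})
# The policy object is the one whose compute_actions takes prev_action_batch and prev_reward_batch.
policy = trainer.get_policy()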