Hello,
I installed Ray in my Docker image with the following command:
RUN pip install pandas matplotlib opencv-python gym atari-py mpmath torch numpy dm_tree tabulate lz4 -U https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-2.0.0.dev0-cp38-cp38-manylinux2014_x86_64.whl
It installed properly, which I can verify from inside the container. However, when I inspect the compute_actions function, I see:
I expected to see the following definition, which includes prev_action_batch and prev_reward_batch:
Does anyone have a hint for this problem?
Hi @mg64ve,
You are comparing compute_actions in the trainer (ray.rllib.agents.trainer.Trainer) with compute_actions in the policy (ray.rllib.policy.policy.Policy). They have different function signatures. You can get the policy from the trainer with trainer.get_policy().
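To illustrate why the two inspections differ, here is a minimal sketch using hypothetical stand-in classes. ToyTrainer and ToyPolicy are not RLlib classes, and their parameter lists are simplified assumptions rather than the real signatures. The sketch only shows that a wrapper object and the object it wraps can expose a same-named method with different parameters.

```python
import inspect


class ToyPolicy:
    """Hypothetical stand-in for a policy; not RLlib's Policy class."""

    def compute_actions(self, obs_batch, prev_action_batch=None, prev_reward_batch=None):
        # Return a dummy action (0) for every observation in the batch.
        return [0 for _ in obs_batch]


class ToyTrainer:
    """Hypothetical stand-in for a trainer that wraps a policy."""

    def __init__(self):
        self._policy = ToyPolicy()

    def get_policy(self):
        return self._policy

    def compute_actions(self, observations):
        # Delegates to the wrapped policy, but exposes a narrower parameter list.
        return self._policy.compute_actions(list(observations))


trainer = ToyTrainer()

# Same method name, two different signatures (bound methods omit `self`).
trainer_params = list(inspect.signature(trainer.compute_actions).parameters)
policy_params = list(inspect.signature(trainer.get_policy().compute_actions).parameters)

print(trainer_params)
print(policy_params)
```

Inspecting the method on the wrapper never shows the extra parameters; they only appear once you fetch the wrapped object and inspect its method.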
Thanks @mannyv, the following code works as expected:
import inspect
from ray.rllib.agents.ppo import PPOTrainer
trainer = PPOTrainer(env="CartPole-v0", config={"framework": "torch", "num_workers": 0})
policy = trainer.get_policy()
print(inspect.getsource(policy.compute_actions))
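If printing the full source is more than you need, a lighter check is to look only at the parameter names. The helper below is a sketch using the standard library's inspect.signature. Both missing_params and example_compute_actions are hypothetical names introduced for illustration, and the example function's parameters are assumptions, not RLlib's actual signature.

```python
import inspect


def missing_params(func, expected_names):
    """Return the names in expected_names that func does not declare."""
    accepted = inspect.signature(func).parameters
    return [name for name in expected_names if name not in accepted]


def example_compute_actions(obs_batch, state_batches=None, prev_action_batch=None):
    # Hypothetical function used only to exercise the helper.
    return obs_batch


result = missing_params(
    example_compute_actions,
    ["prev_action_batch", "prev_reward_batch"],
)
print(result)
```

With the code from this thread, the same helper could presumably be called as missing_params(policy.compute_actions, ["prev_action_batch", "prev_reward_batch"]). Note that it only checks explicitly declared parameter names, so a function that accepts extra arguments through **kwargs would still be reported as missing them.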