Run ONLY on local driver for train()

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hello, I’m trying to get a minimal working version running, but on my call to algo.train() my environment’s reset() is called twice, which triggers an error in the simulation software I’m using. I’ve confirmed this visually with logging statements inside reset(). I’m running only on the local driver (num_rollout_workers=0), so everything should be executed exactly once, right? I’m very confused about what is happening. Is there something wrong with my config? Please advise.

from ray import tune
from ray.rllib.algorithms.ppo import PPOConfig

env = MyMultiAgentEnv(exp_config)

tune.register_env('marlenv', lambda env_config: MyMultiAgentEnv(env_config))

config = PPOConfig() \
    .python_environment() \
    .resources(
        num_gpus=1,
    ) \
    .framework(
        framework='tf',
        eager_tracing=False,
    ) \
    .environment(
        env='marlenv',
        env_config=exp_config,
    ) \
    .rollouts(
        num_rollout_workers=0,
    ) \
    .training(
        train_batch_size=1000,
    ) \
    .multi_agent(
        policies=policies,
        policy_mapping_fn=lambda agent_id, episode, worker, **kwargs: 'my_ppo',
        count_steps_by='env_steps',
    ) \
    .evaluation(
        evaluation_num_workers=0,
    ) \
    .debugging(
        log_level='DEBUG',
    )