Hello,
I’m trying to run MAML on a real-world environment that controls an industrial robot. The env is a wrapper around ROS2 that initializes communication with the robot and controls it. At the moment, I have only one robot available for the experiments. When running MAML, two envs are spawned (I suppose one for meta-training and the other for meta-testing). And here is the problem: the second env tries to establish a connection with the already connected robot.
How can I configure or change MAML to use only one environment?
I have not used MAML, but I faced a similar issue with other agents. My environment could accept multiple connections, but the total number was limited by the number of CPUs. In my case I was using envs_per_worker = m > 1 and noticed that there were always m connections sitting idle.
It turned out that RLlib was creating multiple sets of environments. It would create 1 * m for the driver (where the trainer is running) and 1 * m for each of the n workers specified by num_workers, for a total of m + n*m instantiated environments. In your case I would guess that m=1 and n=1, which results in 2 total connections. The default for “evaluation_num_workers” should be 0, but if you increase it, environments will be created for those workers too.
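To make the count above concrete, here is a small arithmetic sketch. The function name and the `env_on_driver` flag are made up for illustration and are not part of RLlib; it only encodes the m + n*m reasoning from my own observations and ignores evaluation workers.

```python
def total_env_instances(num_workers: int, envs_per_worker: int,
                        env_on_driver: bool = True) -> int:
    """Estimate how many env objects get instantiated (hypothetical helper).

    Assumes the driver builds one set of `envs_per_worker` envs (if it builds
    any) and every rollout worker builds its own set. Evaluation workers are
    not counted.
    """
    driver_envs = envs_per_worker if env_on_driver else 0
    worker_envs = num_workers * envs_per_worker
    return driver_envs + worker_envs


# m = 1, n = 1: one env on the driver plus one on the single worker.
print(total_env_instances(num_workers=1, envs_per_worker=1))
```

With one robot, any result above 1 means at least one instance will fail to connect if the connection is opened in the constructor.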
In my case, I determined that although the driver was instantiating the environment, it was not using it and never called reset. The workaround I settled on was to move the connection logic from the environment constructor to the reset method, so that instantiating the environment no longer created a connection automatically.
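Roughly, the workaround looks like the sketch below. `FakeRobotConnection` and `LazyRobotEnv` are hypothetical names, and the fake connection just counts open links in place of the real ROS2 setup. A real env would subclass the gym `Env` class and define observation and action spaces, which I left out to keep this self-contained.

```python
class FakeRobotConnection:
    """Hypothetical stand-in for the ROS2 link; counts currently open links."""
    open_count = 0

    def __init__(self):
        FakeRobotConnection.open_count += 1

    def close(self):
        FakeRobotConnection.open_count -= 1


class LazyRobotEnv:
    """Env sketch that connects on first reset() instead of in __init__."""

    def __init__(self, env_config=None):
        self.env_config = env_config or {}
        self._conn = None  # constructing the env opens no connection

    def _ensure_connected(self):
        # Open the connection only once, the first time it is needed.
        if self._conn is None:
            self._conn = FakeRobotConnection()

    def reset(self):
        self._ensure_connected()
        return [0.0]  # placeholder observation

    def step(self, action):
        self._ensure_connected()
        return [0.0], 0.0, False, {}  # placeholder obs, reward, done, info

    def close(self):
        if self._conn is not None:
            self._conn.close()
            self._conn = None


driver_env = LazyRobotEnv()   # instantiated but never reset
worker_env = LazyRobotEnv()
worker_env.reset()            # only this instance connects
print(FakeRobotConnection.open_count)
worker_env.close()
```

This only helps if the unused instance never calls reset or step. If an algorithm actually uses the driver's env, that instance will still try to connect.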
I will try to move the env initialization to the reset() method. However, I can run PPO and SAC with no problem when I set “num_workers = 1”. I think the problem is with the “create_env_on_driver” option, but I’m not sure.
@Souphis thanks for pointing out “create_env_on_driver”. That option did not exist when I initially implemented my environment. It looks like MAML is using the environment on the driver as the “meta_env” to sample and store the tasks before sending them to the workers. This is why create_env_on_driver must be True.